SALib / SALib

Sensitivity Analysis Library in Python. Contains Sobol, Morris, FAST, and other methods.
http://SALib.github.io/SALib/
MIT License

Quantifying the impact of "outliers" on SA results #516

Open mschrader15 opened 2 years ago

mschrader15 commented 2 years ago

Hi All,

I am using this issue as a forum-style question:

Does anyone know of published research that deals with selecting the bounds for the sensitivity analysis, and especially how model evaluation near the bounds can lead to erroneous results?

My application is traffic simulation. When I generate Sobol sequences for the input parameters using the ranges given in prior papers, I find that certain combinations of parameters near the bounds (one at its highest value, another at its lowest) can cause the traffic flow to break down, making those runs very different from the other 99% of simulations and failing even basic calibration checks.
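To make this concrete, here is a minimal sketch of the setup. The parameter names and bounds are hypothetical stand-ins for traffic-model parameters taken from the literature (older SALib releases expose this sampler as `SALib.sample.saltelli` instead of `SALib.sample.sobol`). It shows that a Sobol sample covers the corners of the hypercube, so combinations with one parameter near its maximum and another near its minimum are guaranteed to be evaluated:

```python
import numpy as np
from SALib.sample import sobol as sobol_sample

problem = {
    "num_vars": 3,
    "names": ["accel", "min_gap", "tau"],            # hypothetical parameter names
    "bounds": [[1.0, 4.0], [0.5, 3.0], [0.5, 2.0]],  # hypothetical literature ranges
}

# N * (D + 2) rows when calc_second_order=False: 1024 * 5 = 5120 model runs.
X = sobol_sample.sample(problem, 1024, calc_second_order=False)

# Count samples in an extreme corner: "accel" in the top 5% of its range
# while "min_gap" is in the bottom 5% of its range.
hi = problem["bounds"][0][1] - 0.05 * (problem["bounds"][0][1] - problem["bounds"][0][0])
lo = problem["bounds"][1][0] + 0.05 * (problem["bounds"][1][1] - problem["bounds"][1][0])
corner = (X[:, 0] > hi) & (X[:, 1] < lo)
print(f"{corner.sum()} of {len(X)} samples fall in the extreme corner")
```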

The impact is that this 1% of simulations creates a long right tail in the overall distribution of my model outputs and causes the Sobol indices to "over-weight" the parameters that triggered the breakdown. If I re-run the analysis with the sequence bounds set to more realistic values, the confidence intervals converge faster and the sensitivity indices are different.
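A hedged sketch of that comparison, using a toy stand-in for the simulation (`run_traffic_model` is hypothetical, not my real model): compute Sobol indices under the original wide bounds, inspect the upper quantiles of the output for the long right tail, then repeat with tightened bounds.

```python
import numpy as np
from SALib.sample import sobol as sobol_sample
from SALib.analyze import sobol as sobol_analyze

def run_traffic_model(x):
    # Hypothetical stand-in for one simulation returning a scalar output:
    # a smooth response plus a "breakdown" regime when the first parameter
    # is near its maximum and the second near its minimum.
    a, b = x
    breakdown = 50.0 if (a > 0.95 and b < 0.05) else 0.0
    return a + 0.5 * b + breakdown

def sobol_indices(problem, n=1024):
    X = sobol_sample.sample(problem, n, calc_second_order=False)
    Y = np.array([run_traffic_model(x) for x in X])
    # The long right tail shows up clearly in the upper quantiles.
    print("p50, p99, max:", np.percentile(Y, [50, 99]), Y.max())
    return sobol_analyze.analyze(problem, Y, calc_second_order=False)

wide = {"num_vars": 2, "names": ["a", "b"],
        "bounds": [[0.0, 1.0], [0.0, 1.0]]}
# Same parameters with the corners trimmed to "more realistic" values.
narrow = dict(wide, bounds=[[0.05, 0.95], [0.05, 0.95]])

Si_wide = sobol_indices(wide)
Si_narrow = sobol_indices(narrow)
# Comparing S1/ST and the S1_conf/ST_conf intervals across the two runs
# shows how much the tail runs drive the indices.
print(Si_wide["S1"], Si_narrow["S1"])
```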

mschrader15 commented 2 years ago

Looks like #262 was a similar question

willu47 commented 2 years ago

Hi @mschrader15 - thanks for the question. I am currently looking into a related topic regarding "bounds for the sensitivity analysis" and have found a few papers that show the importance of "properly" assessing the range over which the input parameters can vary - i.e. your actual uncertainty over the inputs - rather than merely selecting a ±10% range around a central value.

So this is essentially the opposite of the issue you raise, but it is related: in both cases, conducting an exercise to quantify the uncertainty around the inputs should lead you to a good solution. In your case, if the parameter combinations that cause erroneous results can plausibly occur, then you need to take a closer look at your model so that it simulates this eventuality properly. If the likelihood of the combination is vanishingly small, then you can exclude that region from your analysis, as in the sketch below.
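A minimal sketch of that "quantify, then decide" step. Everything here is an assumption for illustration, not part of SALib's API: the inputs are taken to be independent with hypothetical triangular distributions from an uncertainty-quantification exercise, and the breakdown region is assumed known.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical "honest" input distributions (triangular rather than uniform
# over the full literature range).
a = rng.triangular(1.0, 2.5, 4.0, size=n)
b = rng.triangular(0.5, 1.5, 3.0, size=n)

# Fraction of probability mass landing in the combination that breaks the
# model (hypothetical breakdown region: a near its max, b near its min).
in_breakdown = (a > 3.9) & (b < 0.6)
p = in_breakdown.mean()
print(f"P(breakdown combination) = {p:.2e}")

# If p is vanishingly small, tightening the bounds to exclude the region is
# defensible; if not, the model itself needs to handle that regime.
```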

mschrader15 commented 2 years ago

Thanks for the synopsis @willu47! Would you mind sharing links to those papers?

mschrader15 commented 2 years ago

Referencing #315 here as well; it raises the same issue.