usedagger / usedagger.com-issues

Give feedback as a GitHub issue
https://usedagger.io/

Add measures of global sensitivity analysis [independent input distributions] #15

Open tadamcz opened 1 year ago

tadamcz commented 1 year ago

Mathematical background

I believe we can use Sobol' indices when the inputs are independently distributed.

Note that, in general, neither the first-order effects nor the total effects sum to a fixed constant, but this is acceptable in a factor prioritisation setting.
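For reference, the standard variance-based definitions can be written out as below. The notation (V_i, T_i for the unnormalised effects, S_i, S_{Ti} for the normalised indices, X_{\sim i} for all inputs except X_i) is my choice for this note and is not taken from the usedagger code.

```latex
% First-order effect of input X_i (expected variance reduction from fixing X_i)
V_i = \operatorname{Var}\bigl(\mathbb{E}[Y \mid X_i]\bigr),
\qquad S_i = \frac{V_i}{\operatorname{Var}(Y)}

% Total effect of X_i (expected variance remaining when all other inputs are fixed)
T_i = \mathbb{E}\bigl[\operatorname{Var}(Y \mid X_{\sim i})\bigr],
\qquad S_{Ti} = \frac{T_i}{\operatorname{Var}(Y)}

% With independent inputs, the sums are bounded but not fixed:
\sum_{i=1}^{k} S_i \;\le\; 1 \;\le\; \sum_{i=1}^{k} S_{Ti}
```

Both inequalities become equalities only for a purely additive model, which is why neither sum is a fixed constant in general.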

Wikipedia

https://en.wikipedia.org/wiki/Variance-based_sensitivity_analysis

Shapley Effects for Global Sensitivity Analysis: Theory and Computation (2016):

This paper is about Shapley effects, but I am quoting the introductory parts about Sobol' indices:

In the case of factor prioritization, the goal is to identify inputs with the property that reducing their uncertainty reduces the total output variance the most. Hence, there is no reason to expect that the sum of such sensitivity measures will be equal to the total variance. ...

Two of the most widely used measures are first-order and total effects suggested by Homma and Saltelli [12]. A first-order effect measures the expected reduction in variance of the output when an input is fixed to a constant, whereas the total effect measures the expected remaining variance of the output when all other input values are fixed. ...

One purpose of global sensitivity analysis is to provide guidance for investing resources to mitigate the uncertainty in a model output by reducing the uncertainty in the inputs: that is, to decide which input (or set of inputs) to control or to determine more accurately to reduce the variance of the output the most. Saltelli et al. [24] refer to this problem as factor prioritization. The first-order effects are defined in a way that makes them useful for factor prioritization ...

In general, Vi < Ti when inputs are independent, and they are used together to complement each other. Homma and Saltelli [12] show that with independent inputs Ti is the sum of first-order effect Vi and all the interaction effects by Xi with other inputs. Hence, Ti − Vi is a measure of how much Xi is involved in interactions. ...

Although these sensitivity measures based on Sobol’ indices are widely accepted in applications, their fundamental assumption of independence among inputs limits the scope of problems to which these measures can be applied. In fact, in section 3.2 we show that the inequalities in (3) no longer hold in the case of dependent inputs; under some dependence structures, the sum of total effects is less than the sum of the first-order effects, which makes it difficult to interpret the two effects.

Global sensitivity analysis: the primer (2007)

1.2.9 A First Setting: ‘Factor Prioritization’

... This means that even for a nonadditive model [with independent inputs] we have found a way to recover (that is, to understand) 100% of the variance of Y. Thus variance-based sensitivity measures provide a theoretical framework whereby – provided one has the patience to compute all interaction terms – one can achieve a full understanding of the model’s sensitivity pattern. Patience is indeed required, as in principle a model can have interactions of even higher order ...

... We have argued in a series of works (Saltelli et al., 2004, and references therein) that a good, synthetic, though nonexhaustive characterization of the sensitivity pattern for a model with k factors is given by the total set of first-order terms plus the total effects. For a system with 10 factors this makes 20 terms rather than 1023.

Implementation

Basic

By default, show the customary pairing of first-order effect and total effect.
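To make the basic option concrete, here is a minimal sketch of how the two indices could be estimated by Monte Carlo. It is not the usedagger implementation. The function names, the `sample_inputs` callback and the toy model are all hypothetical. It uses the common two-matrix scheme (the Saltelli 2010 estimator for first-order effects and the Jansen estimator for total effects) and assumes independent inputs.

```python
import numpy as np


def sobol_first_and_total(model, sample_inputs, n=100_000, seed=0):
    """Monte Carlo estimates of first-order and total-effect Sobol' indices.

    model:         maps an (n, k) array of inputs to an (n,) array of outputs.
    sample_inputs: maps (rng, n) to an (n, k) array whose columns are drawn
                   independently (the method assumes independent inputs).
    """
    rng = np.random.default_rng(seed)
    A = sample_inputs(rng, n)  # two independent sample matrices
    B = sample_inputs(rng, n)
    y_A, y_B = model(A), model(B)
    var_y = np.var(np.concatenate([y_A, y_B]))

    k = A.shape[1]
    first = np.empty(k)
    total = np.empty(k)
    for i in range(k):
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]  # A with column i taken from B
        y_ABi = model(AB_i)
        # First-order: y_B and y_ABi share only input i.
        first[i] = np.mean(y_B * (y_ABi - y_A)) / var_y
        # Total effect: y_A and y_ABi differ only in input i.
        total[i] = 0.5 * np.mean((y_A - y_ABi) ** 2) / var_y
    return first, total


# Hypothetical toy model: Y = X1 + X2 * X3 with independent N(0, 1) inputs.
# Analytically, first-order indices are (0.5, 0, 0) and total-effect
# indices are (0.5, 0.5, 0.5); the estimates should land close to these.
def toy_model(x):
    return x[:, 0] + x[:, 1] * x[:, 2]


def standard_normal_inputs(rng, n):
    return rng.standard_normal((n, 3))


first, total = sobol_first_and_total(toy_model, standard_normal_inputs)
print("first-order:", first.round(3))
print("total:      ", total.round(3))
```

This costs n * (k + 2) model evaluations. In the toy model X2 and X3 have zero first-order effect but a total effect of about 0.5, which is the kind of interaction-only input the first-order/total pairing is meant to surface.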

Advanced

A possible addition would be to let the user request the full set of 2^k − 1 variance terms (the k first-order effects plus every interaction effect). This would run as a separate background task, and the user would be warned that it could be long-running. Users may often have models with a large number of variables, of which only a few, e.g. k=5, carry uncertainty; I think k=5 or even k=10 should be computationally manageable. However, this is quite an advanced feature that users may not be interested in.
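As a rough illustration of what the advanced option would involve, the sketch below estimates one term per non-empty subset of inputs, so 2^k − 1 terms in total (31 for k=5, 1023 for k=10). It is only a sketch under the independence assumption, not the usedagger implementation. `all_sobol_terms` and the toy model are hypothetical names. It estimates the closed index of each subset with the same two-matrix trick, then subtracts the terms of all proper subsets to isolate each interaction.

```python
from itertools import combinations

import numpy as np


def all_sobol_terms(model, sample_inputs, k, n=100_000, seed=0):
    """Estimate the normalised Sobol' term for every non-empty input subset.

    Returns a dict mapping frozenset of column indices -> estimated index.
    Requires (2**k - 1) + 2 batches of n model evaluations.
    """
    rng = np.random.default_rng(seed)
    A = sample_inputs(rng, n)
    B = sample_inputs(rng, n)
    y_A, y_B = model(A), model(B)
    var_y = np.var(np.concatenate([y_A, y_B]))

    # All non-empty subsets, ordered by increasing size.
    subsets = [frozenset(c)
               for r in range(1, k + 1)
               for c in combinations(range(k), r)]

    # Closed index of u: Var(E[Y | X_u]) / Var(Y).
    closed = {}
    for u in subsets:
        cols = sorted(u)
        AB_u = A.copy()
        AB_u[:, cols] = B[:, cols]  # A with the columns in u taken from B
        closed[u] = np.mean(y_B * (model(AB_u) - y_A)) / var_y

    # The closed index of u is the sum of the terms of all subsets of u,
    # so subtracting the terms of the proper subsets isolates the
    # interaction that involves exactly the inputs in u.
    terms = {}
    for u in subsets:  # smaller subsets are always processed first
        terms[u] = closed[u] - sum(terms[w] for w in subsets if w < u)
    return terms


# Hypothetical toy model: Y = X1 + X2 * X3 with independent N(0, 1) inputs.
# Analytically only two terms are non-zero: {X1} and {X2, X3}, 0.5 each.
def toy_model(x):
    return x[:, 0] + x[:, 1] * x[:, 2]


def standard_normal_inputs(rng, n):
    return rng.standard_normal((n, 3))


terms = all_sobol_terms(toy_model, standard_normal_inputs, k=3)
for u, value in terms.items():
    print(sorted(u), round(float(value), 3))
```

The number of model evaluations grows as n * 2^k, which is the reason for running this as a background task. For k=10 and an inexpensive model it should still be feasible, though the higher-order estimates are differences of noisy quantities and may need a large n.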

tadamcz commented 1 year ago

Sobol' indices added to API: https://github.com/usedagger/usedagger.io/commit/2ca377d0