jonathf / chaospy

Chaospy - Toolbox for performing uncertainty quantification.
https://chaospy.readthedocs.io/
MIT License

Error in moments with more than 10 variables #397

Closed · knstmrd closed 1 year ago

knstmrd commented 1 year ago

I'm running sensitivity analysis studies and recently switched to a larger number of variables, namely 11. I started getting extremely weird results (such as the mean value lying outside the min/max range of my model evaluations). I managed to pin this down to my use of Uniform distributions on an interval other than [0, 1]. The error also depends on where the differently distributed variable is placed.

I've attached minimal code below that reproduces the error: it uses a dummy model that always outputs a constant value, so the computed mean should equal that constant. The number of variables is varied from 2 to 12; all variables are Uniform on [0, 1], except that a single one is replaced by a variable distributed uniformly on [-1, 1].

The code loops over the number of variables and over the position of the differently distributed variable, and prints the position and number of variables whenever the computed mean disagrees with the constant model output. The error occurs for both sparse and non-sparse grids.

I did some testing, and as far as the generated expansions, weights, and model evaluation points go, everything looks OK to me. It seems like something goes awry in the quadrature fitting step. Any suggestions on how to fix this (or on whether I'm doing something wrong) are most welcome!

Code to reproduce the error:

import chaospy
import numpy as np
import logging

logging.getLogger().setLevel(logging.CRITICAL)

def dummy_model(x):
    # Constant model: the fitted PCE mean should always be exactly 1.0.
    return 1.0

for n_vars in range(2, 13):
    for place in range(n_vars):
        # All variables are Uniform on [0, 1], except the single one at
        # index `place`, which is Uniform on [-1, 1].
        chaospy_distributions = [chaospy.Uniform(0.0, 1.0) for _ in range(n_vars)]
        chaospy_distributions[place] = chaospy.Uniform(-1.0, 1.0)

        full_joint = chaospy.J(*chaospy_distributions)
        points, weights = chaospy.generate_quadrature(1, full_joint, sparse=True, growth=True)
        # points, weights = chaospy.generate_quadrature(1, full_joint, sparse=False, growth=True)

        expansion = chaospy.generate_expansion(1, full_joint)

        model_evals = [dummy_model(points[:, i]) for i in range(weights.shape[0])]
        sparse_model_approx = chaospy.fit_quadrature(expansion, points, weights, model_evals)
        E_computed = chaospy.E(sparse_model_approx, full_joint)
        if abs(E_computed - 1.0) > 1e-4:
            print(f"Computed {E_computed} with U(-1.0,1.0) in {place}/{n_vars}")

The output I get is:

Computed 2.1785714285714293 with U(-1.0,1.0) in 2/11
Computed 2.178571428571429 with U(-1.0,1.0) in 3/11
Computed 2.178571428571429 with U(-1.0,1.0) in 4/11
Computed 2.178571428571429 with U(-1.0,1.0) in 5/11
Computed 2.178571428571429 with U(-1.0,1.0) in 6/11
Computed 2.178571428571429 with U(-1.0,1.0) in 7/11
Computed 2.178571428571429 with U(-1.0,1.0) in 8/11
Computed 2.178571428571429 with U(-1.0,1.0) in 9/11
Computed 2.178571428571429 with U(-1.0,1.0) in 2/12
Computed 3.357142857142857 with U(-1.0,1.0) in 3/12
Computed 3.3571428571428577 with U(-1.0,1.0) in 4/12
Computed 3.3571428571428577 with U(-1.0,1.0) in 5/12
Computed 3.3571428571428568 with U(-1.0,1.0) in 6/12
Computed 3.3571428571428568 with U(-1.0,1.0) in 7/12
Computed 3.3571428571428568 with U(-1.0,1.0) in 8/12
Computed 3.357142857142857 with U(-1.0,1.0) in 9/12
Computed 2.1785714285714284 with U(-1.0,1.0) in 10/12
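
As a quick cross-check inside the loop (a sketch reusing the same weights and model_evals): integrating the constant model directly with the quadrature rule bypasses the PCE fit entirely, and since the weights of a probability rule sum to one, it should return 1.0 even in the failing cases, pointing at fit_quadrature rather than at the rule itself:

        # Direct quadrature of the constant model, bypassing the polynomial
        # fit; the weights of a probability rule sum to one, so this should
        # print ~1.0 regardless of n_vars and place.
        direct_mean = float(np.sum(weights * np.asarray(model_evals)))
        print(direct_mean)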
knstmrd commented 1 year ago

So this seems to be a bug in numpoly, specifically in the ordering of the polynomial's variables: when they are passed via *args rather than **kwargs, the ordering is inconsistent. Inspecting the parameters variable inside the numpoly.call function for the following polynomial:

import numpoly

# 21 indeterminants q0, ..., q20, created in index order.
variables = numpoly.variable(21)
poly1 = numpoly.polynomial(variables)
# Value i is intended for variable qi when passed positionally.
x1 = [float(i) for i in range(21)]
print(numpoly.call(poly1, x1))

gives the following for the parameters variable:

{'q0': 0.0, 'q1': 1.0, 'q10': 2.0, 'q11': 3.0, 'q12': 4.0, 'q13': 5.0, 'q14': 6.0, 'q15': 7.0, 'q16': 8.0, 'q17': 9.0, 'q18': 10.0, 'q19': 11.0, 'q2': 12.0, 'q20': 13.0, 'q3': 14.0, 'q4': 15.0, 'q5': 16.0, 'q6': 17.0, 'q7': 18.0, 'q8': 19.0, 'q9': 20.0}

and the following output:

[ 0.  1. 12. 14. 15. 16. 17. 18. 19. 20.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 13.]

So the variables are sorted lexicographically by name (as strings, 'q10' comes before 'q2'), which is why the bug only shows up once there are more than 10 variables in the PCE.
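
The ordering is just Python's lexicographic string sort, which also suggests why keyword arguments are unaffected: the names are then matched explicitly rather than by position. A minimal sketch:

# 'q10' sorts before 'q2' as a string, which scrambles the positional
# mapping as soon as there are more than 10 variables:
print(sorted(["q0", "q1", "q2", "q10", "q11", "q20"]))
# ['q0', 'q1', 'q10', 'q11', 'q2', 'q20']

# Passing values by name instead of by position should sidestep the bug:
values = {f"q{i}": float(i) for i in range(21)}
print(poly1(**values))  # 0.0 through 20.0 in the expected order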

Numpoly version is 1.2.3.

Found this relevant pull request: https://github.com/jonathf/numpoly/pull/89

jonathf commented 1 year ago

That is a good catch. Thanks for looking into the matter.

Yeah, #89 addresses the issue. Unfortunately I completely forgot about the PR and it has grown a bit stale. But I've now fixed the issue in version 1.2.5. Please let me know if the solution works for you.
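
For anyone wanting to confirm the fix, a minimal sketch that re-runs the ordering check from above against the installed numpoly (importlib.metadata is only used here to read the version):

from importlib.metadata import version
import numpoly

print(version("numpoly"))  # should report 1.2.5 or later

# Same check as in the repro above: with the fix, positional arguments
# should map to q0..q20 by index, not by string-sorted name.
variables = numpoly.variable(21)
poly = numpoly.polynomial(variables)
print(numpoly.call(poly, [float(i) for i in range(21)]))
# expected: 0.0 through 20.0 in index order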

knstmrd commented 1 year ago

Thanks @jonathf for the fast response! Yes, everything works as expected now.

jonathf commented 1 year ago

Glad to hear. Closing.