jonathf / chaospy

Chaospy - Toolbox for performing uncertainty quantification.
https://chaospy.readthedocs.io/
MIT License

Analytical representations of the Sobol indices #350

Closed · takafusui closed this 3 years ago

takafusui commented 3 years ago

Dear Jonathan,

(apologies in advance if this duplicates an existing issue...)

I am using chaospy to quantify parametric uncertainty, mainly via Sobol indices. I construct a surrogate model based on polynomial chaos expansions (PCE) using the chaospy library.

In chaospy, you compute the first-order Sobol indices with the Sens_m method and the total Sobol indices with the Sens_t method. Looking at these methods, you compute each index based on, for instance, Eqs. (7) and (8) in Tennoe et al. (2018), respectively.

I also know that when a surrogate model is constructed with PCE, the Sobol indices can be represented analytically; see, for instance, Eqs. (28) and (30) in Deman et al. (2016).
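For reference, my reading of those equations, in the notation used later in this thread (y_alpha the PCE coefficients, D the total variance):

S_i = (1/D) * sum of y_alpha^2 over {alpha : alpha_i > 0 and alpha_j = 0 for all j != i}   (first order, cf. Eq. (28))
S_T_i = (1/D) * sum of y_alpha^2 over {alpha : alpha_i > 0}                                (total, cf. Eq. (30))

with D = sum of y_alpha^2 over all alpha != 0.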

I tried to implement these representations myself, but I am confused about how to define the alpha set of Deman et al. (2016) using chaospy methods. I know a similar topic was discussed in #120, for instance, but some APIs seem to have been renamed since then, so I cannot follow the discussion. Could you kindly show me how to define the alpha set, for instance the one in Eq. (28) of Deman et al. (2016), within the framework of chaospy? It would also be nice if chaospy shipped these analytical representations of the Sobol indices alongside the original Sens_m and Sens_t.

Thank you in advance for your time and consideration.

jonathf commented 3 years ago

I don't have access to Uni right now, so I cannot take a look at Deman. But if we use #120 as our baseline, I think I follow. You are talking about chaospy.bertran.bindex, right?

Yes, the API has changed. For chaospy<4 you can still use chaospy.bindex, but the new function you really should be switching to is chaospy.glexindex. It is basically the same, but the call signature has changed a little.
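For instance, a minimal untested sketch (the exact keyword arguments are spelled out further down this thread):

import chaospy
# multi-indices for, e.g., a 3-dimensional expansion up to total order 5
alpha = chaospy.glexindex(start=0, stop=6, dimensions=3,
                          graded=True, reverse=True, cross_truncation=1.0)
print(alpha.shape)  # one row per multi-index, one column per dimension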

Hope this helps.

takafusui commented 3 years ago

Thank you for your comments. I found an arXiv version here; please see Eqs. (28) and (30).

Yes, the renamed chaospy.glexindex will do the job. How can I construct the alpha set in Eq. (28) using chaospy.glexindex? Any hint or example?

Thank you in advance.

takafusui commented 3 years ago

I set up a simple Ishigami function and compute the Sobol' indices using chaospy. My objective is to compute the Sobol' indices analytically following Deman et al. (2016) and compare the numbers with the built-in cp.Sens_m and cp.Sens_t methods...

import numpy as np
import chaospy as cp

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), 3)
poly_order = 5
# polynomials
phi = cp.expansion.stieltjes(
    poly_order, dist, graded=True, reverse=True, cross_truncation=1.0)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model = cp.fit_regression(phi, samples, evals, retall=1)

print(r"The 1st order Sobol indices")
print(cp.Sens_m(approx_model[0], dist))

print(r"Total Sobol indices")
print(cp.Sens_t(approx_model[0], dist))

# Following Eq. (28) and (30) in Deman et al. (2016), we compute the Sobol'
# indices analytically.

# alpha = cp.glexindex(...)

# Analytical Sobol indices based on PCE
# Sens_m_analytic = ...
# Sens_t_analytic = ...
jonathf commented 3 years ago

For your y_alpha values, you need to get the Fourier coefficients. This can be done with this modification:

phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, graded=True, reverse=True, cross_truncation=1.0, retall=True)

This can then be used to create your D_hat using equation (27):

D_hat = numpy.sum(coeffs[1:]**2, axis=0)

Here A \setminus {0} simplifies to dropping the first coefficient, since the zero multi-index is the first element in the expansion.

For the multi-index set A, it is important to generate it with the exact same signature as the expansion:

alpha = cp.glexindex(poly_order, graded=True, reverse=True, cross_truncation=1.0).T

(I'm transposing to make the code nicer below.)

The paper uses the notion of a subset, but for practical applications you are more interested in the indices that correspond to the set. I therefore use index instead of alpha below.

This gives us, for (28):

for i in range(dim):
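    # keep multi-indices where variable i appears and no other variable does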
    index = (alpha[i] > 0) & (alpha.sum(0) == alpha[i])
    S_hat_i = numpy.sum(coeffs[index]**2, axis=0)/D_hat

For (29):

for i in range(dim):
    for j in range(dim):
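        # keep multi-indices where only variables i and j appear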
        index = (alpha[i] > 0) & (alpha[j] > 0) & (alpha.sum(0) == alpha[i]+alpha[j])
        S_hat_ij = numpy.sum(coeffs[index]**2, axis=0)/D_hat

For (30):

for i in range(dim):
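    # keep every multi-index in which variable i appears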
    index = alpha[i] > 0
    ST_hat_i = numpy.sum(coeffs[index]**2, axis=0)/D_hat

I haven't tested the code, so this might be invalid syntax. But it should be enough to get you where you want to go, I think.

takafusui commented 3 years ago

Hi,

thank you very much for your code. It really helps me a lot. I am still confused by the cp.glexindex method. Below I have migrated your hints into the simple Ishigami function example from above:

import numpy as np
import chaospy as cp

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), 3)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, graded=True, reverse=True, cross_truncation=1.0,
    retall=True)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)
# fourier_coeffs.shape = (56, )

D_hat = np.sum(fourier_coeffs[1:]**2, axis=0)

alpha = cp.glexindex(poly_order, graded=True, reverse=True, cross_truncation=1.0).T
# alpha.shape = (1, 5)

for i in range(poly_order):
    index = (alpha[i] > 0) & (alpha.sum(0) == alpha[i])
    S_hat_i = np.sum(fourier_coeffs[index]**2, axis=0)/D_hat
    # Here we have an index error...

As the result of cp.fit_regression(), we get fourier_coeffs with shape (56,). Then with cp.glexindex(), we get alpha with shape (1, 5). As a result, in the for loop, the mismatched shape of index causes an index error when computing S_hat_i.

How can I fix this issue?

Thank you very much in advance.

jonathf commented 3 years ago

Right, I forgot to note that you need to specify the number of dimensions. The correct signature is:

alpha = cp.glexindex(poly_order, dimensions=3, graded=True, reverse=True, cross_truncation=1.0).T
takafusui commented 3 years ago

Thank you for your comments. I computed the first-order and the total Sobol indices based on PCE, but I get different numbers than cp.Sens_m(approx_model, dist) and cp.Sens_t(approx_model, dist).

To disentangle the issues, I am trying to understand what you explained above using the Ishigami example below again:

import numpy as np
import chaospy as cp

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, graded=True, reverse=True, retall=True,
    cross_truncation=1.0)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

D_hat = np.sum(fourier_coeffs[1:]**2, axis=0)

alpha = cp.glexindex(start=0, stop=poly_order+1, dimensions=3,
                     cross_truncation=1.0, graded=True, reverse=True).T
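# alpha.shape == (3, 56) after the transpose: one column per expansion term,
# matching fourier_coeffs.shape == (56,)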

D = cp.Var(approx_model, dist)

In theory, based on Eq. (27) here, D_hat should be very close to D, but it is not.

Do you see anything I am still missing or doing wrong?

Thank you in advance.

jonathf commented 3 years ago

I see two problems here:

  1. As you are hinting at, the retall=True you are interested in is the one from fit_regression, not stieltjes. My mistake.
  2. For the formulas in the paper to work, they assume an orthonormal polynomial expansion. Try adding the argument normed=True to stieltjes. It should make the two Ds the same; a quick check is sketched below.
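
A quick way to verify the normalization (an untested sketch, reusing dist and poly_order from your script):

phi = cp.expansion.stieltjes(
    poly_order, dist, normed=True, graded=True, reverse=True, cross_truncation=1.0)
print(cp.E(phi**2, dist))  # approximately all ones for an orthonormal expansion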
takafusui commented 3 years ago

Thank you for your comments. D_hat is still different from cp.Var...

Also, I do not see any reason to add the argument normed=True to cp.expansion.stieltjes...

I have studied the other paper here, written by the same group, and tried to compute cp.E as well as cp.Var based on Eqs. (28) and (29) in that paper, respectively. Please see the example below:

import numpy as np
import chaospy as cp

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi = cp.expansion.stieltjes(
    poly_order, dist, graded=True, reverse=True, cross_truncation=1.0)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

alpha = cp.glexindex(start=0, stop=poly_order+1, dimensions=3,
                     cross_truncation=1.0, graded=True, reverse=True).T

D_hat_u = np.empty(N_param)
for QoI_idx in range(N_param):
    index = (alpha[QoI_idx, :] != 0)
    D_hat_u[QoI_idx] = np.sum(fourier_coeffs[index]**2)

D_hat = np.sum(D_hat_u)  # 1.271
D_hat2 = np.sum(fourier_coeffs[1:]**2, axis=0)  # 1.205
D = cp.Var(approx_model, dist)  # 12.377

E_hat = fourier_coeffs[0]  # 3.404
E = cp.E(approx_model, dist)  # 3.404

I noticed that I get the same numbers for E_hat and cp.E(), but not for the variance! Given this, my guess is that we compute y_hat from the paper (our fourier_coeffs in the code) correctly, but maybe the alpha set is different, and all the errors stem from that...

What do you think?

Thank you in advance...

jonathf commented 3 years ago

On page 15 after (16): "... multivariate polynomials that are orthonormal ...". This implies that we not only assume E[phi_i phi_j] = 0 but also E[phi_i^2] = 1. It is a normalization that simplifies the formulas a bit throughout the paper. You could do without it, but (28), (29) and (30) would not look the same.
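
Spelled out (a sketch of the reasoning, not a quote from the paper): for an orthogonal expansion f = sum_alpha y_alpha phi_alpha, the variance is Var[f] = sum_{alpha != 0} y_alpha^2 E[phi_alpha^2], which collapses to sum_{alpha != 0} y_alpha^2 only when every E[phi_alpha^2] equals 1.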

Anyways, are you able to reproduce this:

import numpy as np
import chaospy as cp

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, normed=True, graded=True,
    reverse=True, retall=True, cross_truncation=1.0,
)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

# From operators:
print(cp.E(approx_model, dist))                 # 3.4994745629924666
print(cp.Var(approx_model, dist))               # 11.135117199948118

# From coefficients:
print(fourier_coeffs[0])                        # 3.4994745629924684
print(np.sum(fourier_coeffs[1:]**2, axis=0))    # 11.135117199948128
jonathf commented 3 years ago

As a side note: the formula for the mean does not vary with orthonormality; only the variance formula changes. Since E[phi_0] = 1 and E[phi_i] = 0 for the higher-order terms, the mean is always just the leading coefficient. That is why those values correspond in your case.

takafusui commented 3 years ago

Hi,

thank you very much for your comments. Strangely enough, I cannot reproduce your results on my computer, even when I copy and paste your code.

With the code below, I get different numbers:

import numpy as np
import chaospy as cp

print(np.__version__)  # 1.21.0
print(cp.__version__)  # 4.3.2

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, normed=True, graded=True,
    reverse=True, retall=True, cross_truncation=1.0,
)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

# From operators:
print(cp.E(approx_model, dist))                 # 4.440892098500626e-16
print(cp.Var(approx_model, dist))               # 6.997560686366695

# From coefficients:
print(fourier_coeffs[0])                        # -0.009803732512400897
print(np.sum(fourier_coeffs[1:]**2, axis=0))    # 0.8090453094082214

Could you please copy and paste this code and run it on your computer? Maybe it is a version conflict or something else? Thank you.

jonathf commented 3 years ago

I still get the matching values on my machine.

Do you mind posting the following info?

takafusui commented 3 years ago

That sounds strange... Here it is:

jonathf commented 3 years ago

Hm, it seems there is a bug in numpoly version 1.2.3. Can you downgrade to 1.2.2 and try again?

takafusui commented 3 years ago

Yes, that was it! When I downgrade numpoly to 1.2.2, I get the same numbers! Thank you very much; now I will look at the Sobol' indices based on the Fourier coefficients.

jonathf commented 3 years ago

Good.

I've made a bugfix in chaospy version 4.3.3 that works around the bug in numpoly. I'll see if I can diagnose the numpoly error later.

takafusui commented 3 years ago

Hi,

thank you again for your very patient comments on my issue. With the script below, I can compute the total, first-order, and second-order Sobol indices from the Fourier coefficients, and they match the built-in methods:

import numpy as np
import chaospy as cp

print(np.__version__)  # 1.21.0
print(cp.__version__)  # 4.3.2
import numpoly; print(numpoly.__version__)  # 1.2.2

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, normed=True, graded=True, reverse=True, retall=True,
    cross_truncation=1.0)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

# From operators:
print(r'Expectation and variance using the Chaospy built-in functions')
print(cp.E(approx_model, dist))
print(cp.Var(approx_model, dist))

# Total variance using the Fourier coefficients
D_hat = np.sum(fourier_coeffs[1:]**2, axis=0)

print(r'Expectation and variance using the Fourier coefficients')
print(fourier_coeffs[0])
print(D_hat)

alpha = cp.glexindex(
    start=0, stop=poly_order+1, dimensions=3, cross_truncation=1.0,
    graded=True, reverse=True).T

# Total Sobol' indices based on PCE
Sens_t_hat = np.empty(N_param)
for idx in range(N_param):
    index = (alpha[idx, :] > 0)
    Sens_t_hat[idx] = np.sum(fourier_coeffs[index]**2, axis=0) / D_hat

print(r'Total Sobol indices')
print(Sens_t_hat)
print(r'cp.Sens_t: {}'.format(cp.Sens_t(approx_model, dist)))

# First-order Sobol' indices based on PCE
Sens_m_hat = np.empty(N_param)
for idx in range(N_param):
    index = (alpha[idx, :] > 0) & (alpha.sum(0) == alpha[idx, :])
    Sens_m_hat[idx] = np.sum(fourier_coeffs[index]**2, axis=0) / D_hat

print(r'First-order Sobol indices')
print(Sens_m_hat)
print(r'cp.Sens_m: {}'.format(cp.Sens_m(approx_model, dist)))

Sens_m2_hat = np.empty([N_param, N_param])
for idx in range(N_param):
    for jdx in range(N_param):
        index = (idx != jdx) & (alpha[idx, :] > 0) & (alpha[jdx, :] > 0) \
            & (alpha.sum(0) == alpha[idx, :]+alpha[jdx, :])
        Sens_m2_hat[idx, jdx] = np.sum(
            fourier_coeffs[index]**2, axis=0) / D_hat

print(r'Second-order Sobol indices')
print(Sens_m2_hat)
print(r'cp.Sens_m2: {}'.format(cp.Sens_m2(approx_model, dist)))
takafusui commented 3 years ago

Hopefully my last question...

I am interested in computing the so-called univariate effects, defined in Eq. (32) of this paper, based on PCE.

Does the implementation below make sense to you, or have you already implemented something more efficient for this job?

# Univariate effect
N_linspace = 250
M1_linspace = np.linspace(
    [-np.pi, -np.pi, -np.pi], [np.pi, np.pi, np.pi], N_linspace)

M1 = np.empty([N_param, N_linspace])
for idx in range(N_param):
    index = (alpha[idx, :] > 0) & (alpha.sum(0) == alpha[idx, :])
    M1[idx, :] = np.sum(
        # (fourier_coeffs[0] + fourier_coeffs[index] * phi[index])(M1_linspace[:, idx]), axis=0)
        (fourier_coeffs[index] * phi[index])(M1_linspace[:, idx]), axis=0)

# Plot with matplotlib
import matplotlib.pyplot as plt
idx = 0
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_0$')
idx = 1
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_1$')
idx = 2
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_2$')
plt.xlim([-np.pi, np.pi])
plt.xlabel(r'$\theta_i$')
plt.ylabel(r'$M^1_i$')
plt.legend(loc='lower left')
plt.show()

I appreciate any comments.

jonathf commented 3 years ago

I imagine it is the evaluation step here that needs adjusting.

The evaluation needs to be specific to a dimension, like phi(q0=val), phi(q1=val), ..., but you need to do that through an index variable idx. So you need something like phi(**{f"q{idx}": val}).

Even though each phi should be 1D, the software does not know that. So I believe you need to manually convert constant polynomials to numpy arrays using .tonumpy().

So in your example, I'd do:

    evals = (fourier_coeffs * phi)[index](**{f"q{idx}": M1_linspace[:, idx]}).tonumpy()
    M1[idx, :] = np.sum(evals, axis=0)
takafusui commented 3 years ago

Hi,

thank you very much for your continuous help. Here is the complete code for future reference:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
import numpy as np
import chaospy as cp

print(np.__version__)  # 1.21.0
print(cp.__version__)  # 4.3.2
import numpoly; print(numpoly.__version__)  # 1.2.2

def ishigami(x, a=7., b=0.1):
    """Ishigami function."""
    _f = np.sin(x[0]) + a * np.sin(x[1])**2 + b * x[2]**4 * np.sin(x[0])
    return _f

N_param = 3  # Number of uncertain parameters
# Joint distribution
dist = cp.Iid(cp.Uniform(-np.pi, np.pi), N_param)
poly_order = 5
# polynomials
phi, coeffs = cp.expansion.stieltjes(
    poly_order, dist, normed=True, graded=True, reverse=True, retall=True,
    cross_truncation=1.0)

# Collocation method
samples = dist.sample(1024)
evals = [ishigami(sample) for sample in samples.T]

# Return polynomials and Fourier coefficients
approx_model, fourier_coeffs = cp.fit_regression(phi, samples, evals, retall=1)

# From operators:
print(r'Expectation and variance using the Chaospy built-in functions')
print(cp.E(approx_model, dist))
print(cp.Var(approx_model, dist))

# Total variance using the Fourier coefficients
D_hat = np.sum(fourier_coeffs[1:]**2, axis=0)

print(r'Expectation and variance using the Fourier coefficients')
print(fourier_coeffs[0])
print(D_hat)

alpha = cp.glexindex(
    start=0, stop=poly_order+1, dimensions=3, cross_truncation=1.0,
    graded=True, reverse=True).T

# Total Sobol' indices based on PCE
Sens_t_hat = np.empty(N_param)
for idx in range(N_param):
    index = (alpha[idx, :] > 0)
    Sens_t_hat[idx] = np.sum(fourier_coeffs[index]**2, axis=0) / D_hat

print(r'Total Sobol indices')
print(Sens_t_hat)
print(r'cp.Sens_t: {}'.format(cp.Sens_t(approx_model, dist)))

# First-order Sobol' indices based on PCE
Sens_m_hat = np.empty(N_param)
for idx in range(N_param):
    index = (alpha[idx, :] > 0) & (alpha.sum(0) == alpha[idx, :])
    Sens_m_hat[idx] = np.sum(fourier_coeffs[index]**2, axis=0) / D_hat

print(r'First-order Sobol indices')
print(Sens_m_hat)
print(r'cp.Sens_m: {}'.format(cp.Sens_m(approx_model, dist)))

Sens_m2_hat = np.empty([N_param, N_param])
for idx in range(N_param):
    for jdx in range(N_param):
        index = (idx != jdx) & (alpha[idx, :] > 0) & (alpha[jdx, :] > 0) \
            & (alpha.sum(0) == alpha[idx, :]+alpha[jdx, :])
        Sens_m2_hat[idx, jdx] = np.sum(
            fourier_coeffs[index]**2, axis=0) / D_hat

print(r'Second-order Sobol indices')
print(Sens_m2_hat)
print(r'cp.Sens_m2: {}'.format(cp.Sens_m2(approx_model, dist)))

# Univariate effect
N_linspace = 250
M1_linspace = np.linspace(
    [-np.pi, -np.pi, -np.pi], [np.pi, np.pi, np.pi], N_linspace)

M1 = np.empty([N_param, N_linspace])
for idx in range(N_param):
    index = (alpha[idx, :] > 0) & (alpha.sum(0) == alpha[idx, :])
    evals = (fourier_coeffs * phi)[index](**{f"q{idx}": M1_linspace[:, idx]})
    M1[idx, :] = np.sum(evals, axis=0)

# Plot with matplotlib
import matplotlib.pyplot as plt
idx = 0
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_0$')
idx = 1
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_1$')
idx = 2
plt.plot(M1_linspace[:, idx], M1[idx, :], label=r'$M^1_2$')
plt.xlim([-np.pi, np.pi])
plt.xlabel(r'$\theta_i$')
plt.ylabel(r'$M^1_i$')
plt.legend(loc='lower left')
plt.show()

If I have another issue, I will open a new one; meanwhile, I will close this issue. I appreciate again all your kind comments!