LSSTDESC / CCL

DESC Core Cosmology Library: cosmology routines with validated numerical accuracy
BSD 3-Clause "New" or "Revised" License
145 stars 64 forks source link

GSL integration errors in angular_cl #1129

Closed paulrogozenski closed 7 months ago

paulrogozenski commented 11 months ago

TJPCov calls CCL function angular_cl to generate analytic covariance matrices. We're finding that we run into gsl integration errors in 3x2pt analyses and when increasing the l-binning scheme when calling angular_cl. In particular:

gsl: qag.c:247: ERROR: roundoff error prevents tolerance from being achieved Default GSL error handler invoked. gsl: qag.c:247: ERROR: roundoff error prevents tolerance from being achieved Aborted (core dumped)

elisachisari commented 11 months ago

Try increasing the CCL precision settings: https://ccl.readthedocs.io/en/latest/source/notation_and_other_cosmological_conventions.html That should help removing this error but it will take longer to compute

tilmantroester commented 11 months ago

The integration routines are pretty brittle. E.g. #1133 and many other mentions of GSL errors. We should definitely fix this. Does this cause a python exception or does it crash? And do you have a MWE that reproduces this error?

aguinot commented 10 months ago

Hello,

Sorry to jump in.. (I can create a separate issue if more appropriate) I have encounter a similar error while calling angular_cl. I join a code to reproduce the error. I tried to change pyccl.gsl_params["INTEGRATION_EPSREL"] as suggested by @elisachisari but the error remains. I also tried to enable/disable the spline for the integral but it has no effect. The error raise for a particular combination of parameters. In the exemple if you set b_ia=0 it works or if you change the value of Ob or Om. (The value of the cosmological parameters are not "classic" because they come from a MCMC sampling)

Edit: I forgot to mentioned, this error kills the python kernel so it not possible to grab the error within a python Exception.

I use CCL version 3.0.0 on a linux machine.

import numpy as np
import pyccl as ccl

def get_model(
    theta,
    z,
    nz,
    z_clust,
    nz_clust,
    Omega_c,
    h,
    Omega_b,
    sig8,
    ns,
    b_ia,
    b_g,
):

    theta_deg = theta/60

    #ccl.gsl_params["INTEGRATION_EPSREL"] = 1e-10
    ccl.gsl_params.LENSING_KERNEL_SPLINE_INTEGRATION = False
    cosmo = ccl.Cosmology(
        Omega_c=Omega_c,
        Omega_b=Omega_b,
        h=h,
        n_s=ns,
        sigma8=sig8,

        transfer_function='boltzmann_camb',
        matter_power_spectrum='camb',
        extra_parameters={
            "camb": {
                "halofit_version": 'mead2016',  # 'mead2020',
                "kmax": 21,
                # We do this to avoid warnings in CAMB with (kmax < 5 or
                # 'kmax < 20 and z > 4') (set internally in ccl)
                # default: kmax=10
            }
        }
    )

    lens = ccl.WeakLensingTracer(
        cosmo,
        dndz=(z, nz),
        ia_bias=(z, np.ones_like(z)*b_ia),
        use_A_ia=True,
    )
    count = ccl.NumberCountsTracer(
        cosmo,
        dndz=(z_clust, nz_clust),
        bias=(z_clust, np.ones_like(z_clust)*b_g),
        has_rsd=False,
    )

    l_min = 10
    l_max = 20_000
    l_bin = 500
    ell = np.logspace(np.log10(l_min), np.log10(l_max), l_bin)
    #cl_ss = ccl.angular_cl(cosmo, lens, lens, ell)
    cl_ggl = ccl.angular_cl(cosmo, count, lens, ell)

theta = np.array([
    0.59483, 0.81225, 1.1084, 1.5126, 2.064, 2.8164, 3.8429,
    5.2435, 7.1545, 9.7618, 13.32, 18.173, 24.796, 33.832,
    46.158, 62.979, 85.925,   117.22, 159.93, 218.18
])

z = np.array([
    0.10072152, 0.12760372, 0.15448592, 0.18136812, 0.20825032,
    0.23513252, 0.26201472, 0.28889692, 0.31577912, 0.34266132,
    0.36954352, 0.39642572, 0.42330792, 0.45019012, 0.47707232,
    0.50395452, 0.53083672, 0.55771892, 0.58460112, 0.61148332,
    0.63836552, 0.66524772, 0.69212992, 0.71901212, 0.74589432,
    0.77277652, 0.79965872, 0.82654092, 0.85342312, 0.88030532,
    0.90718752, 0.93406972, 0.96095192, 0.98783412, 1.01471632,
    1.04159852, 1.06848072, 1.09536292, 1.12224512, 1.14912732,
    1.17600952, 1.20289172, 1.22977392, 1.25665612, 1.28353832,
    1.31042052, 1.33730272, 1.36418492, 1.39106712
])
nz = np.array([
    0.12076223, 0.22668162, 0.32414372, 0.40177446, 0.5012348 ,
    0.6374815 , 0.73624094, 0.85682407, 0.97377046, 1.01942278,
    1.08607999, 1.16231719, 1.25094468, 1.28847646, 1.23476165,
    1.21876199, 1.24055086, 1.25808819, 1.30878668, 1.31812902,
    1.27329675, 1.26612347, 1.25315893, 1.20676428, 1.16027188,
    1.09777456, 1.05239238, 1.00567894, 0.96123482, 0.91041423,
    0.82610762, 0.74956655, 0.69118441, 0.63576786, 0.5788433 ,
    0.52193782, 0.48064428, 0.45261756, 0.41313699, 0.37537202,
    0.340209  , 0.29831584, 0.26759665, 0.24262065, 0.21607769,
    0.19149949, 0.1694706 , 0.15380648, 0.14124105
])

z_clust = np.array([
    0.0809284, 0.0968652, 0.112802 , 0.1287388, 0.1446756, 0.1606124,
    0.1765492, 0.192486 , 0.2084228, 0.2243596, 0.2402964, 0.2562332,
    0.27217  , 0.2881068, 0.3040436, 0.3199804, 0.3359172, 0.351854 ,
    0.3677908, 0.3837276, 0.3996644, 0.4156012, 0.431538 , 0.4474748,
    0.4634116, 0.4793484, 0.4952852, 0.511222 , 0.5271588, 0.5430956,
    0.5590324, 0.5749692, 0.590906 , 0.6068428, 0.6227796, 0.6387164,
    0.6546532, 0.67059  , 0.6865268, 0.7024636, 0.7184004, 0.7343372,
    0.750274 , 0.7662108, 0.7821476, 0.7980844, 0.8140212, 0.829958 ,
    0.8458948, 0.8618316
])
nz_clust = np.array([
    0.4349857 , 0.7626042 , 0.77745737, 1.05118007, 0.94890539,
    0.81819749, 0.7821255 , 0.6229844 , 0.70106963, 1.16406417,
    1.22093059, 1.38346671, 0.93490097, 1.09107144, 1.43184561,
    1.31259587, 1.23917877, 1.16024478, 1.58080168, 1.45858131,
    1.13902597, 0.97479234, 1.44924503, 2.32982584, 2.6455618 ,
    3.04362676, 3.0482949 , 3.44041859, 3.4026491 , 3.11364885,
    2.54625775, 2.38796539, 2.36122968, 1.94873593, 1.77898541,
    1.39662237, 1.08173517, 0.96715357, 0.78297426, 0.53810914,
    0.44135134, 0.33525727, 0.26565956, 0.18672557, 0.10906471,
    0.06492957, 0.04625702, 0.02716008, 0.01103378, 0.00636564
])

Oc = 0.08095142
sig8 = 0.8
h = 0.70794897
Ob = 0.04
ns = 0.95
b_ia = 0.55932604
b_g=1.2

model = get_model(
    theta,
    z, nz,
    z_clust, nz_clust,
    Omega_c=Oc,
    h=h,
    Omega_b=Ob,
    sig8=sig8,
    ns=ns,
    b_ia=b_ia,
    b_g=b_g,
)
paulrogozenski commented 10 months ago

I've attached a minimal failing example that produces the error in CCL. By adding noise to the n(z) distribution, CCL runs into GSL integration issues. minimal_failing_example_gsl.zip

chrgeorgiou commented 8 months ago

A quick workaround for this is to change the limber_integration_method in the angular_cl function to 'spline'. This worked for my case but I don't know if it will be slower and/or any less accurate.

davidesciotti commented 8 months ago

It worked for me too :ok_hand:

tilmantroester commented 8 months ago

I think there are cases where the spline integration fails as well.

aguinot commented 8 months ago

I think the fact that the integration fails is not the biggest problem (for me), one can always investigate to understand why it failed. The problem is that it kills the python kernel. That is particularly problematic if, for example, you are running a MCMC and it crashes in the middle. If it is possible to return a python error that can be caught that would be ideal!

tilmantroester commented 8 months ago

Killing the python interpreter is absolutely not intended behaviour. Do you have a minimal example that reproduces this issue? I think @marcpaterno and @vitenti have mentioned these crashes as well.

The integrals have always been a bit iffy but them killing the python kernels seems like a new effect to me. Maybe something changed in GSL?

aguinot commented 8 months ago

The example here crashes my python kernel with the following message:

gsl: qag.c:247: ERROR: roundoff error prevents tolerance from being achieved
Default GSL error handler invoked.
[1]    26328 abort (core dumped)  python3 -c 'import IPython; IPython.terminal.ipapp.launch_new_instance()'

I traced back this error and it looks like this error is supposed to be caught already in the C layer (here).

Setup:

tilmantroester commented 8 months ago

My suspicion is that the GSL error handler is never turned off (which causes the aborts) because this line is never called: https://github.com/LSSTDESC/CCL/blob/master/src/ccl_core.c#L386 @nikfilippas @damonge Did CCL stop calling ccl_parameters_create during the CCLObject rewrite?

vitenti commented 7 months ago

That's exactly the problem, I manually disabled the GSL error handler and got the old behavior back. I just don't know why that call is in ccl_parameters_create, I traced this function through the latest CCL and found call to this function in src/ccl_halofit.c and src/ccl_musigma.c. I suppose that this call to disable the GSL internal error handler error should take place at the cosmology initialization.

tilmantroester commented 7 months ago

This should hopefully been fixed with #1155. If not, please reopen.