LSSTDESC / CCL

DESC Core Cosmology Library: cosmology routines with validated numerical accuracy
BSD 3-Clause "New" or "Revised" License

Observable calculator mode #840

Closed: damonge closed this issue 3 years ago

damonge commented 3 years ago

The issue

It is currently hard for CCL to support arbitrary non-standard cosmologies. E.g. imagine we're told by someone in TJP that they want to explore an early Dark Energy model with neutrinos, warm dark matter and Vainshtein screening, and accounting for baryons using HMCODE. Including each of those consistently in CCL would take a lot more work than we can support. These things, however, are implemented in different versions of CAMB/CLASS etc.

The proposed solution

We enable a "calculator mode" for CCL so it can take in basic observables (distances, power spectra) from these and use them consistently to carry out calculations of more complex quantities (shear/galaxy power spectra, correlation functions, halo model stuff). This would also make CCL easier to integrate into likelihood frameworks like cobaya or cosmosis. When we've discussed this in the past, it was well received, so I'm hoping this is not too controversial.

The specifics

Here's a concrete proposal of what the new API would look like. There's a chance that none of the previous API would have to be deprecated, but we'll see.

1. Declaring a Cosmology

Although ideally we would like the CCL Cosmology object to not care about cosmological parameters when in "calculator mode", in practice many of the higher-level calculations we do (e.g. halo model) do need to know e.g. what Omega_M is. So, instead of subclassing the Cosmology class into a calculator-mode object, I propose that we just add 3 more arguments to its __init__ method, which will trigger the calculator mode:

The default value for all these arguments will be None (i.e. the current standard behavior in CCL).
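To make this concrete, here is a minimal sketch of what such a call might look like. The argument names (background, growth, pk_linear) and dictionary keys are illustrative guesses, not a final API, and the arrays are toy placeholders standing in for the output of an external Boltzmann code:

import numpy as np
import pyccl as ccl

# Toy placeholders for quantities computed by an external Boltzmann code.
a = np.linspace(0.1, 1.0, 64)                 # scale factor, increasing
k = np.logspace(-4, 1, 256)                   # wavenumber [1/Mpc]
chi = 3000.0 * (1.0 / a - 1.0)                # toy comoving distance [Mpc]
hoh0 = np.sqrt(0.3 / a**3 + 0.7)              # toy H(a)/H0 for flat LCDM
pk = np.outer(a**2, 1e3 * (k / 0.1)**-1.5)    # toy linear P(k, a) [Mpc^3]

# Hypothetical calculator-mode constructor: the extra arguments default
# to None, which recovers the current standard behavior.
cosmo = ccl.Cosmology(
    Omega_c=0.25, Omega_b=0.05, h=0.7, n_s=0.96, sigma8=0.8,
    background={'a': a, 'chi': chi, 'h_over_h0': hoh0},
    growth={'a': a, 'growth_factor': a, 'growth_rate': np.ones_like(a)},
    pk_linear={'a': a, 'k': k, 'delta_matter:delta_matter': pk},
)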

2. Impact on the rest of the API

All CCL functions that currently depend on a P(k) will take an optional p_of_k_a argument (the same way that angular_cl currently does), specifying the power spectrum that should be used for that calculation. This argument can then take 3 possible values:

A quick scan of the current API tells me these functions would be affected: angular_cl, correlation_3d, correlation_multipole, correlation_3dRsd, correlation_3dRsd_avgmu, correlation_pi_sigma, sigmaR, sigmaV, sigma8, kNL, halomod_power_spectrum, halomod_Pk2D.
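For reference, a short sketch of the existing pattern in angular_cl that this would generalize. Here get_nonlin_power stands in for any user-built Pk2D object (its availability may vary across CCL versions):

import numpy as np
import pyccl as ccl

cosmo = ccl.Cosmology(Omega_c=0.25, Omega_b=0.05, h=0.7,
                      n_s=0.96, sigma8=0.8)
z = np.linspace(0.0, 1.5, 128)
nz = np.exp(-0.5 * ((z - 0.6) / 0.1)**2)       # toy source redshift bin
tracer = ccl.WeakLensingTracer(cosmo, dndz=(z, nz))
ells = np.arange(2, 1000)

# Default: the cosmology's internal non-linear matter power spectrum.
cl_default = ccl.angular_cl(cosmo, tracer, tracer, ells)

# Explicit Pk2D object: any externally built power spectrum can be used.
pk2d = cosmo.get_nonlin_power()
cl_custom = ccl.angular_cl(cosmo, tracer, tracer, ells, p_of_k_a=pk2d)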

linear_matter_power and nonlin_matter_power will continue to return the matter power spectrum (delta_matter_x_delta_matter).

sigmaM will always use the linear matter power spectrum, since this is what is used for halo model quantities.

Thoughts? @c-d-leonard @slosar @beckermr @elisachisari @mishakb and anyone else.

beckermr commented 3 years ago

A few questions:

  1. How do we handle, if at all, non-flat cosmologies?

  2. hoh0 is hard to look at. Maybe ha or hubble_function or w/e?

damonge commented 3 years ago

Good question. My thought was that we know when it's not flat based on the input parameters, and we use the corresponding sinn(chi) where necessary (which CCL already does). We could instead add r (i.e. the angular comoving distance) as an alternative additional item of the background dictionary (and, if not provided, then CCL's sinn(chi) is used).
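For context, a standalone sketch of the standard sinn(chi) relation (the textbook formula, not CCL's internal implementation; dh denotes the Hubble distance c/H0):

import numpy as np

def sinn_distance(chi, omega_k, dh=2997.9):
    """Comoving angular distance r(chi) for open/flat/closed geometries.
    chi and dh (the Hubble distance c/H0, here for h=1) are in Mpc."""
    if omega_k > 0:    # open universe: sinh
        return dh / np.sqrt(omega_k) * np.sinh(np.sqrt(omega_k) * chi / dh)
    if omega_k < 0:    # closed universe: sin
        return dh / np.sqrt(-omega_k) * np.sin(np.sqrt(-omega_k) * chi / dh)
    return chi         # flat universe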

About hoh0: yep, definitely.

marcpaterno commented 3 years ago

Hello,

I have an implementation-choice question: why use dict as the type of each of the new arguments, rather than making each a class (if any additional behavior beyond attribute lookup might be added in the future) or a namedtuple (if it is clear that no behavior will ever be needed)?

I think both the class and the namedtuple make it clearer that each argument is required to have exactly the named attributes, and no others. This helps readers of the code as well as tools such as IDEs, which can better provide name completion.
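To illustrate the contrast, a minimal sketch (the field names are illustrative only, not a proposed CCL schema):

import numpy as np
from typing import NamedTuple

class Background(NamedTuple):
    """Typed container: the required fields are self-documenting and
    IDE-completable, and the documentation lives with the type."""
    a: np.ndarray          # scale factor samples
    chi: np.ndarray        # comoving radial distance [Mpc]
    h_over_h0: np.ndarray  # expansion rate H(a)/H0

a = np.linspace(0.1, 1.0, 8)
bg = Background(a=a, chi=3000.0 * (1.0 / a - 1.0),
                h_over_h0=np.sqrt(0.3 / a**3 + 0.7))

# The dict-based alternative carries the same data, but the required
# keys are only a documented convention.
bg_dict = {'a': bg.a, 'chi': bg.chi, 'h_over_h0': bg.h_over_h0}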

Best regards, Marc


beckermr commented 3 years ago

Classes and named tuples introduce mental overhead for users (i.e., which class do I need to pass here? How do I make that class?)

They are also more difficult to serialize.

Nearly everyone using python will know how to make a dictionary quickly.
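A small example of the serialization point: a dict of plain lists round-trips through JSON with no custom machinery, which matters for likelihood-framework configuration files:

import json
import numpy as np

background = {'a': np.linspace(0.1, 1.0, 4).tolist(),
              'chi': [27000.0, 12000.0, 5000.0, 0.0]}
restored = json.loads(json.dumps(background))
assert restored == background  # a class would need a custom encoder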

slosar commented 3 years ago

I think in general this is a good, well-thought-out abstraction, and in particular it allows you to have fields beyond matter (y, etc.). I think it is OK to leave a few named parameters with special meanings, i.e. Om is used for the lensing kernel and Ok to apply sin/sinh as appropriate. I think with this we can make a fairly general "calculator". A few things:

marcpaterno commented 3 years ago

Hi again,

I will concentrate on what I see as the advantages of using a class, rather than a dictionary with specified keys and value types, because I think it is actually a better solution than the namedtuple. I am ignoring issues of syntax here, because I think they are relatively minor.

I think for the reader (and code is read more often than written), a well-named and documented class introduces less mental overhead than a dictionary does. The class stresses what the nature of the thing is: for example, a linear power spectrum, or a non-linear power spectrum, etc. Such a type gets described in one place. In an interactive session, the user can query the object and find this documentation; development tools allow access to the same information, and automated tools can perform name completion on data members. The type has meaning.

Use of a dictionary does not provide these advantages.

Best regards, Marc

From: "Matthew R. Becker" notifications@github.com Reply-To: LSSTDESC/CCL reply@reply.github.com Date: Monday, December 14, 2020 at 9:25 AM To: LSSTDESC/CCL CCL@noreply.github.com Cc: Marc F Paterno paterno@fnal.gov, Comment comment@noreply.github.com Subject: Re: [LSSTDESC/CCL] Observable calculator mode (#840)

Classes and named tuples introduce mental overheads for users (i.e. which class do I need to pass here? how do I make that class?)

They are also more difficult to serialize.

Nearly everyone using python will know how to make a dictionary quickly.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_LSSTDESC_CCL_issues_840-23issuecomment-2D744514216&d=DwMCaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=d3p0PLQb2ItFxNVHOOIv2g&m=TR3SPI-yitwRFlaAgTRkRAZJ0HWIIM2FQhMQSHHGU1Y&s=fw_T0_WbmZdrSGeYPuSYR_utytQ1_jlB22Pop8ByhAo&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AABMJXZB73EEUMMBYIWIMZTSUYU5NANCNFSM4U22QUPA&d=DwMCaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=d3p0PLQb2ItFxNVHOOIv2g&m=TR3SPI-yitwRFlaAgTRkRAZJ0HWIIM2FQhMQSHHGU1Y&s=y9PfwF_fkH_7mbRIgfb-ny8AsYdON5n7zcjgSsAHDDY&e=.

vivianmiranda commented 3 years ago

Hello David,

The "calculator mode" is precisely the conversion we have implemented here at UofA on Cosmolike to integrate it with the Cobaya sampler and evaluate more general cosmologies (a project we called Cocoa - https://github.com/CosmoLike/cocoa). Cosmolike receives the power spectrum (both linear and non-linear) and background quantities (like distances), and then performs further complex calculations like shear/clustering.

One technical aspect we faced is that functions that used to be super fast, like H(z), which was previously analytical, are now splines (much slower); depending on how deep these calls sit in the code, this can make a noticeable difference in execution time.
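A toy illustration of this tradeoff (timings are machine-dependent; the point is only that spline lookups cost more per scalar call than a closed-form expression):

import timeit
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

def e_analytic(z, om=0.3):
    """Closed-form E(z) = H(z)/H0 for flat LCDM."""
    return np.sqrt(om * (1.0 + z)**3 + 1.0 - om)

zgrid = np.linspace(0.0, 3.0, 256)
e_spline = InterpolatedUnivariateSpline(zgrid, e_analytic(zgrid))

t_ana = timeit.timeit(lambda: e_analytic(1.0), number=100_000)
t_spl = timeit.timeit(lambda: e_spline(1.0), number=100_000)
print(f"analytic: {t_ana:.3f} s, spline: {t_spl:.3f} s per 1e5 calls")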

Another important technical aspect was threading: extensions to LCDM are slower to calculate, and we don't want the walkers in our chains to get stuck near their initial conditions, so OpenMP threading was required, and that meant some significant changes.

It took us many months to apply all these changes; based on our experience with Cosmolike, doing this in CCL is a significant infrastructure project (awesome, but hard!).

Best regards, Vivian Miranda

vivianmiranda commented 3 years ago

Hi Again,

Below, I show an example of the changes we had to perform. The weights in the weak-lensing evaluations are deep inside integrals. With excessive spline evaluations, the code was slowed down by an O(1) factor. To fix that, we had to expand the inputs of the function calls to avoid repeated interpolation evaluations:

double W_gal(double a, double nz, double chi, double hoverh0)

Here is a snippet of one integrand in clustering (Cosmolike)

/* Integrand for the tomographic clustering C_ell (Cosmolike). Background
   quantities are interpolated once per call and passed down explicitly. */
double int_for_C_cl_tomo(double a, void *params) {
  double *ar = (double *)params;
  struct chis chidchi = chi_all(a);                     /* chi(a) and dchi/da in one lookup */
  const double hoverh0 = hoverh0v2(a, chidchi.dchida);  /* H(a)/H0, reusing dchi/da */
  const double ell = ar[2] + 0.5;                       /* Limber: effective ell + 1/2 */
  const double fK = f_K(chidchi.chi);                   /* comoving angular distance */
  const double k = ell / fK;

  /* Both tomographic kernels share the precomputed chi and H/H0. */
  const double res = W_gal(a, ar[0], chidchi.chi, hoverh0) *
    W_gal(a, ar[1], chidchi.chi, hoverh0) * chidchi.dchida / (fK * fK);
  return res * Pdelta(k, a);                            /* times matter power */
}

There is no need to understand the details of the entire function; just note the effort to avoid repeated interpolation of background quantities, at the cost of readability.

In summary: the calculator mode may require big API changes in the code to match the original CCL speed requirements (based on our experience with Cosmolike), and such changes can make CCL harder to maintain and trickier to expand. That is a choice (optimized code vs. readable code). When doing an extension to LCDM, the speed requirement is so severe that the code needs to be more optimal.

jablazek commented 3 years ago

@slosar : is your question about RSD corrections specific to calculator mode (e.g. how to specify and pass in arbitrary P(k,mu) or multipoles), or more generally about CCL doing RSD calculations? I ask b/c @sukhdeep2 and I have been discussing adding functionality to the correlation functions that would help with RSD.

jablazek commented 3 years ago

@damonge : I assume we should still continue with the much lighter-weight general correlation function implementation (#815)

damonge commented 3 years ago

I'm really sorry for missing the discussion today, and for the delay in responding. Thanks, everyone, for your comments. Here are some thoughts:

@marcpaterno thanks a lot. While I see your point, I'm gonna side with @beckermr on this one. First, I actually agree that there's a mental overhead to classes for users. Second, having to write code for those classes will mean having to maintain that code, and I'd rather not have to do that for now. Nevertheless, it should be easy to allow the class option if we think it works better afterwards.

@vivianmiranda thanks a lot, this is really useful information. On the spline-vs-analytical tradeoff, I think we've gone far enough down the spline route in CCL that the current performance will probably not change much after including these modifications. That doesn't mean that CCL is as fast as it could be, of course, and knowing that this kind of stuff played a role is useful.

I didn't understand this point; could you elaborate?

Another important technical aspect was threading: extensions to LCDM are slower to calculate, and we don't want the walkers in our chains to get stuck near their initial conditions, so OpenMP threading was required, and that meant some significant changes.

@slosar

@jablazek

vivianmiranda commented 3 years ago

@damonge, of course, I am happy to elaborate on my point. For reference, I will quote the paragraph again.

Another important technical aspect was threading: extensions to LCDM are slower to calculate, and we don't want the walkers in our chains to get stuck near their initial conditions, so OpenMP threading was required, and that meant some significant changes.

Let's do this thought experiment:

Pipeline 1: total runtime for 3x2 is ~O(1) second w/ 4 cores (OpenMP).
Pipeline 2: total runtime for 3x2 is ~O(10^2) seconds (no OpenMP threading available).

You use 400 cores to run a single chain on pipeline 1 and 10^4 cores to run a single chain on pipeline 2.

You then ask emcee to give you 10M samples.

Both pipelines should take about the same time to produce the 10M samples in these configurations.
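Making the arithmetic explicit (assuming walkers map one-to-one onto the available cores, in groups of cores-per-evaluation):

n_samples = 10_000_000

# Pipeline 1: 1 s/eval on 4 OpenMP cores; 400 cores -> 100 parallel walkers.
steps_per_walker_1 = n_samples // 100        # 100,000 steps per walker
wall_time_1 = steps_per_walker_1 * 1.0       # 1e5 seconds

# Pipeline 2: 100 s/eval, single-core; 10^4 cores -> 10^4 parallel walkers.
steps_per_walker_2 = n_samples // 10_000     # 1,000 steps per walker
wall_time_2 = steps_per_walker_2 * 100.0     # 1e5 seconds

# Same wall time, but pipeline 2's walkers each take 100x fewer steps,
# so they may never travel many correlation lengths from their start.
print(wall_time_1 == wall_time_2, steps_per_walker_1 // steps_per_walker_2)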

Given all that, my point is: I suspect the convergence tests for the chain run with pipeline two will give much worse results (the standard test basically checks whether the walkers have traveled many times their correlation length), because the walkers are just stuck in volumes of parameter space around their initial conditions; each evaluation is so slow that there is no way to move past the IC location even if they want to!

Given that, if CCL is already O(20s) and the calculator mode increases that runtime even more, that would worry me. That is also the reason I worry a lot about runtime increases for extensions to LCDM.

Best Vivian

vivianmiranda commented 3 years ago

@damonge - one more point: the counter-argument to my previous message is:

"Vivian (talking to myself in the third person - I love to do that pretending I have multiple personalities to discuss physics while at home :p) - but w/ more MPI threads each walker will walk smarter because Emcee allows walkers to talk to each other" - qualitatively, that is true, but this is a fine line. Too slow - and there no way you won't get stuck around IC locations even if you walk smarter.

Best Vivian

damonge commented 3 years ago

@vivianmiranda ok, thanks. I don't think CCL is O(20s), but regardless of that, this is a good point. We should add a timing unit test to make sure performance doesn't suffer when we add this (which we should have done already).
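One hypothetical shape for such a test (not part of CCL's suite; the threshold is a deliberately loose guess to tolerate CI machine noise):

import time
import numpy as np
import pyccl as ccl

def test_angular_cl_timing():
    """Fail if a representative calculation gets dramatically slower."""
    cosmo = ccl.Cosmology(Omega_c=0.25, Omega_b=0.05, h=0.7,
                          n_s=0.96, sigma8=0.8)
    z = np.linspace(0.0, 1.5, 128)
    nz = np.exp(-0.5 * ((z - 0.6) / 0.1)**2)
    tracer = ccl.WeakLensingTracer(cosmo, dndz=(z, nz))
    ells = np.unique(np.geomspace(2, 2000, 64).astype(int))

    t0 = time.perf_counter()
    ccl.angular_cl(cosmo, tracer, tracer, ells)
    assert time.perf_counter() - t0 < 30.0   # generous ceiling, in seconds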

slosar commented 3 years ago

Ok, I think we should not sidetrack this into an emcee performance discussion. (You have a valid point; it is well known that MCMC algorithms don't scale beyond a certain point, because the computation cost per node is to take Nburnin + Nsamples/Nnodes samples, and once the first term dominates the game is over.)

@damonge Yes, I missed the beauty of having arbitrary fields. This makes it very general. Also, I got temporarily sidetracked into thinking this is about enabling more models, but it is primarily about enabling integration with Cosmosis and Cobaya, with the caching living upstream. It is still worth thinking about whether we can do any further caching inside CCL's calculator mode. Say Pklin changes, but the expansion history doesn't? I don't think it makes any difference, but you never know...

jablazek commented 3 years ago

@damonge :

For the RSD issue, after reading your comments to @slosar, I think we are talking about slightly orthogonal issues. We are looking at expanding the correlation function capability to do fast calculations including RSD contributions. Of course the velocity Pk will need to come from somewhere, but we are agnostic about that for the time being.

Regarding #815, I think the earlier consensus was that it would be good to expose all of the correlation functions (not just the 2D C_ell -> w(theta) one) to a user-specified (k, Pk), rather than being limited to the cosmology object's internally defined Pmatter. The modifications for implementing calculator mode may do this anyway, but I think this is a capability that should be available without using a separate calculator API.

damonge commented 3 years ago

OK, great.

For RSD in correlation functions: in principle I guess you'll need P_gg, P_gv and P_vv. If you can come up with an API for ingesting those and giving you the right correlation function, I don't see any problems with the calculator mode.
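For context (not a proposal from this thread): in the linear Kaiser limit the three spectra combine as P(k, mu) = P_gg + 2 mu^2 P_gv + mu^4 P_vv, and the correlation multipoles follow from the corresponding power spectrum multipoles. A minimal sketch of that combination:

import numpy as np

def pk_multipoles_kaiser(pk_gg, pk_gv, pk_vv):
    """Legendre multipoles of P(k, mu) = P_gg + 2 mu^2 P_gv + mu^4 P_vv,
    where e.g. P_gg = b^2 P, P_gv = b f P, P_vv = f^2 P in linear theory.
    Inputs are arrays sampled on a common k grid."""
    p0 = pk_gg + (2.0 / 3.0) * pk_gv + (1.0 / 5.0) * pk_vv   # monopole
    p2 = (4.0 / 3.0) * pk_gv + (4.0 / 7.0) * pk_vv           # quadrupole
    p4 = (8.0 / 35.0) * pk_vv                                # hexadecapole
    return p0, p2, p4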

Re #815, yes, I agree. This can happen independently or as part of the calculator revamping. The result should be the same.

damonge commented 3 years ago

For anyone keeping an eye, the PR that implements this is ready: #843