maho3 / ltu-gobig


Quijote 1 Gpc/h inference #22

Open maho3 opened 3 months ago

maho3 commented 3 months ago

This issue is solely to keep notes on analyzing the 1 Gpc/h inference using Quijote 1 Gpc/h ground truths.

Notebook: https://github.com/maho3/ltu-gobig/blob/run_experiments/experiments/quijotelike/inference_quijote.ipynb
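For reference on the P(k) monopole diagnostics below, a minimal shell-averaged estimator can be sketched with a plain FFT. This is a toy version; the notebook's actual estimator, binning, and shot-noise handling may differ:

```python
import numpy as np

def pk_monopole(delta, boxsize, n_bins=16):
    """Shell-averaged P(k) of a periodic overdensity grid delta (n, n, n).

    boxsize in Mpc/h gives P(k) in (Mpc/h)^3 and k in h/Mpc.
    """
    n = delta.shape[0]
    kf = 2.0 * np.pi / boxsize                       # fundamental mode
    delta_k = np.fft.rfftn(delta)
    kx = np.fft.fftfreq(n) * n * kf                  # integer modes * kf
    kz = np.fft.rfftfreq(n) * n * kf
    kmag = np.sqrt(kx[:, None, None] ** 2 +
                   kx[None, :, None] ** 2 +
                   kz[None, None, :] ** 2)
    # Standard FFT power-spectrum normalization: P = V |delta_k|^2 / N^2
    power = np.abs(delta_k) ** 2 * boxsize ** 3 / n ** 6
    edges = np.linspace(kf, kmag.max(), n_bins + 1)
    idx = np.digitize(kmag.ravel(), edges)
    pk = np.array([power.ravel()[idx == i].mean() if (idx == i).any() else np.nan
                   for i in range(1, n_bins + 1)])
    return 0.5 * (edges[:-1] + edges[1:]), pk

# Sanity check: for unit-variance white noise, P(k) equals the cell volume
rng = np.random.default_rng(0)
k, pk = pk_monopole(rng.normal(size=(32, 32, 32)), boxsize=100.0)
```

For unit-variance white noise in a 100 Mpc/h box on a 32^3 grid, the recovered P(k) should scatter around the cell volume (100/32)^3 ≈ 30.5 (Mpc/h)^3, which is a quick check that the normalization is right.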

maho3 commented 3 months ago

Configuration

P(k) monopole coverage

PlotSinglePosterior

Posterior Coverage

Notes
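A note on how coverage plots like these are computed: per credible level, count how often the ground truth falls inside the central credible interval of the posterior. A minimal sketch with a toy posterior that is calibrated by construction (all names here are illustrative, not from the notebook):

```python
import numpy as np

def empirical_coverage(samples, truths, levels):
    """Empirical coverage of central credible intervals.

    samples: (n_test, n_draws) posterior draws for one parameter
    truths:  (n_test,) ground-truth values
    levels:  credible levels to test, e.g. [0.68, 0.95]
    """
    cov = []
    for a in levels:
        lo = np.quantile(samples, 0.5 - a / 2, axis=1)
        hi = np.quantile(samples, 0.5 + a / 2, axis=1)
        cov.append(np.mean((truths >= lo) & (truths <= hi)))
    return np.array(cov)

# Calibrated toy posterior: truths and draws share the same distribution
rng = np.random.default_rng(0)
truths = rng.normal(size=2000)
samples = rng.normal(size=(2000, 1000))
levels = np.array([0.2, 0.5, 0.8, 0.95])
coverage = empirical_coverage(samples, truths, levels)
```

A calibrated posterior traces the diagonal (empirical coverage ≈ credible level); deviations indicate over- or under-confident posteriors.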

bwandelt commented 3 months ago

is this because CHARM was trained on fastpm sims rather than borgpm?

maho3 commented 3 months ago

Yes, that is the case, and it would explain the power bias of the halo field. Given a z=99 field with too much power (as in the BORG case), the halo emulator would infer less evolution, and thus less clustering.

bwandelt commented 3 months ago

that would explain why sigma_8 is underestimated. But omega_m seems to be more robust...

maho3 commented 3 months ago

I discovered today that this^ was probably not primarily driven by a CHARM error, but by the fact that CHARM halos were trained to recover the statistics of Rockstar halos in Quijote, whereas we were using the Quijote FoF catalog as a reference, as in https://github.com/maho3/ltu-gobig/blob/main/scripts/preprocess_existing_sims/quijote.py

I am currently converting the Quijote Rockstar catalog to our framework for a better comparison.
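One way to see why the reference catalog matters: FoF and Rockstar assign systematically different masses to the same halos, which shifts the mass function and hence the clustering of a mass-selected sample. A toy sketch (the 0.05 dex offset and the synthetic masses are purely illustrative, not measured values):

```python
import numpy as np

def mass_function(log10_mass, boxsize, edges):
    """Halo mass function dn/dlog10M [h^3/Mpc^3/dex] in a periodic box."""
    counts, _ = np.histogram(log10_mass, bins=edges)
    return counts / (boxsize ** 3 * np.diff(edges))

rng = np.random.default_rng(0)
logm_rockstar = 13.0 + rng.exponential(0.3, size=5000)   # synthetic masses
logm_fof = logm_rockstar + 0.05                          # illustrative offset
edges = np.linspace(13.0, 15.0, 11)
hmf_rockstar = mass_function(logm_rockstar, 1000.0, edges)
hmf_fof = mass_function(logm_fof, 1000.0, edges)
```

At a fixed mass cut or number density, the two catalogs then select slightly different halo populations, which is enough to bias an emulator trained on one and tested against the other.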

maho3 commented 3 months ago

This was the issue! Testing now on the Quijote Rockstar catalogs, we have:

Configuration

Coverage

Loss

$k_{\rm min} = 0.1$ inference

$k_{\rm min} = 0.2$ inference

$k_{\rm min} = 0.4$ inference

$k_{\rm min} = 0.6$ inference

Notes
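The $k_{\rm min}$ variants above amount to masking bandpowers before they enter the inference; a minimal sketch (the function name and array shapes are illustrative, not from the repo):

```python
import numpy as np

def apply_scale_cut(k, pk, k_min=0.0, k_max=np.inf):
    """Keep only bandpowers with k_min <= k <= k_max."""
    mask = (k >= k_min) & (k <= k_max)
    return k[mask], pk[..., mask]

k = np.linspace(0.02, 1.0, 50)        # hypothetical k bins in h/Mpc
pk = np.ones((10, 50))                # 10 hypothetical summary vectors
k_cut, pk_cut = apply_scale_cut(k, pk, k_min=0.2)
```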

maho3 commented 3 months ago

Full HOD inference at $k_{\rm min}=0.2$! This is done by training on 10 HOD parameter sets per FastPM simulation, for 2000 simulations.

All the test points are Quijote halos populated with the fiducial HOD configuration, hence the small range of scales in the full coverage plots.

There's still something odd going on with the sigma_8 predictions, presumably because the model is picking up on the CHARM discrepancy at high k, as before. I expect that once we fix CHARM, the sigma_8 and HOD constraints will improve.
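The "10 HOD params per simulation" setup can be sketched as independent uniform draws from per-parameter prior ranges. The names follow a standard Zheng et al.-style 5-parameter HOD; the ranges below are placeholders, not the values actually used in the run:

```python
import numpy as np

# Hypothetical prior ranges (placeholders, not the run's actual priors)
HOD_PRIORS = {
    "logMmin":    (12.5, 14.0),
    "sigma_logM": (0.1, 0.6),
    "logM0":      (12.0, 14.0),
    "logM1":      (13.0, 15.0),
    "alpha":      (0.5, 1.5),
}

def draw_hod_params(n_draws, seed=0):
    """Uniform draws; each row is one HOD parameter set."""
    rng = np.random.default_rng(seed)
    lo = np.array([v[0] for v in HOD_PRIORS.values()])
    hi = np.array([v[1] for v in HOD_PRIORS.values()])
    return lo + (hi - lo) * rng.uniform(size=(n_draws, len(HOD_PRIORS)))

# 10 HOD draws per simulation, for 2000 simulations
theta = np.stack([draw_hod_params(10, seed=s) for s in range(2000)])
```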

tlmakinen commented 1 month ago

From the LtU meeting: comparing Pinocchio and CHARM halo P(k) inference on the Quijote DM test set, for different k_max cuts.

tlmakinen commented 1 month ago

Some poor coverage tests for CHARM (albeit with an incorrect min_logM setting):

for k_max=0.05 (not great):

for k_max=0.2 (worse):

tlmakinen commented 1 month ago

Did a cross-check of the analysis and re-did the coverage tests for the CHARM simulations at different k_max cuts. The test suite here comprises 500 Quijote simulations.

for k_max=0.05:

for k_max=0.2:
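The rank-statistic view of these coverage tests (simulation-based calibration) is easy to sketch: the rank of each truth among its posterior draws should be uniform for a calibrated posterior. The toy posterior below is calibrated by construction, purely to show the mechanics:

```python
import numpy as np

def sbc_ranks(samples, truths):
    """Rank of each true value among its posterior draws (SBC statistic)."""
    return np.sum(samples < truths[:, None], axis=1)

# Calibrated toy posterior: truths and draws share the same distribution
rng = np.random.default_rng(0)
truths = rng.normal(size=500)
samples = rng.normal(size=(500, 200))
ranks = sbc_ranks(samples, truths)
# A calibrated posterior gives a flat rank histogram; a U-shape means
# over-confident posteriors, a central hump means under-confident ones
hist, _ = np.histogram(ranks, bins=10, range=(0, 200))
```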

tlmakinen commented 1 month ago

For reference, we can check the inference coverage of Pinocchio tested on Quijote, to see how well cosmology is captured as smaller scales are included.

for k_max=0.05:

Remark: Pinocchio does OK on large scales.

for k_max=0.2:

NOTE that the coverage here failed (we were unable to sample properly over the full test suite, which implies that Pinocchio lines up poorly with Quijote). A sketch of the poor coverage for 88 test simulations:


Remark: Pinocchio does poorly on smaller scales.