alessiospuriomancini / cosmopower

Machine Learning - accelerated Bayesian inference
https://alessiospuriomancini.github.io/cosmopower
GNU General Public License v3.0

TE emulation accuracy issues #17

Closed: alexander-reeves closed this issue 1 year ago

alexander-reeves commented 1 year ago

Dear Dr Spurio Mancini,

Firstly, many thanks for making cosmopower so easy to use; it's a really nice package! I am having some trouble getting good accuracy from a CMB TE emulator and was wondering if you could give me some pointers on how to improve this. The outline of the notebook I am using to train is as follows:

1. Load in training/test data produced with CLASS: ~600,000 points from a Sobol sequence over the parameter space [h, omega_m, omega_b, n_s, sigma8, tau_reio, A_lens, m_nu].
2. Define the cosmopower PCA with 512 components (following the prescription in the paper), using `cosmopower_PCAplusNN`; see the sketch after this list.
3. Train the model with the following specifications (again, I tried to match the batch sizes and learning rates to the description in the paper, but maybe there is something I am missing here):

```python
# training call (instance name illustrative; data arguments omitted
# in the original snippet)
cp_pca_nn.train(...,
                # cooling schedule
                validation_split=0.1,
                learning_rates=[1e-2, 1e-3, 1e-4, 1e-5, 1e-6],
                batch_sizes=[1024, 2048, 4096, 10000, 50000],
                gradient_accumulation_steps=[1, 1, 1, 1, 1],
                # early stopping set-up
                patience_values=[100, 100, 100, 100, 100],
                max_epochs=[1000, 1000, 1000, 1000, 1000],
                )
```

4. Test the trained model against the test data. At this stage, it is clear that while the NN behaves well for some input parameters, it is way off for others (see the attached plot, in the style of your example notebooks).
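
For concreteness, here is a minimal sketch of the set-up in step 2 that precedes the training call above, following the cosmopower example notebooks (variable and file names are illustrative):

```python
from cosmopower import cosmopower_PCA, cosmopower_PCAplusNN

# PCA compression of the spectra onto 512 components, as in the paper.
# ell_range, params_files and features_files are illustrative names.
cp_pca = cosmopower_PCA(parameters=['h', 'omega_m', 'omega_b', 'n_s',
                                    'sigma8', 'tau_reio', 'A_lens', 'm_nu'],
                        modes=ell_range,                    # multipole grid
                        n_pcas=512,                         # number of PCA components
                        parameters_filenames=params_files,  # training parameter files
                        features_filenames=features_files,  # training spectra files
                        )
cp_pca.transform_and_stack_training_data()  # compress spectra onto the PCA basis

# neural network acting on the PCA coefficients
cp_pca_nn = cosmopower_PCAplusNN(cp_pca=cp_pca,
                                 n_hidden=[512, 512, 512, 512],
                                 )
```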

I am aware that introducing A_lens and m_nu probably means I need more training data than in your LCDM set-up, so I am currently producing more; in the meantime, is there anything else you notice that I could change to improve the accuracy?

Best wishes and many thanks in advance,

Alex Reeves

examples_reconstruction_PP.pdf

alessiospuriomancini commented 1 year ago

Hi @alexreevesy, thanks for your detailed question and for your interest in CP! Have you tried using a (relatively) small, fixed batch size, e.g. 1024, as in your first learning step? Let me know if this helps.
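
Concretely, that would mean replacing the batch-size schedule in your training call with a fixed value, e.g.:

```python
# Same cooling schedule as above, but with a fixed batch size at every
# learning-rate step (instead of [1024, 2048, 4096, 10000, 50000]):
batch_sizes = [1024, 1024, 1024, 1024, 1024]
```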

dpiras commented 1 year ago

Hi Alex,

many thanks for getting in touch, and for your interest in cosmopower!

The problem you are describing is definitely harder, so having played with TE spectra a bit I am not surprised the emulator might require more data, a better preprocessing, or both. On top of what @alessiospuriomancini suggested above, a few more possible directions to explore:

- trying to reproduce the vanilla LCDM results from the paper, to validate the rest of your pipeline;
- experimenting with the preprocessing applied to the TE spectra;
- varying the number of PCA components (n_pca).

Let us know if any of this is helpful!

alexander-reeves commented 1 year ago

Dear all,

Many thanks for getting back to me!

I tried with the smaller batch size and, whilst the best validation loss decreased slightly (0.016 -> 0.014), there is still quite a lot of discrepancy in the spectra. @dpiras many thanks for your suggestions: it definitely makes sense to try to reproduce the results in the vanilla LCDM case. I am going to produce some training data to try this, and in parallel I will play around with the preprocessing and n_pca to see if something helps. The TE spectrum in particular seems difficult: I have had no problems with EE or TT using the same set of parameters and training data. Perhaps this is to do with the oscillations around zero and the large dynamic range of the data when no preprocessing is applied?

I'll keep you updated on the above.

Cheers,

Alex

alexander-reeves commented 1 year ago

One other thing that would be helpful, if possible, is a better idea of what accuracy I should aim for. Would it be possible to send a file containing the SO errors plus cosmic variance, such that I can reproduce the plots in the paper, or just an idea of the level of agreement with CAMB/CLASS you achieved for TE (O(1%), O(0.1%), etc.)? Many thanks :)

dpiras commented 1 year ago

> The TE spectrum in particular seems difficult: I have had no problems with EE or TT using the same set of parameters and training data. Perhaps this is to do with the oscillations around zero and the large dynamic range of the data when no preprocessing is applied?

I agree it is not too surprising that the TE spectrum requires some different handling, exactly for the reasons you listed. Let us know if you find better preprocessing steps, and how you get along with your tests.

> One other thing that would be helpful, if possible, is a better idea of what accuracy I should aim for. Would it be possible to send a file containing the SO errors plus cosmic variance, such that I can reproduce the plots in the paper, or just an idea of the level of agreement with CAMB/CLASS you achieved for TE (O(1%), O(0.1%), etc.)? Many thanks :)

Unless I'm missing something, I think you can get the SO TE errors by adapting what is done in this notebook together with Eq. (13) in the cosmopower paper. I vaguely remember a >=1% level of agreement for TE spectra with respect to CLASS, but we focussed on the significance with respect to instrumental noise and cosmic variance, showing that what we achieve is enough not to alter the final contours. In any case, I would not expect any visible discrepancy, so there is something in the plot you shared that should be improved 😕

dpiras commented 1 year ago

I was able to retrieve this example from some tests we ran back in the day. As you can see, the agreement for single spectra should be essentially perfect, at least visually. One thing I noticed is that this TE spectrum is about 6 orders of magnitude smaller than in the image you shared, so I think we used different units than you. It is probably worth trying a different scaling to check whether the PCA+NN works better.

[image: example TE spectrum reconstruction from earlier tests]

alexander-reeves commented 1 year ago

Hi Davide,

Thanks a lot for this. I'm sorry I missed that notebook; I can now follow the prescription there (though I'm actually not sure how to get the TE noise: unless I am missing something, only the TT and EE noise models are listed in the SO noise models GitHub?). I have been playing around with different pre-processing and found that taking the cube root of the data before the PCA seems to help quite a bit: I now get a much better match, but still not enough to reproduce the Planck constraints (see plot below).
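
For reference, this is the preprocessing step I mean, as a minimal sketch (helper names are just illustrative):

```python
import numpy as np

# Signed cube root: compresses the dynamic range of the TE spectrum while
# preserving its sign; np.cbrt handles negative values correctly (real
# cube root), unlike a naive x**(1/3).
def preprocess_te(spectra):
    return np.cbrt(spectra)

def postprocess_te(features):
    return features**3   # exact inverse of the cube root
```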

Best wishes,

Alex

examples_reconstruction_pp.pdf

dpiras commented 1 year ago

@alexreevesy great to see some improvement! I think you are on the right track, and might just have to push with some similar preprocessing and probably a bit more data/more training. If you implement a preprocessing that guarantees positivity, you can also take a simple log and/or try the direct NN approach (without PCA) to check whether results improve.
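
One such positivity-guaranteeing transform, as a minimal sketch (the offset choice and array names are purely illustrative):

```python
import numpy as np

# Shift every TE value by a constant large enough to make the spectra
# strictly positive, then take the log; the offset must be stored so the
# transform can be inverted after emulation.
# training_spectra: (N, n_ell) array of raw TE spectra (illustrative name)
offset = 1.1 * np.abs(training_spectra.min())

def to_log(spectra):
    return np.log10(spectra + offset)

def from_log(log_features):
    return 10.0**log_features - offset
```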

If I remember correctly, the error on TE spectra can be calculated using only the TT and EE signal and noise curves; see Eq. (13) from the paper.
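
For reference, this is the standard Knox expression for the TE error assuming uncorrelated T and E noise (so that $N_\ell^{TE} = 0$), which I believe is what Eq. (13) amounts to:

$$
\sigma_\ell^{TE} = \sqrt{\frac{1}{(2\ell+1)\, f_{\mathrm{sky}}}\left[\left(C_\ell^{TE}\right)^2 + \left(C_\ell^{TT}+N_\ell^{TT}\right)\left(C_\ell^{EE}+N_\ell^{EE}\right)\right]}
$$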

Hope this helps!

alexander-reeves commented 1 year ago

Hi Davide, many many thanks for all of your help so far; I am scratching my head a little over this problem! I tried with more training data but the gain was minimal: the spectra look roughly correct, but there are features at the maxima and minima which are not present in the plot you shared. To double-check my TE implementation I would like to reproduce your training results, so could I ask for access to the TE training data you used in the paper? It is mentioned that this is available on Zenodo, but I can't seem to find it after searching the Zenodo database. Sorry if I am being completely blind, but could you please send a link to the page where I can download the data?

dpiras commented 1 year ago

Hi Alex, I think @alessiospuriomancini is in a better position to point to Zenodo files.

In the meantime, did you try to plot the percentiles, and do they look completely off?
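
For concreteness, this is the kind of percentile summary I have in mind, in the style of the accuracy plots in the paper (a sketch; array names are illustrative):

```python
import numpy as np

# Emulation error per multipole, in units of the expected experimental
# error sigma_ell, summarised by the 68th/95th/99th percentiles over the
# test set. emulated, truth: (N_test, n_ell) arrays; sigma_ell: (n_ell,).
ratio = np.abs(emulated - truth) / sigma_ell
p68, p95, p99 = np.percentile(ratio, [68, 95, 99], axis=0)
```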

alexander-reeves commented 1 year ago

I did; thank you for the information about the error calculation! Here you see the results for TT, EE and TE. For EE and TT this was a "proof of concept" run with a limited training set (~300,000 points), so the accuracy is not quite as good as in the paper, but the training was looking promising, so I think with a bit more data this would be more than good enough for reproducing the Planck chains. For TE the story is different, with much larger errors: these are the results after ~1,500,000 points and 8 hours of training for the pipeline where I take the cube root before the PCA (which outperformed the raw PCA, as before).

[screenshots: percentile accuracy plots, in order TE, TT, EE]

dpiras commented 1 year ago

I see, thanks! I think it is worth investigating whether, with your implementation, you can obtain a similar accuracy to the paper on data without the extra parameters. Since it seems that you have a fast pipeline to generate data, it might even be easy for you to generate ~5*10^5 training points without the extra parameters and test whether you can obtain results as in the CP paper.
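
If useful, such a training set could be drawn with a Sobol sequence via scipy.stats.qmc, e.g. (a sketch; parameter bounds purely illustrative):

```python
import numpy as np
from scipy.stats import qmc

# Scrambled Sobol sequence over a 6D vanilla-LCDM box
# [h, omega_cdm, omega_b, n_s, ln(10^10 A_s), tau_reio];
# bounds are purely illustrative.
sampler = qmc.Sobol(d=6, scramble=True, seed=0)
unit_samples = sampler.random_base2(m=19)   # 2**19 ~ 5.2e5 points
lower = np.array([0.60, 0.09, 0.020, 0.90, 2.80, 0.02])
upper = np.array([0.80, 0.15, 0.024, 1.05, 3.30, 0.12])
samples = qmc.scale(unit_samples, lower, upper)
```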

Then, with the extra parameters, I wouldn't give up in any case - I think it is worth exploring other preprocessing functions, and also trying with the direct NN (without PCA).

alexander-reeves commented 1 year ago

Yes, exactly! This is what I am trying right now: a pure LCDM reconstruction with my own generated data. I will let you know how I get on!

Thanks,
Alex

alexander-reeves commented 1 year ago

Dear all, after trying a bunch of different things I finally got something that works, but I am a little unsure why! I went back to exactly the set-up described in the cosmopower paper and recovered the results in the LCDM case. There were two differences in my original set-up. First, I used a Sobol sequence instead of a Latin hypercube to sample the parameter space (which I think should make no difference, but I need to double-check). Second, the parameterization: I was using Omega_m and sigma8 instead of omega_cdm and A_s. It turns out that when I use the latter parameterization I get a much more accurate emulator than with my original one (perhaps the response to these parameters is smoother than to Omega_m?). Using this parameterization I was able to get a good-accuracy emulator including m_nu and A_lens! (see attached plot)
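
For reference, the density part of the mapping between the two parameterizations follows from the standard relations (the A_s <-> sigma8 direction needs a Boltzmann call, so I only sketch the densities):

```python
# Standard relations: Omega_m h^2 = omega_cdm + omega_b + omega_nu,
# with omega_nu = sum(m_nu) / 93.14 eV.
def omega_cdm_from_Omega_m(Omega_m, h, omega_b, m_nu_eV):
    omega_nu = m_nu_eV / 93.14
    return Omega_m * h**2 - omega_b - omega_nu
```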

Many many thanks for all of your help. I think you can consider this issue closed, but I am happy to update via private correspondence once I've looked into this parameterization issue.

Best wishes,

Alex

accuracy_emu_TE_wide.pdf

dpiras commented 1 year ago

Hi Alex, great to hear that it works! I am also unsure which exact ingredient makes the difference here: I doubt there is much difference between the Sobol sequence and the Latin hypercube, while I honestly do not know whether there is a physical reason for better results with A_s and omega_cdm versus sigma8 and Omega_m.

It would be nice to conduct an ablation study to see exactly which piece makes the difference here. Let us know if you run this test, and what you find! Feel free to also send us an email at any point.

In the meantime, I am going to close this issue. Feel free to reopen it if needed!