pynbody / genetIC

The GM initial conditions generator
GNU General Public License v3.0
21 stars 8 forks source link

Fix chi square #32

Closed Martin-Rey closed 6 years ago

Martin-Rey commented 6 years ago

This is an attempt to fix the calculation of chi-square (issue #30 ). The order of magnitude is roughly the number of degrees of freedom of the field, both for a single grid and a zoomed architecture. It is not exactly the value though, varying around it (by about 0.1%) depending on the random seed.

However, I do not recover the value of chi^2 when calculated solely from modifications. From the original GM paper, I understand this was checked and found to perfectly match with the chi^2 from the field. Was this done with the genetIC implementation or the toy-model ?

In the process, I introduced a new way of testing by comparing the log file to a reference.txt file. This allows to check calculated values that are printed to screen like chi-squares or modification values.

apontzen commented 6 years ago

Yes this was checked in the actual code but don't forget you may be looking at numerical accuracy issues. You prob want to test it on a case with a relatively small grid so that the Delta chi^2 is not too tiny compared with the actual chi^2.

Martin-Rey commented 6 years ago

Ok, I am slightly confused by this problem. Here is what I have figured out:

  1. The 1/n factor is likely to come from a FFT normalisation problem. I assume as some point in the code, the convention changed from 1/n to 1/sqrt(n). This can be seen here: https://github.com/ucl-cosmoparticles/genetIC/blob/5c510cf2d4ce400509f67625cea32e72c3f58d40/genetIC/src/tools/numerics/fourier.hpp#L418-L419 This means that when squaring the density field, there is a missing 1/n factor.

  2. The same factor of 1/n required by chi^2 calculation is present when comparing the power spectrum from the density field and the stored theoretical power spectrum (line 94): https://github.com/ucl-cosmoparticles/genetIC/blob/5c510cf2d4ce400509f67625cea32e72c3f58d40/genetIC/src/cosmology/parameters.hpp#L48-L99

  3. Nonetheless, all calculations producing scalar values from the density field, e.g. overdensities/variance etc are correct in absolute value. They do not miss a 1/sqrt(n) factor.

It is slightly unclear how everything fits into one picture if normalisation is inconsistent. We might get away with it when calculating modifications since those are orthonormalised anyway.

Any thoughts welcome.

Martin-Rey commented 6 years ago

Further commits include :

However, the chi^2 from modifications and from the field now mismatch exactly by a factor 1/N that I would not have expected. I am really struggling to build a coherent picture in my mind of when do we need this 1/N normalisation. The failing tests are just a representation of this problem.

Martin-Rey commented 6 years ago

Ok I finally got to the bottom of this I think. This is a pretty important change even though it does not break anything.

Previously, the FFT normalisation was changed from 1/N to 1/sqrt(N). To balance this, an extra sqrt(N) was added to the white noise (from master branch): https://github.com/ucl-cosmoparticles/genetIC/blob/5c510cf2d4ce400509f67625cea32e72c3f58d40/genetIC/src/simulation/field/randomfieldgenerator.hpp#L120-L129

However, adding this extra normalisation to the white noise rather than to the power spectrum means that calculations of scalar values from the field and power spectrum needed to be corrected. Examples of such calculations are the estimated power spectrum from the field or chi^2. Modifications calculations were not impacted since the orthonormalisation procedure multiplies and divide by the power spectrum, taking out such issues.

The last commit transfers the FFT normalisation from the random draw to the stored power spectrum. This restores coherence and avoid spurious 1/N factors to be scattered everywhere in the code. The normalisation is now dealt here and here only. https://github.com/Martin-Rey/genetIC/blob/0ae8c049d96a25d5f773a3361e52b83f92c45e7f/genetIC/src/cosmology/camb.hpp#L170-L182

The calculation of chi^2 now agrees with ~DOF and delta chi^2 from field and modifications are equal up to numerical precision.

The fact that the tests are failing is only because the reference outputs are now wrong for power spectrum testing. All tests without power spectrum testing pass, including the different seeding procedure. If you agree with this change, I will modify the reference output and this should be ready for merge.

Martin-Rey commented 6 years ago

The last commits fix the failing tests, one because of power spectrum units between the old/new normalisation and the other because of a changed parameter file in #33 .