Tests with baryons fail

cphyc commented 5 years ago

Hi, When running the test suite on the current version of master (4726988), all tests pass except the following two:

test_12c_modification_with_baryons
test_12d_modification_with_baryons_multilevel

The failure comes from a disagreement between the answers. I attached below the outputs of the failing tests

$ ./run_tests.sh test_12c_modification_with_baryons
Running test on test_12c_modification_with_baryons   # Baryons, single level, with modification
Testing output
Traceback (most recent call last):
  File "compare.py", line 156, in <module>
    default_comparisons()
  File "compare.py", line 151, in default_comparisons
    compare(pynbody.load(output_file[0]),pynbody.load(sys.argv[1]+"/reference_output"))
  File "compare.py", line 26, in compare
    npt.assert_almost_equal(f1['vel'],f2['vel'],decimal=compare_decimal)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 584, in assert_almost_equal
    return assert_array_almost_equal(actual, desired, decimal, err_msg)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1029, in assert_array_almost_equal
    precision=decimal)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 841, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 4 decimals

Mismatch: 0.0797%
Max absolute difference: 0.00048828
Max relative difference: 0.01034109
 x: SimArray([[  13.8939,  -58.7497,   12.8005],
          [  17.2224,  -72.6177,   23.3705],
          [  20.9602,  -81.5143,  -14.5985],...
 y: SimArray([[  13.8939,  -58.7497,   12.8005],
          [  17.2224,  -72.6177,   23.3705],
          [  20.9602,  -81.5143,  -14.5985],...

--> TEST FAILED

The IC generator output follows

# GM ICs code, compiled Oct  9 2019 15:04:31
# git HEAD:4726988
# Runtime: 2019-10-09.15:11:49

Note: 4 FFTW Threads (determined by OpenMP) were initialised
Using post 2015 CAMB transfer function
Drawing random numbers...done
overdensity: calculated value = 0.00161546
BEFORE modifications chi^2 = 263093
Delta chi^2 from linear modifications = 11442.2
AFTER  modifications chi^2 = 274535
         Total delta chi^2 = 11442.2
Propagating modification to field 1
. BEFORE modifications chi^2 = 263093
AFTER  modifications chi^2 = 274535
         Total delta chi^2 = 11442.2

Operations applied to all fields. Writing to disk...

Write, ndm=262144, ngas=262144
+ AddGasMapper, DM first:
| + MassScaledGrid of side 64 address 0x55c0177f84f0 referencing Grid of side 64 address 0x55c0177b3a50; 1 cells marked
+ AddGasMapper, gas second:
| + MassScaledGrid of side 64 address 0x55c0177f85f0 referencing Grid of side 64 address 0x55c0177b3a50; 1 cells marked
+ AddGasMapper ends
Gadget output preliminary scan...
Particle type 0: 262144 particles
Min and max particle mass : 1.3484 1.3484
Particle type 1: 262144 particles
Min and max particle mass : 6.0595 6.0595

overdensity: calculated value = 1
Clearing modification list

$ ./run_tests.sh test_12d_modification_with_baryons_multilevel 
Running test on test_12d_modification_with_baryons_multilevel   # Baryons, single level, with modification
Testing output
Traceback (most recent call last):
  File "compare.py", line 156, in <module>
    default_comparisons()
  File "compare.py", line 151, in default_comparisons
    compare(pynbody.load(output_file[0]),pynbody.load(sys.argv[1]+"/reference_output"))
  File "compare.py", line 26, in compare
    npt.assert_almost_equal(f1['vel'],f2['vel'],decimal=compare_decimal)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 584, in assert_almost_equal
    return assert_array_almost_equal(actual, desired, decimal, err_msg)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 1029, in assert_array_almost_equal
    precision=decimal)
  File "/home/ccc/anaconda3/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 841, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 4 decimals

Mismatch: 0.244%
Max absolute difference: 0.00024414
Max relative difference: 0.01499623
 x: SimArray([[  466.4706,   194.4837,    38.7908],
          [  468.8281,   194.5704,    12.4436],
          [  470.2868,   194.9073,   -18.8642],...
 y: SimArray([[  466.4706,   194.4837,    38.7908],
          [  468.8281,   194.5704,    12.4436],
          [  470.2868,   194.9073,   -18.8642],...

--> TEST FAILED

The IC generator output follows

# GM ICs code, compiled Oct  9 2019 15:04:31
# git HEAD:4726988
# Runtime: 2019-10-09.15:13:19

Note: 4 FFTW Threads (determined by OpenMP) were initialised
Using post 2015 CAMB transfer function
WARNING: Opening a zoom where flagged particles are within 3 pixels of the edge. This is prone to numerical errors.
Initialized a zoom region:
  Subbox length         = 21.3333 Mpc/h
  n                     = 64
  dx                    = 0.333333
  Zoom factor           = 3
  Num particles         = 2969
  Low-left corner in parent grid = (22, 22, 22)
  Low-left corner (h**-1 Mpc)    = 22, 22, 22
  Total particles = 419501
WARNING: maximum k in CAMB input file is insufficient
         extrapolating using naive Meszaros solution
WARNING: maximum k in CAMB input file is insufficient
         extrapolating using naive Meszaros solution
Initialized a zoom region:
  Subbox length         = 7.11111 Mpc/h
  n                     = 64
  dx                    = 0.111111
  Zoom factor           = 3
  Num particles         = 1791
  Low-left corner in parent grid = (21, 21, 21)
  Low-left corner (h**-1 Mpc)    = 29, 29, 29
  Total particles = 434261
Drawing random numbers...done
Drawing random numbers...done
Drawing random numbers...done
overdensity: calculated value = 0.00369876
BEFORE modifications chi^2 = 326922
Delta chi^2 from linear modifications = 6442.46
AFTER  modifications chi^2 = 333005
         Total delta chi^2 = 6083.32
Propagating modification to field 1
. BEFORE modifications chi^2 = 326922
AFTER  modifications chi^2 = 333005
         Total delta chi^2 = 6083.32

Operations applied to all fields. Writing to disk...Zeldovich approximation on successive levels...

Interpolating low-frequency information into zoom regions...
done.
Zeldovich approximation on successive levels...

Interpolating low-frequency information into zoom regions...
done.
Write, ndm=385904, ngas=48357
+ AddGasMapper, DM first:
| + TwoLevelParticleMapper, n_hr_per_lr=27, firstLevel2Particle=337547
| +                       , zoom.size=1791, zoomed.size=48357
| | + TwoLevelParticleMapper, n_hr_per_lr=27, firstLevel2Particle=259175
| | +                       , zoom.size=2969, zoomed.size=80163
| | | + Grid of side 64 address 0x56139986e660; 1 cells marked
| | + TwoLevelParticleMapper continues with high-res particles:
| | | + Grid of side 64 address 0x56139986f640; 1 cells marked
| | + TwoLevelParticleMapper ends
| + TwoLevelParticleMapper continues with high-res particles:
| | + MassScaledGrid of side 64 address 0x56139986f3e0 referencing Grid of side 64 address 0x561399cb8fd0; 27 cells marked
| + TwoLevelParticleMapper ends
+ AddGasMapper, gas second:
| + TwoLevelParticleMapper, n_hr_per_lr=27, firstLevel2Particle=0
| +                       , zoom.size=1791, zoomed.size=48357
| + low-res part will be skipped but notionally is:
| | + TwoLevelParticleMapper, n_hr_per_lr=27, firstLevel2Particle=259175
| | +                       , zoom.size=2969, zoomed.size=80163
| | | + Grid of side 64 address 0x56139986e660; 1 cells marked
| | + TwoLevelParticleMapper continues with high-res particles:
| | | + Grid of side 64 address 0x56139986f640; 1 cells marked
| | + TwoLevelParticleMapper ends
| + TwoLevelParticleMapper continues with high-res particles:
| | + MassScaledGrid of side 64 address 0x561399cb6df0 referencing Grid of side 64 address 0x561399cb8fd0; 27 cells marked
| + TwoLevelParticleMapper ends
+ AddGasMapper ends
Gadget output preliminary scan...
Particle type 0: 48357 particles
Min and max particle mass : 0.00184966 0.00184966
Particle type 1: 385904 particles
Min and max particle mass : 0.00831207 7.4079

Using variable-mass format
overdensity: calculated value = 0.903737
Clearing modification list
Writing to .//grid-0.txt.//grid-0.txt
Writing to .//grid-1.txt.//grid-1.txt
Writing to .//grid-2.txt.//grid-2.txt

cphyc commented 5 years ago

For information, here is my local setup:

$ g++ --version
g++ (GCC) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gsl-config --version
2.6

$ fftw-wisdom                                
(fftw-3.3.8 fftw_wisdom #x458a31c8 #x92381c4c #x4f974889 #xcd46f97e
)

$ python -c 'import pynbody; print(pynbody.__version__)' 
0.47

I am running on ArchLinux

apontzen commented 5 years ago

Thanks Corentin,

This is minor compared to what is on the tidying branch (which I recommend using) where every single test fails!

I'm currently working through each test and convincing myself the 'new' answers are more accurate than the 'old' ones. Once that's done I'll update all the reference answers so that the tests pass again.

In the meantime I'm afraid we all have to ignore the failing tests...

Cheers, Andrew

svstopyra commented 5 years ago

I did a little digging, and while I'm not sure exactly what the problem is, I can perhaps shed some light on what seems to be going on. Essentially, the current code is producing something for which a tiny fraction (0.006%) of particles are displaced from the reference ICs in the test subfolder by small amounts (the rest are exactly identical), and when computing the velocities with the Zeldovich approximation this factor is actually a couple of hundred or so, and ends up producing velocities which are different from the reference velocities by more than the threshold value of 10^-4 specified in compare.py.

The difference in the positions for these same points is actually at the 32 bit floating point machine precision level, which seems to hint that using 32 bit floats somewhere might be the source of this issue, but as far as I can tell both the reference and output for the test are in double precision. Possibly one of the dependencies is compiled in single precision, or could have been when the tests were generated, which would explain why it seems to occur on some computers and not others.

What's particularly odd is that this seems to pass the Travis build tests, but not on my computer and Corentin's. This does suggest that there's something subtly wrong, which we want to be careful to get rid of before we actually generate new tests.

apontzen commented 5 years ago

Thanks Stephen! That’s worth understanding further. Please do follow it up. I will continue fixing the other tests in the tidying branch but wait to merge until we understand the source of this disagreement. Worth trying different compilers to see if this is a compiler-dependent issue.

cphyc commented 5 years ago

As a follow-up of @svstopyra comment, we tried:

changing g++'s version (I run with 9.2.0)
changing gsl's version (2.6, and 2.5) Do you know which version of g++ and gsl are used on the continuous integration platform?

If that's of any help, I can try on a cluster I have access to some combination of

gcc 4.6 to 9.2,
gsl 1.16, 2.2-2.6,
icc (intel's compiler) 12 to 19,
fftw 2 or 3.

apontzen commented 5 years ago

I don't know the exact versions actually, it would be useful to pin these. As it is Travis installs whatever is the default in the ubuntu trusty release, with the packages listed here:

https://github.com/ucl-cosmoparticles/genetIC/blob/master/.travis.yml

cphyc commented 5 years ago

For information, I tried on the IAP cluster with the following setup:

gcc/9.2.0 and fftw/3.3.8
gcc/4.9.4 and fftw/3.3.2 Both ran successfully with gsl/2.6.0.

We also checked with @svstopyra and we have similar issues if the error is still there if we perform the analysis with yt instead of pynbody (except yt uses 64bits floating precision whereas pynbody seems to be 32bits?)

apontzen commented 5 years ago

Not sure about this - pynbody should be using bit depth that matches the file it's loading. Anyway, I appreciate all the digging; let's keep looking into this...

cphyc commented 5 years ago

@apontzen I see; yt converts everything to float64 by default, regardless of the on-disk precision.

svstopyra commented 5 years ago

After some more digging, I've now figured out what causes this. In most of the functions in camb.hpp which handle computing the power spectrum, powf was used instead of pow. This has the effect of implicitly converting everything to floats in the intermediaries of the power spectrum calculations, which created 32 bit rounding errors in the power spectrum. Because different processor architectures handle this rounding a bit differently, some computers were producing the same results as the test, and others weren't. I've checked that just swapping in the powf for pow fixes this issue. In any case, this means that pretty much all the tests are wrong, so I'll prepare a pull request for the tidying branch since there's no point changing all the tests on master until we're happy with them on tidying. While I'm at it, I'll do as systematic check for any other powf's or similar float functions that might still be hiding in the code...

pynbody / genetIC

Tests with baryons fail #65