cphyc closed this issue 4 years ago.
For information, here is my local setup:
$ g++ --version
g++ (GCC) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gsl-config --version
2.6
$ fftw-wisdom
(fftw-3.3.8 fftw_wisdom #x458a31c8 #x92381c4c #x4f974889 #xcd46f97e
)
$ python -c 'import pynbody; print(pynbody.__version__)'
0.47
I am running on ArchLinux
Thanks Corentin,
This is minor compared to the tidying branch (which I recommend using), where every single test fails!
I'm currently working through each test and convincing myself the 'new' answers are more accurate than the 'old' ones. Once that's done I'll update all the reference answers so that the tests pass again.
In the meantime I'm afraid we all have to ignore the failing tests...
Cheers, Andrew
I did a little digging, and while I'm not sure exactly what the problem is, I can perhaps shed some light on what seems to be going on. Essentially, the current code displaces a tiny fraction (0.006%) of particles by small amounts relative to the reference ICs in the test subfolder (the rest are exactly identical). When the velocities are then computed with the Zeldovich approximation, the multiplying factor is a couple of hundred or so, and the resulting velocities end up differing from the reference velocities by more than the 10^-4 threshold specified in compare.py.
The position differences for these same points sit at the 32-bit floating-point machine precision level, which hints that 32-bit floats are being used somewhere, even though, as far as I can tell, both the reference and the test output are in double precision. Possibly one of the dependencies is compiled in single precision, or was when the reference outputs were generated, which would explain why the failure appears on some computers and not others.
What's particularly odd is that the Travis build passes, while the tests fail on my computer and on Corentin's. This does suggest that something is subtly wrong, which we want to track down before we actually generate new reference outputs.
Thanks Stephen! That’s worth understanding further. Please do follow it up. I will continue fixing the other tests in the tidying branch but wait to merge until we understand the source of this disagreement. Worth trying different compilers to see if this is a compiler-dependent issue.
As a follow-up to @svstopyra's comment, we tried varying:

- `g++`'s version (I ran with 9.2.0)
- `gsl`'s version (2.6 and 2.5)

Do you know which versions of `g++` and `gsl` are used on the continuous integration platform? If that's of any help, I can try some combinations of these on a cluster I have access to.
I don't know the exact versions, actually; it would be useful to pin them. As it is, Travis installs whatever the defaults are in the Ubuntu trusty release, with the packages listed here:
https://github.com/ucl-cosmoparticles/genetIC/blob/master/.travis.yml
For information, I tried on the IAP cluster with the following setup:
Together with @svstopyra, we also checked whether the error is still there if we perform the analysis with yt instead of pynbody, and we see similar issues (although yt uses 64-bit floating-point precision, whereas pynbody seems to be 32-bit?).
Not sure about this - pynbody should be using a bit depth that matches the file it's loading. Anyway, I appreciate all the digging; let's keep looking into this...
@apontzen I see; yt converts everything to float64 by default, regardless of the on-disk precision.
After some more digging, I've now figured out what causes this. Most of the functions in camb.hpp that compute the power spectrum used powf instead of pow. Since powf works in single precision, the intermediates of the power-spectrum calculation were implicitly converted to floats, introducing 32-bit rounding errors into the power spectrum. Because different processor architectures handle this rounding slightly differently, some computers reproduced the reference results and others didn't. I've checked that simply swapping powf for pow fixes the issue. In any case, this means that pretty much all the reference tests are wrong, so I'll prepare a pull request against the tidying branch, since there's no point changing all the tests on master until we're happy with them on tidying. While I'm at it, I'll do a systematic check for any other powf's or similar single-precision functions that might still be hiding in the code...
Hi, when running the test suite on the current version of master (4726988), all tests pass except the following two:
The failure comes from a disagreement between the answers. I have attached the outputs of the failing tests below.