@danielhollas this is a first, and maybe a bit naive, attempt to create a test that compares against previous calculations verified with QD. Both new tests pass on my computer, but I see here that they failed because of numerics. We will probably need to check the numerical accuracy somehow. Could you help me a bit with that? I'm sure you know some smart way to do it.
I will also verify the PDAW reference with QD later.
I added new reference data for the PDAW test. This set was benchmarked against QD, so we now have tests for both PDA and PDAW that check them against QD. However, the numerics failed again, this time for PDAW (I added some preliminary rounding for PDA, which made it pass). The test worked on my computer, where I generated the data. I think the numerical difference in
`assert np.float64(5.033296566754934e-22) == np.float64(5.033296466962208e-22)`
is negligible, but I don't know how to correctly compare numbers within numerical accuracy. I hope for help from @danielhollas once you have a bit of time.
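For illustration, here is a minimal sketch of how such a tolerance could be expressed, assuming the tests run under pytest (the tolerance value is an assumption to be tuned, and the numbers are the two from the failing assertion above):

```python
import numpy as np
import pytest

def test_pdaw_value():
    # The two values from the failing assertion above.
    reference = np.float64(5.033296566754934e-22)
    computed = np.float64(5.033296466962208e-22)
    # rel=1e-7 accepts relative differences up to the 8th significant
    # digit, which covers the ~2e-8 relative discrepancy seen here.
    assert computed == pytest.approx(reference, rel=1e-7)
```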
This is great, thanks for doing this! Definitely agree we need more "end-to-end" tests.
Comparing real numbers is always a PITA; I have always struggled with that in the ABIN tests. I don't have experience with handling this problem in Python, so we'll need to figure it out.
For now, it would be good to understand where the differences come from. Can you post which Python and numpy versions you have locally? (`pip show numpy` should show you the package version.)
> This set was benchmarked against QD, so we now have tests for both PDA and PDAW that check them against QD.
Awesome! :+1:
> For now, it would be good to understand where the differences come from. Can you post which Python and numpy versions you have locally? (`pip show numpy` should show you the package version.)

I used Python 3.12.1 and numpy 1.26.2 to create the reference data.
I tried with Python 3.12.5 and numpy 1.26.2 but am still seeing differences locally. That's unfortunate, as it likely means the results are sensitive to the CPU.
It looks like the `numpy.testing` module might be useful here, perhaps `assert_allclose`.
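For example, a minimal sketch of what that could look like (the arrays are placeholders built from the two values in the failing assertion; the `rtol`/`atol` choices are assumptions that would need tuning):

```python
import numpy as np
from numpy.testing import assert_allclose

# Placeholder arrays standing in for the computed and reference data.
computed = np.array([5.033296466962208e-22])
reference = np.array([5.033296566754934e-22])

# rtol sets the relative tolerance; atol guards comparisons near zero.
assert_allclose(computed, reference, rtol=1e-7, atol=0.0)
```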
Thanks for having a look at the pull request, @danielhollas! I made five changes.
Let me know if you have any further suggestions and if you find some smart way to deal with the numeric tests. With this in place, I will feel safer when we start moving functions around the code, since we test that the calculated physics still works.
Hi @danielhollas, I processed all the comments from the previous review. We now test only 10 geometries, which makes the test faster and numerically more stable. The test output is also written directly to `tmp_path`. I also created a validation folder but saved it for another pull request, which is coming soon. Hopefully this test is now done. Thanks for all the reviews!
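As a reference for how the output ends up in the temporary directory, here is a minimal sketch of pytest's built-in `tmp_path` fixture (the file name and content are illustrative, not taken from the actual test):

```python
def test_output_in_tmp_path(tmp_path):
    # tmp_path is a unique pathlib.Path created per test and cleaned up
    # by pytest, so tests never litter the working directory.
    outfile = tmp_path / "output.dat"  # hypothetical file name
    outfile.write_text("result\n")     # stands in for running the calculation
    assert outfile.read_text() == "result\n"
```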
In this pull request, I added two tests comparing results from the code against the results used in the publication, which were in turn compared to QD. These tests will be essential before we can start moving important parts of the code inside `InitialConditions`. At the moment, the PDAW reference has not been compared to QD, but I'm working on it. The PDA results were compared to QD; see the attached figure. The tests will need refactoring.