GOMC-WSU / GOMC

GOMC - GPU Optimized Monte Carlo is a parallel molecular simulation code designed for high-performance simulation of large systems
https://gomc-wsu.org
MIT License

GPU and CPU trajectories with multiparticle still are not matching #182

Closed jpotoff closed 3 years ago

jpotoff commented 4 years ago

I pulled the development branch, compiled and ran the multiparticle move on the GPU. It seems the GPU code is still not following the same trajectory as the CPU code: input.zip

[Image: gpu_cpu_nb_e]

LSchwiebert commented 4 years ago

If you look closely at the energy calculations, the recip, real, and inter box energies differ slightly between runs. The difference (on the order of 1×10⁻⁶) is large enough that something still seems not quite right, but with Younes' latest patches it is small enough that the energy is very close across runs. Move acceptance is also very similar. Without electrostatics, I'm getting matching results with the water simulation. These patches improved things significantly, but I'm going to dig into the Ewald calculations some more to see if I can find the cause of the differences.

jpotoff commented 4 years ago

After the fixes, CPU and GPU seem to match for the NVT simulation of 1000 isobutane molecules. However, the argon system with 73 atoms still differs. I will continue working on it tomorrow.

[Image: gpu]

@YounesN Can you zoom in on the energies so we can see what's going on after the system equilibrates?

YounesN commented 4 years ago

I am not sure how you output energy versus step, so I set the block average to 1 (every step) and then used awk to extract just the energy. Let me know if there is an easier way. I have attached both CPU and GPU data so you can take a close look.

isobutane_multi.zip

YounesN commented 4 years ago

Here is the plot for Argon that doesn't match. ar_73.zip

[Image: gpu]

jpotoff commented 4 years ago

@YounesN you can get the instantaneous energy at each step from the console output by using awk to search for "ENER_0" and grab column 3. It's pretty much the same as what you are doing by setting the block averages to "1".
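For anyone who prefers a script to awk, the same extraction can be sketched in Python. This is only an illustration: the function name `extract_energies` is hypothetical, and it assumes the console output is whitespace-separated with the energy in the third column of every line containing "ENER_0", as described above. The sample lines are made up for the demonstration and are not real GOMC output.

```python
def extract_energies(lines, tag="ENER_0", column=3):
    """Return the values in the 1-based `column` of every line containing `tag`.

    Mirrors the awk approach of matching the tag and printing one column.
    Lines that are too short or whose column is not numeric are skipped.
    """
    energies = []
    for line in lines:
        if tag not in line:
            continue
        fields = line.split()
        if len(fields) < column:
            continue
        try:
            energies.append(float(fields[column - 1]))
        except ValueError:
            continue
    return energies


if __name__ == "__main__":
    # Made-up sample lines (assumed layout: tag, step, energy, ...).
    sample = [
        "ENER_0: 1 -5.0 2.0",
        "MOVE_0: 1 0.5",
        "ENER_0: 2 -6.5 2.0",
    ]
    for energy in extract_energies(sample):
        print(energy)
```

To use it on a real run, pass an open file handle for the saved console log instead of the sample list, then plot the CPU and GPU series against each other.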

I zoomed in on your isobutane data and it looks great. What compiler are you using?
[Image: cpu_gpu]

YounesN commented 4 years ago

I am using Intel compiler 18.

LSchwiebert commented 4 years ago

With electrostatics, here is what I see when I print the boxEnergy real and inter values at the end of CalculateEnergy::BoxForce. The first line of each run is printed so that you can see the output is from the start of the simulation. The values match exactly across CPU runs, but the GPU values match neither the CPU values nor each other across runs. The results match exactly when the system is initialized, with no round-off to at least 13 decimal places. So I think something is not being initialized or calculated correctly, even at the start of the simulation, when running on the GPU. This does not happen without electrostatics.

CPU Run 1:
Printed combined psf to file GPU_EQ_NVT_merged.psf
Box force energies: LJ = 6012183.127193; Real = -27637.662263
Box force energies: LJ = 5355307.262692; Real = -95533.523908

CPU Run 2:
Printed combined psf to file GPU_EQ_NVT_merged.psf
Box force energies: LJ = 6012183.127193; Real = -27637.662263
Box force energies: LJ = 5355307.262692; Real = -95533.523908

GPU Run 1:
Printed combined psf to file GPU_EQ_NVT_merged.psf
Box force energies: LJ = 6012183.127193; Real = -27637.662263
Box force energies: LJ = 5355306.337850; Real = -95533.213118

GPU Run 2:
Printed combined psf to file GPU_EQ_NVT_merged.psf
Box force energies: LJ = 6012183.127193; Real = -27637.662263
Box force energies: LJ = 5355307.076555; Real = -95532.751808
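To put the size of the mismatch in perspective, the second "Box force energies" line of each GPU run can be compared against the CPU value. The short Python sketch below does that with the numbers copied from the output above; the helper names `relative_difference` and `compare` are hypothetical and only serve this illustration.

```python
def relative_difference(reference, value):
    """Absolute difference between value and reference, scaled by |reference|."""
    return abs(value - reference) / abs(reference)


# Second "Box force energies" line from the runs above.
cpu = {"LJ": 5355307.262692, "Real": -95533.523908}
gpu_runs = {
    "GPU Run 1": {"LJ": 5355306.337850, "Real": -95533.213118},
    "GPU Run 2": {"LJ": 5355307.076555, "Real": -95532.751808},
}


def compare(reference, runs):
    """Relative difference of every term in every run against the reference."""
    return {
        run: {term: relative_difference(reference[term], v) for term, v in values.items()}
        for run, values in runs.items()
    }


if __name__ == "__main__":
    for run, diffs in compare(cpu, gpu_runs).items():
        for term, diff in diffs.items():
            print(f"{run} {term}: relative difference {diff:.2e}")
```

With these numbers the relative differences fall roughly between 1e-8 and 1e-5, with the Real term deviating more than the LJ term in both GPU runs.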

YounesN commented 3 years ago

I think at this point we can safely assume that the problem is round-off error. @LSchwiebert concluded that the problem with the rotate move is also related to the accuracy problem: a slight difference in the computed energy can, in some cases, cause acceptance on the CPU and rejection on the GPU. I will go ahead and close this issue. If we see more evidence that contradicts our assumption here, we can open another issue to fix the problem.
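As a rough illustration of how round-off alone can split two trajectories, the Python sketch below shows two things: floating-point addition gives different results depending on summation order (which is what a parallel GPU reduction can change), and a tiny energy difference can flip a Metropolis acceptance test. This is not GOMC code; `metropolis_accept`, the energy values, and the random number `u` are all hypothetical and chosen only to make the effect visible.

```python
import math


def metropolis_accept(delta_e, beta, u):
    """Standard Metropolis test: accept when u < exp(-beta * delta_e)."""
    return u < math.exp(-beta * delta_e)


# 1. The same three terms summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# 2. Two energy changes that differ by 1e-6 (hypothetical values), tested
#    against the same uniform random number u, chosen here to fall between
#    the two acceptance probabilities.
beta = 1.0
delta_e_a = 1.0
delta_e_b = 1.0 + 1e-6
u = math.exp(-beta * (1.0 + 5e-7))

accept_a = metropolis_accept(delta_e_a, beta, u)
accept_b = metropolis_accept(delta_e_b, beta, u)

if __name__ == "__main__":
    print("summation order changes the result:", order_matters)
    print("move accepted with delta_e_a:", accept_a)
    print("move accepted with delta_e_b:", accept_b)
```

Once a single move is accepted in one run and rejected in the other, the two runs sample different configurations from that step onward, so identical trajectories are no longer expected even though the averages should still agree.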