Closed XuFanffei closed 1 year ago
Both links are inaccessible.
Please see the discussion in #2270.
If you use the GPU version, the main reason is that the CUDA code contains atomicAdd() calls. These calls have no definite execution order across CUDA threads, and floating-point addition is not associative, so different runs can accumulate the same contributions in a different order and give slightly different results. MD simulation is chaotic, so two phase-space trajectories that differ only slightly diverge quickly over time and eventually become totally different.
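A small Python sketch may make both effects concrete. The first part mimics atomic adds arriving in two different orders. The second part uses the logistic map as a toy stand-in for a chaotic MD trajectory. The function names, the three sample values, and the choice of the logistic map are illustrative assumptions and are not taken from the DeePMD-kit or LAMMPS code.

```python
def sum_in_order(values, order):
    """Accumulate values in the given order, like atomic adds arriving in that order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Three hypothetical per-thread contributions to the same force component.
contributions = [1e16, 1.0, -1e16]

run_a = sum_in_order(contributions, [0, 1, 2])  # 1.0 is absorbed by 1e16 -> 0.0
run_b = sum_in_order(contributions, [0, 2, 1])  # large terms cancel first -> 1.0
print(run_a, run_b)


def max_separation(x0, delta, steps):
    """Largest gap between two logistic-map trajectories started delta apart."""
    x, y = x0, x0 + delta
    largest = abs(x - y)
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        y = 4.0 * y * (1.0 - y)
        largest = max(largest, abs(x - y))
    return largest


# A perturbation comparable to a rounding error stays tiny at first,
# then grows until the two trajectories are unrelated.
print(max_separation(0.3, 1e-15, 10), max_separation(0.3, 1e-15, 200))
```

The same mechanism applies to an MD run: a difference in the last bits of one force component is enough to send two otherwise identical runs onto different trajectories.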
Thank you for your reply!
I wonder if there is a way around this, or if this deviation of the trajectory is acceptable in physics.
This randomness is indeed inconvenient for debugging (for developers), but it is not important for practical applications, where one usually introduces randomness in the initial conditions on purpose (different seeds for initializing velocities) to sample phase space more broadly.
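As an illustration of seeding, the sketch below draws initial velocities from a Maxwell-Boltzmann distribution with several seeds. It assumes reduced units with kT/m = 1, and the helper names `init_velocities` and `kinetic_temperature` are hypothetical, not part of any MD package. Each seed gives different individual velocities, while the kinetic temperature stays close to the target.

```python
import random


def init_velocities(n_atoms, kT_over_m, seed):
    """Draw 3D velocities from a Maxwell-Boltzmann distribution (reduced units)."""
    rng = random.Random(seed)
    sigma = kT_over_m ** 0.5  # standard deviation of each velocity component
    return [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n_atoms)]


def kinetic_temperature(velocities):
    """Estimate kT/m as the mean squared velocity component."""
    n_components = 3 * len(velocities)
    return sum(c * c for v in velocities for c in v) / n_components


# Different seeds: different microscopic states, similar temperature.
for seed in (1, 2, 3):
    v = init_velocities(10000, 1.0, seed)
    print(seed, v[0], kinetic_temperature(v))
```

Averaging an observable over several such independently seeded runs is the usual way to obtain statistically meaningful results, which is why bitwise reproducibility of a single trajectory matters little in practice.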
Deterministic calculations on the GPU can only be obtained by changing the algorithms to avoid atomic summation of floating-point numbers.
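One possible way to avoid floating-point atomic summation is to accumulate in fixed-point integers, because integer addition is exact and therefore independent of order. The sketch below is only an illustration of that idea in Python. The `SCALE` value and the function names are assumptions, and nothing here implies that this is how any particular MD code is or would be implemented.

```python
import random

SCALE = 2 ** 40  # hypothetical fixed-point resolution: about 1e-12 per unit


def to_fixed(value):
    """Convert a float contribution to an integer number of 1/SCALE units."""
    return round(value * SCALE)


def fixed_point_sum(values, order):
    """Accumulate integers in the given order; integer addition is exact."""
    total = 0
    for i in order:
        total += to_fixed(values[i])
    return total / SCALE


# Hypothetical contributions, summed in two different orders.
rng = random.Random(0)
contributions = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

order_a = list(range(len(contributions)))
order_b = order_a[:]
random.Random(1).shuffle(order_b)

sum_a = fixed_point_sum(contributions, order_a)
sum_b = fixed_point_sum(contributions, order_b)
print(sum_a == sum_b)  # True: the result no longer depends on the order
```

The trade-off is limited range and resolution: on a GPU the accumulator would be a fixed-width integer, so the scale has to be chosen so that sums neither overflow nor lose too much precision.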
Feel free to reopen the issue if you have more questions.
Summary
For the same input file, running it 20 times yields different results each time. When testing with the Tersoff potential function on Bohrium, it is found that the results are the same for identical input files. Could you please explain why this is happening?
DeePMD-kit Version
2.1.5
TensorFlow Version
null
Python Version, CUDA Version, GCC Version, LAMMPS Version, etc
No response
Details
https://dp-devops.oss-cn-beijing.aliyuncs.com/temp/log1.lammps https://dp-devops.oss-cn-beijing.aliyuncs.com/temp/log2.lammps These are the results of two runs with the same input file.