Closed Carrotkingdom closed 3 years ago
Will this problem occur when running on the CUDA platform?
Have you tested these models on a single GPU or DCU environment?
about 12300 on DCU
I would suggest trying the latest commit on devel
> I would suggest trying the latest commit on devel

I will try it.
> Have you tested these models on a single GPU or DCU environment?
The results are the same on a single GPU. On CPU the process runs normally and is not killed.
The latest GPU code has been tested on ehpc and there is no problem, @amcadmus.
The latest DCU code has been tested successfully on NSCC-ZZ, @amcadmus.
I am running a DP model with deepmd-kit version 2.0.0.beta0, both original and compressed, on GPU and DCU. In all cases the process is killed after several steps; the number of steps depends on the number of threads and the environment.
Here are the model type, machine type, the number of threads and the corresponding step number before the process died.
The LAMMPS input file and the submit files for GPU and DCU are attached below.
MgZn.zip