Closed by zhangyongsdu 3 years ago
Thanks a lot for reporting the bug. We have encountered the same issue and are trying to fix it.
I got an error when running the unit tests in /source/tests on the latest devel branch: cuda assert: an illegal memory access was encountered /tmp/pip-req-build-1dcl1ksu/source/lib/include/gpu_cuda.h 108
@zhangyongsdu We have fixed the bug in PR #545.
I got an error when running an MD simulation with the API branch of deepmd-kit. The error looks like this: cuda assert: invalid argument /scratch/qf9/yxz565/softwares/deepmd-kit-api-20210417/source/lib/include/gpu_cuda.h 48.
I used 4 V100 GPUs (mpirun -np 4) with cuda/10.1, cudnn/7.6.5-cuda10.1, nccl/2.6.4-1+cuda10.1 and openmpi/4.0.1. The error also occurs with CUDA 11 and cuDNN 8. It does not occur with versions of the API branch from before 20th March 2021.