Closed MingfuYAN closed 1 year ago
@mingfuyan We have only tested this Docker image on the 2080Ti and V100, and the NVIDIA Linux driver should be 470.182.03. The following command can be used to create a Docker instance:
docker run -it --gpus all --shm-size=32g -v /home/user:/root --name cuda11.1 f90d66fc0efb bash
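Before starting the container, it may help to confirm that the host driver matches the version mentioned above. The following is only a minimal sketch: it assumes `nvidia-smi` is on the PATH of the host, and the helper names (`parse_version`, `installed_driver_version`, `driver_matches`) are made up for illustration and are not part of this repository.

```python
import shutil
import subprocess

# Driver version quoted in the reply above.
EXPECTED_DRIVER = "470.182.03"


def parse_version(text):
    """Turn a dotted version string such as '470.182.03' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # One line is printed per GPU; the driver is the same for all of them.
    return result.stdout.splitlines()[0].strip()


def driver_matches(installed, expected=EXPECTED_DRIVER):
    """Compare two dotted version strings numerically."""
    return parse_version(installed) == parse_version(expected)
```

Calling `installed_driver_version()` on the host and passing the result to `driver_matches` gives a quick yes/no answer. A mismatch does not necessarily mean the image will fail, only that the setup differs from the tested one.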
First of all, thank you very much for your excellent work. When I run the code using the docker image you provided, the following error occurs.
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:38, unhandled cuda error, NCCL version 2.7.8 ncclUnhandledCudaError: Call to CUDA function failed.
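The message above does not say which CUDA call failed. One common way to get more detail is to rerun with the `NCCL_DEBUG=INFO` environment variable set, which makes NCCL print additional logging. The sketch below is an assumption about how one might wrap a launch command; the helper names are hypothetical and the command to run is whatever was used to start training.

```python
import os
import subprocess


def nccl_debug_env(base_env=None):
    """Return a copy of the environment with NCCL_DEBUG set to INFO."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_DEBUG"] = "INFO"
    return env


def run_with_nccl_debug(command):
    """Run `command` (a list of arguments) with verbose NCCL logging enabled.

    Returns the exit code of the child process.
    """
    completed = subprocess.run(command, env=nccl_debug_env(), check=False)
    return completed.returncode
```

For example, `run_with_nccl_debug(["python", "train.py"])` would rerun a script with the extra logging, where `train.py` is a placeholder for the actual entry point.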
Here is some of my CUDA version information.