MIC-DKFZ / nnUNet

Apache License 2.0

Can't initialize NVML #2573

Closed lannight93816 closed 1 week ago

lannight93816 commented 2 weeks ago

I have been training on the MSD data these days. I trained two folds yesterday without any problem. However, when I tried to train another fold this morning, I got warnings saying "Can't initialize NVML", "CUDA driver initialization failed, you might not have a CUDA gpu", and "torch.cuda.amp.GradScaler is enabled, but CUDA is not available." (Sorry for not pasting the log; I ran all of this on the school server, which is under maintenance right now.)

I know the problem may lie with PyTorch or my GPU. However, I wonder whether anyone else has come across this problem before and how to solve it. Thanks a lot!
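
For anyone trying to narrow this down, here is a minimal diagnostic sketch that might help separate the two suspects (driver/GPU versus PyTorch). It is not part of nnUNet; the function name `diagnose_gpu` and the report keys are made up for illustration. It assumes `nvidia-smi` is on the `PATH` when the NVIDIA driver is installed, and it only uses `torch.version.cuda` and `torch.cuda.is_available()` from PyTorch.

```python
import shutil
import subprocess


def diagnose_gpu():
    """Collect basic facts about the driver and PyTorch's view of CUDA.

    Hypothetical helper for illustration only; not part of nnUNet or PyTorch.
    """
    report = {
        "nvidia_smi_found": False,
        "nvidia_smi_ok": False,
        "torch_installed": False,
        "cuda_available": False,
    }

    # Driver level: nvidia-smi relies on NVML, the library named in the warning.
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        try:
            result = subprocess.run(
                [smi], capture_output=True, text=True, timeout=30
            )
            report["nvidia_smi_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            pass  # leave nvidia_smi_ok as False

    # Framework level: ask PyTorch whether it can see a CUDA device.
    try:
        import torch
    except Exception:
        # torch is missing or failed to import; report what we have so far.
        return report

    report["torch_installed"] = True
    report["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in diagnose_gpu().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_ok` is false as well, the problem is more likely at the driver or server level than in PyTorch or nnUNet. If `nvidia-smi` works but `cuda_available` is false, the PyTorch installation or its environment is the more likely place to look.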