Closed yuhaozhang97 closed 6 years ago
The problem comes from my select_gpu function.
Can you provide the output of running nvidia-smi on the command line on your machine?
As long as your machine has at least one GPU detected by nvidia-smi, the function shouldn't fail.
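For anyone hitting the same TypeError, here is a rough sketch of how a GPU picker built on nvidia-smi can end up returning None. This is not the actual select_gpu from the repository: the names select_gpu_sketch and pick_gpu_from_csv are hypothetical, and choosing the GPU with the most free memory is an assumption made for illustration.

```python
import subprocess
from typing import List, Optional

# Standard nvidia-smi query: one "index, free_memory_MiB" line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"]


def pick_gpu_from_csv(csv_text: str) -> Optional[int]:
    """Return the index of the GPU with the most free memory, or None."""
    best_index, best_free = None, -1
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            continue  # skip lines that are not "index, free"
        try:
            index, free_mib = int(parts[0]), int(parts[1])
        except ValueError:
            continue  # skip non-numeric fields such as "[N/A]"
        if free_mib > best_free:
            best_index, best_free = index, free_mib
    return best_index


def select_gpu_sketch(command: List[str] = QUERY) -> Optional[int]:
    """Hypothetical stand-in for select_gpu: None means no usable GPU."""
    try:
        result = subprocess.run(command, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
    except OSError:
        return None  # nvidia-smi is not installed or not on PATH
    if result.returncode != 0:
        return None  # e.g. nvidia-smi cannot talk to the driver
    return pick_gpu_from_csv(result.stdout)
```

With a picker like this, checking the result for None before passing it to torch.cuda.set_device would turn the TypeError into a clearer "no GPU detected" message.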
Thanks for your quick reply!
Here is what I got: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running."
@yuhaozhang666 Then the problem is on your side. Your NVIDIA driver or CUDA toolchain installation is likely broken. No GPU-dependent deep learning program can run in that situation.
Do you mean it must be run on a GPU, @cai-lw?
@yuhaozhang666 Yes. I am sorry about that, but PyTorch 0.2.0 cannot switch between GPU and CPU easily.
Okay, thank you so much! I'll try to launch a GPU-capable instance to run your program.
Hi Liwei,
I got this error when I tried to pretrain the model. Is it caused by the NVIDIA setup?
Traceback (most recent call last):
  File "pretrain.py", line 16, in <module>
    torch.cuda.set_device(select_gpu())
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 261, in set_device
    if device >= 0:
TypeError: '>=' not supported between instances of 'NoneType' and 'int'
Thanks!