Closed dmlpt closed 3 years ago
Hi @dmlpt,
We used CUDA 11.0 to install apex. From my understanding of the link you sent, it looks like a user there is having trouble installing PyTorch with CUDA 11.0 support instead of CUDA 10.2, but it looks like you already have PyTorch installed with CUDA 11.0 support.
Could you run nvidia-smi and share your output to confirm that you have CUDA 11.0 installed on your system (not just the PyTorch support)?
Thanks for the quick response. Here is the output of nvidia-smi
I see that nvidia-smi shows CUDA 11.3 as the default CUDA version, but I also tried specifying CUDA 11.0 in the installation command as below (I also see that /usr/local/cuda-11.0/bin/nvcc is used during installation):
CUDA_HOME=/usr/local/cuda-11.0 pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
but I am still having the same issues.
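For anyone debugging a similar mismatch, here is a minimal sketch (not from the original thread) of how the toolkit version could be compared against the CUDA version PyTorch was built with. It assumes that the CUDA version printed by nvidia-smi reflects what the driver supports rather than the installed toolkit, so it reads `nvcc --version` instead. The helper names and the `/usr/local/cuda` fallback path are hypothetical and chosen only for illustration.

```python
import os
import re
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' token in `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_torch_cuda(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Turn a torch.version.cuda string such as '11.0' into (11, 0)."""
    if not version:
        return None  # CPU-only PyTorch builds report None here
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def toolkit_matches_torch(nvcc_output: str, torch_cuda: Optional[str]) -> bool:
    """True only if both versions are known and agree on major.minor."""
    toolkit = parse_nvcc_release(nvcc_output)
    built = parse_torch_cuda(torch_cuda)
    return toolkit is not None and toolkit == built


if __name__ == "__main__":
    # Assumption: CUDA_HOME points at the toolkit used for the apex build;
    # /usr/local/cuda is only a hypothetical fallback.
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        print(f"nvcc not found at {nvcc}")
    else:
        result = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, check=False
        )
        try:
            import torch
            torch_cuda = torch.version.cuda
        except ImportError:
            torch_cuda = None
        print("toolkit:", parse_nvcc_release(result.stdout))
        print("torch built with:", parse_torch_cuda(torch_cuda))
        print("match:", toolkit_matches_torch(result.stdout, torch_cuda))
```

Running it with the same CUDA_HOME as the pip command above shows whether the nvcc used for the build agrees with torch.version.cuda. A match only rules out the version mismatch; it does not rule out other build failures.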
Did you use the above command to install apex? Also, did you use a specific branch from apex?
Thanks!
I was finally able to fix it. It was an issue with the latest commit: https://github.com/NVIDIA/apex/issues/1155
This solution worked for me: https://github.com/NVIDIA/apex/issues/1155#issuecomment-912389866
Thanks for the assistance!
@dmlpt Glad you were able to fix it! Thank you for posting the solution for others as well.
Hi @MidoAssran and others who were able to install this repo successfully,
Thanks for the great code. I am having issues installing apex with the CUDA extension. Here are my Python, PyTorch, torchvision and torch.version.cuda versions:
Python: 3.8.11
torch: '1.7.1'
torchvision: '0.8.2'
torch.version.cuda: '11.0'
I think the initial error was due to a CUDA version mismatch between apex and PyTorch, as mentioned here. So I followed the instructions in the above link, but I am still not able to install it. I think my Python, PyTorch and CUDA versions match the requirements specified in this repo.
a) Could you please let me know how you installed apex?
b) Which CUDA version was used to install apex?
Thanks in advance.