facebookresearch / suncet

Code to reproduce the results in the FAIR research papers "Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples" https://arxiv.org/abs/2104.13963 and "Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations" https://arxiv.org/abs/2006.10803
MIT License

Installation issue (apex) #24

Closed: dmlpt closed this issue 3 years ago

dmlpt commented 3 years ago

Hi @MidoAssran and others who were able to install this repo successfully,

Thanks for the great code. I am having issues installing apex with the CUDA extensions. Here are my Python, PyTorch, torchvision and torch.version.cuda versions:

Python: 3.8.11
torch: '1.7.1'
torchvision: '0.8.2'
torch.version.cuda: '11.0'

I think the initial error was due to a CUDA version mismatch between apex and PyTorch, as mentioned here. I followed the instructions in the above link but am still not able to install it. I think my Python, PyTorch and CUDA versions match the requirements specified in this repo. a) Could you please let me know how you installed apex? b) Which CUDA version was used to install apex?

Thanks in advance.

MidoAssran commented 3 years ago

Hi @dmlpt,

We used CUDA 11.0 to install apex. From my understanding of the link you sent, that user is having trouble installing PyTorch with CUDA 11.0 support instead of CUDA 10.2, whereas you already have PyTorch installed with CUDA 11.0 support.

Could you run nvidia-smi and share your output to confirm that you have cuda 11.0 installed on your system (not just the pytorch support)?
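The check being asked for here, whether the CUDA toolkit on the system matches the CUDA version PyTorch was built against, can also be sketched in a few lines of Python. This is a minimal sketch, not something from the repo or from apex: the helper names `parse_cuda_release`, `parse_version_string` and `versions_match` are hypothetical, and it assumes a strict major.minor match is what matters (the exact rule apex applies may differ between commits).

```python
import re
from typing import Optional, Tuple


def parse_cuda_release(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from text such as the output of `nvcc --version`.

    The output is expected to contain a fragment like "release 11.0".
    Returns None if no such fragment is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_version_string(version: str) -> Tuple[int, int]:
    """Turn a version string such as '11.0' (e.g. torch.version.cuda) into (11, 0)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_match(nvcc_output: str, torch_cuda: str) -> bool:
    """Assumption: the build needs the same major.minor on both sides."""
    toolkit = parse_cuda_release(nvcc_output)
    return toolkit is not None and toolkit == parse_version_string(torch_cuda)
```

With PyTorch installed, one way to use it would be to pass the text printed by `nvcc --version` together with `torch.version.cuda` to `versions_match`.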

dmlpt commented 3 years ago

Thanks for the quick response. Here is the output of nvidia-smi

[image: nvidia-smi output]

I see that nvidia-smi shows CUDA 11.3 as the default CUDA version, but I also tried specifying cuda-11.0 in the installation command as below (and I can see that /usr/local/cuda-11.0/bin/nvcc is used during installation):

CUDA_HOME=/usr/local/cuda-11.0 pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

but I am still having the same issues.

Did you use the above command to install apex? Also, did you use a specific branch from apex?

Thanks!
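A note on the two version numbers above: nvidia-smi reports the CUDA version supported by the installed driver, not the toolkit used to compile extensions, so seeing 11.3 there while building with the 11.0 nvcc is not in itself a contradiction. The sketch below illustrates, in simplified form, how setting CUDA_HOME changes which nvcc gets picked up. The function `resolve_nvcc` is hypothetical and only approximates the idea; PyTorch's real lookup considers additional locations.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_nvcc(env: Mapping[str, str]) -> Optional[str]:
    """Return the nvcc path a build would likely use for the given environment.

    Simplified assumption: an explicit CUDA_HOME takes priority,
    otherwise the first nvcc found on PATH is used.
    """
    cuda_home = env.get("CUDA_HOME")
    if cuda_home:
        return os.path.join(cuda_home, "bin", "nvcc")
    # Fall back to searching PATH (returns None if nvcc is not found).
    return shutil.which("nvcc", path=env.get("PATH"))
```

For example, `resolve_nvcc({"CUDA_HOME": "/usr/local/cuda-11.0"})` points at the nvcc under that directory regardless of what is on PATH, which matches the behaviour described in the comment above.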

dmlpt commented 3 years ago

I was finally able to fix it. It was an issue with the latest apex commit: https://github.com/NVIDIA/apex/issues/1155

This solution worked for me: https://github.com/NVIDIA/apex/issues/1155#issuecomment-912389866

Thanks for the assistance!

MidoAssran commented 3 years ago

@dmlpt glad you were able to fix it! Thank you for posting the solution for others as well.