thomasbrandon / mish-cuda

Mish Activation Function for PyTorch
MIT License

How to install mish-cuda when cuda is 11.1? #8

Open yunxi1 opened 3 years ago

yunxi1 commented 3 years ago

My GPU is an RTX 3090, so I have to use CUDA 11. I have already checked that my CUDA 11.1 installation works, but when I run pip install git+https://github.com/thomasbrandon/mish-cuda/ to install mish-cuda, I get an error:

unable to execute ':/usr/local/cuda/bin/nvcc': No such file or directory
error: command ':/usr/local/cuda/bin/nvcc' failed with exit status 1

ERROR: Failed building wheel for mish-cuda

What should I do?

thomasbrandon commented 3 years ago

That colon shouldn't be there. It looks like it's part of the CUDA path detected by torch.utils.cpp_extension (which does the compilation). Check whether the CUDA_HOME environment variable is set and verify its value. If it isn't set, check what "which nvcc" returns. Those are the methods used for detection. Failing that, you'll have to look at the detection/compilation logic to see what's going wrong; see torch/utils/cpp_extension.py#L27.
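The two checks suggested above can be scripted. The following is a minimal sketch using only the standard library; `check_cuda_home` is a hypothetical helper name, and it only approximates the detection described in the comment rather than reproducing PyTorch's actual code.

```python
import os
import shutil


def check_cuda_home(env=None):
    """Return a list of problems found with CUDA_HOME / nvcc discovery.

    Hypothetical helper: it approximates the two checks named in the
    thread (CUDA_HOME first, then a PATH lookup for nvcc).
    """
    env = os.environ if env is None else env
    problems = []
    cuda_home = env.get("CUDA_HOME")
    if cuda_home is None:
        # No CUDA_HOME: the fallback is whatever "which nvcc" would find.
        if shutil.which("nvcc", path=env.get("PATH", "")) is None:
            problems.append("CUDA_HOME is unset and nvcc is not on PATH")
        return problems
    if os.pathsep in cuda_home:
        # CUDA_HOME must be a single directory, not a PATH-style list.
        problems.append(
            f"CUDA_HOME contains the path separator {os.pathsep!r}: {cuda_home!r}"
        )
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append(f"{nvcc} does not exist")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_home():
        print("problem:", problem)
```

An empty result means both checks passed; a value such as ":/usr/local/cuda" would be reported for the stray separator as well as for the missing nvcc binary.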

rafale77 commented 3 years ago

I have been running on CUDA 11.1 with PyTorch 1.6, but the compiled extension fails with PyTorch 1.7, which officially supports CUDA 11. I am getting an import failure caused by an undefined symbol in the compiled library, so I downgraded back to PyTorch 1.6.

This is the error I get:

ImportError: ~/.local/lib/python3.7/site-packages/mish_cuda/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c104impl23ExcludeDispatchKeyGuardC1ENS_11DispatchKeyE

thomasbrandon commented 3 years ago

@rafale77 Did you re-install the extension after upgrading PyTorch? I wouldn't expect binary compatibility across versions, so you need to re-install to re-compile.

rafale77 commented 3 years ago

Yes, I did. I tested it both ways, without recompiling and with recompiling on PyTorch 1.7, and got the same failure. The builds compiled with PyTorch 1.6 and with 1.7 both work fine on PyTorch 1.6.

Edit: As I wrote the line above, I started suspecting a user error: the recompilation never actually happened because the original binary was not overwritten. I uninstalled the extension first, redid the install, and that seems to have fixed the issue.
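For anyone hitting the same stale-binary problem, the uninstall-then-rebuild sequence can be sketched as below. `reinstall_commands` is a hypothetical helper that only builds the pip command lines; the `--no-cache-dir` flag is added on the assumption that a cached wheel could otherwise be reused instead of recompiling.

```python
import sys


def reinstall_commands(package, source):
    """Build the pip invocations for a clean rebuild of an extension.

    Hypothetical helper: uninstall first so no old binary is left in
    site-packages, then install from source without pip's wheel cache.
    """
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", package],
        pip + ["install", "--no-cache-dir", source],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed.
    for cmd in reinstall_commands(
        "mish-cuda", "git+https://github.com/thomasbrandon/mish-cuda/"
    ):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.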

yunxi1 commented 3 years ago

I changed my CUDA version to 11.0, then set the environment variable:
export CUDA_HOME=/usr/local/cuda
A new error appeared:
nvcc fatal : Unsupported gpu architecture 'compute_86'
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
So I added another environment variable to target a lower architecture:
export TORCH_CUDA_ARCH_LIST="7.5"
With that, it works!
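The 'compute_86' error arises because the toolkit in use is older than the GPU's architecture. As a rough sketch, a value for TORCH_CUDA_ARCH_LIST can be picked by clamping the GPU's compute capability to what the toolkit can target. The helper names are hypothetical, and the table is a partial assumption taken from CUDA release notes; verify it for your toolkit. The "+PTX" suffix asks for PTX to be embedded so the driver can JIT-compile for the newer GPU.

```python
# Highest compute capability each CUDA toolkit release can target.
# Partial table, an assumption based on CUDA release notes; extend as needed.
MAX_ARCH_BY_TOOLKIT = {
    (10, 0): (7, 5),
    (11, 0): (8, 0),
    (11, 1): (8, 6),
}


def max_supported_arch(toolkit):
    """Return the newest architecture known for this toolkit version or older."""
    known = [version for version in MAX_ARCH_BY_TOOLKIT if version <= toolkit]
    if not known:
        raise ValueError(f"no table entry for CUDA {toolkit[0]}.{toolkit[1]}")
    return MAX_ARCH_BY_TOOLKIT[max(known)]


def arch_list_for(toolkit, gpu):
    """Suggest a TORCH_CUDA_ARCH_LIST value for a toolkit/GPU pair."""
    limit = max_supported_arch(toolkit)
    if gpu <= limit:
        # The toolkit can target the GPU directly.
        return f"{gpu[0]}.{gpu[1]}"
    # Toolkit is older than the GPU: build for the newest arch it knows
    # and embed PTX so the driver can JIT-compile for the actual GPU.
    return f"{limit[0]}.{limit[1]}+PTX"


if __name__ == "__main__":
    # RTX 3090 (compute capability 8.6) with CUDA 11.0, as in the thread.
    print(arch_list_for((11, 0), (8, 6)))
```

Under these assumptions, CUDA 11.0 with an RTX 3090 yields "8.0+PTX", while CUDA 11.1 or newer yields "8.6" directly.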

Sukeysun commented 2 years ago

Hi, I tried:
-export CUDA_HOME=$CUDA_HOME:/usr/local/cuda
+export CUDA_HOME=/usr/local/cuda
It works.
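This change likely explains the stray colon in the original error: if CUDA_HOME is unset, a shell expands $CUDA_HOME:/usr/local/cuda to :/usr/local/cuda. The sketch below imitates that expansion in Python. `expand_like_shell` is a hypothetical helper that handles only simple $VAR references, not full shell syntax.

```python
import posixpath
from collections import defaultdict
from string import Template


def expand_like_shell(value, env):
    """Expand $VAR references as a POSIX shell would by default:
    variables that are not set expand to the empty string."""
    return Template(value).substitute(defaultdict(str, env))


if __name__ == "__main__":
    # CUDA_HOME is not set yet, so the reference expands to nothing
    # and the separator colon is left at the front of the value.
    cuda_home = expand_like_shell("$CUDA_HOME:/usr/local/cuda", {})
    print(cuda_home)
    # Joining bin/nvcc onto it reproduces the path from the error message.
    print(posixpath.join(cuda_home, "bin", "nvcc"))
```

The PATH-style append pattern works for variables that hold a list of directories, but CUDA_HOME is expected to name a single directory, so it should be assigned outright.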

meanmee commented 2 years ago

hi, I tried:
-export CUDA_HOME=$CUDA_HOME:/usr/local/cuda
+export CUDA_HOME=/usr/local/cuda
it works

It really works bro!