sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Apache License 2.0

RuntimeError: The detected CUDA version (12.2) mismatches the version that was used to compile PyTorch (11.8). #46

Closed trungpx closed 7 months ago

trungpx commented 7 months ago

Hi authors,

I am trying to install Adan with the documented command, "python3 -m pip install git+https://github.com/sail-sg/Adan.git", but the build fails with the error below. I have checked that torch is installed and works. Do you have any suggestions for installing it? I have no idea how to fix this error.

Building wheels for collected packages: adan
  Building wheel for adan (setup.py) ... error
  error: subprocess-exited-with-error
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [54 lines of output]
      running bdist_wheel
      /root/miniconda3/envs/neurips24/lib/python3.8/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
        warnings.warn(msg.format('we could not find ninja.'))
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-38
      copying adan.py -> build/lib.linux-x86_64-cpython-38
      running build_ext
      Traceback (most recent call last):
        File "", line 2, in
        File "", line 34, in
        File "/tmp/pip-req-build-wcs6gasc/setup.py", line 20, in
          setup(
        File "/root/miniconda3/envs/xxx/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/root/miniconda3/envs/xxx/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        ...
          raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
      RuntimeError: The detected CUDA version (12.2) mismatches the version that was used to compile PyTorch (11.8). Please make sure to use the same CUDA versions.
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for adan
  Running setup.py clean for adan
Failed to build adan
ERROR: Could not build wheels for adan, which is required to install pyproject.toml-based projects

I checked torch itself and it works fine: [image]

XingyuXie commented 7 months ago

@trungpx

It seems that the torch in your Conda environment was built with CUDA 11.8, while the global nvcc is 12.2. Before compiling Adan, point the build at a CUDA toolkit that matches your torch, for example the one installed in your Conda environment:

export CUDA_HOME=/home/xyxie/miniconda3/envs/xxx   # path to the env that has CUDA installed
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
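
To confirm which two versions the build is comparing, a small check along the following lines may help. This is only a sketch, not part of Adan: the helper names `nvcc_release` and `check_cuda_match` are made up for illustration, and it assumes `nvcc --version` prints a line containing `release X.Y`, as current CUDA toolkits do.

```shell
# Hypothetical helpers (not part of Adan or PyTorch).

# Read `nvcc --version` output on stdin and print the "major.minor" release.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the toolkit version ($1) with the CUDA version torch was built with ($2).
# Prints a short report and returns non-zero on a mismatch.
check_cuda_match() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
  else
    echo "mismatch: nvcc=$1 torch=$2"
    return 1
  fi
}

# Usage on a machine with nvcc and torch available:
#   toolkit=$(nvcc --version | nvcc_release)
#   torch_cuda=$(python3 -c "import torch; print(torch.version.cuda)")
#   check_cuda_match "$toolkit" "$torch_cuda"
```

Running the usage lines after the exports above shows whether the `nvcc` now first on `PATH` agrees with `torch.version.cuda`.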
trungpx commented 7 months ago

Thanks for your support. I did not expect such a quick response, so I spent several hours searching and ended up reinstalling torch to match CUDA 12.2, and it works now. I will try your solution the next time I use a different version of torch.