CUDA extension compilation error: Unsupported GPU architecture 'compute_86'

Description:

Running the 01_6B_chat.sh script fails while DeepSpeed builds the cpu_adam extension: nvcc reports that it does not support the GPU architecture 'compute_86'. The full error output and my environment details are below.

Installed CUDA version 11.0 does not match the version torch was compiled with 11.8 but since the APIs are compatible, accepting this combination
Using /home/bizon/.cache/torch_extensions/py310_cu118 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/bizon/.cache/torch_extensions/py310_cu118/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output custom_cuda_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/TH -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/bizon/anaconda3/envs/templora/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 -DBF16_AVAILABLE -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -c /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/deepspeed/ops/csrc/common/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o 
FAILED: custom_cuda_kernel.cuda.o 
/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output custom_cuda_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/TH -isystem /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/bizon/anaconda3/envs/templora/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 -DBF16_AVAILABLE -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -c /home/bizon/anaconda3/envs/templora/lib/python3.10/site-packages/deepspeed/ops/csrc/common/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o 
nvcc fatal   : Unsupported gpu architecture 'compute_86'
ninja: build stopped: subcommand failed.

Environment:

CUDA version: 11.0
PyTorch version: Compiled with CUDA 11.8
DeepSpeed version: 0.13.4
Python version: 3.10
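
One possible reading of the log: the nvcc at /usr/local/cuda/bin/nvcc comes from the CUDA 11.0 toolkit, and support for compute_86 / sm_86 was only added to nvcc in CUDA 11.1, so the mismatch with the 11.8 build of PyTorch may be the cause. The Python sketch below only illustrates that version check. The table `MIN_CUDA_FOR_ARCH` and the helper names `parse_version` and `nvcc_supports` are hypothetical and written for this example; they are not part of PyTorch or DeepSpeed, and the table lists only the two Ampere architectures relevant here.

```python
# Minimum CUDA toolkit release whose nvcc accepts each architecture.
# Hypothetical lookup table for illustration; only two entries are listed.
MIN_CUDA_FOR_ARCH = {
    "80": (11, 0),  # compute_80 / sm_80
    "86": (11, 1),  # compute_86 / sm_86
}


def parse_version(text):
    """Turn a version string such as '11.0' into a (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def nvcc_supports(arch, cuda_version):
    """Return True if nvcc from `cuda_version` should accept `arch`.

    `arch` is a name like 'compute_86' or 'sm_86'. Raises KeyError for
    architectures missing from the table above.
    """
    digits = arch.split("_")[-1]
    required = MIN_CUDA_FOR_ARCH[digits]
    return parse_version(cuda_version) >= required


if __name__ == "__main__":
    # Compare the installed toolkit (11.0) with the one torch was built for (11.8).
    for toolkit in ("11.0", "11.8"):
        print(toolkit, nvcc_supports("compute_86", toolkit))
```

If this reading is right, pointing the build at a CUDA 11.1 or newer toolkit (for example the 11.8 release that PyTorch was compiled with) would be the direct fix. I have not verified that on this machine.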
