huggingface / transformers-bloom-inference

Fast Inference Solutions for BLOOM
Apache License 2.0

BUILD ERROR with nvcc #81

Closed tohneecao closed 1 year ago

tohneecao commented 1 year ago

When I run ds_inference.py, I get this error:

```
Emitting ninja build file /root/.cache/torch_extensions/py37_cu117/transformer_inference/build.ninja...
Building extension module transformer_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/5] /usr/local/cuda-12.1/bin/nvcc -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/usr/lib/python3.7/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/usr/lib/python3.7/site-packages/deepspeed/ops/csrc/includes -isystem /usr/lib/python3.7/site-packages/torch/include -isystem /usr/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /usr/lib/python3.7/site-packages/torch/include/TH -isystem /usr/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-12.1/include -isystem /usr/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=compute_80 -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' -std=c++14 -c /usr/lib/python3.7/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/apply_rotary_pos_emb.cu -o apply_rotary_pos_emb.cuda.o
FAILED: apply_rotary_pos_emb.cuda.o
```

Why is nvcc compiling with `-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__`, which generates the error?

How can I turn these off?

mayank31398 commented 1 year ago

Can you share exactly how you are running it?

tohneecao commented 1 year ago

The problem may be the CUDA version: my CUDA version is 12.1, while my torch version is 1.13.1+cu117. Changing my CUDA version to 11.7, matching torch, solved the problem.
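A quick way to spot this kind of mismatch is to compare the CUDA version embedded in the torch wheel's version string against the system toolkit. A minimal sketch (the parsing of the `+cuXYZ` suffix is an assumption about the wheel version format; in practice you would read the values from `torch.__version__` and `nvcc --version` rather than hardcode them):

```python
import re

def cuda_from_torch_version(torch_version: str) -> str:
    """Extract the CUDA version a torch wheel was built for, e.g. '1.13.1+cu117' -> '11.7'."""
    m = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if not m:
        raise ValueError(f"no CUDA suffix in {torch_version!r}")
    return f"{m.group(1)}.{m.group(2)}"

# Hardcoded for illustration; normally taken from torch.__version__ and `nvcc --version`.
system_cuda = "12.1"
wheel_cuda = cuda_from_torch_version("1.13.1+cu117")

print(wheel_cuda)                  # 11.7
print(system_cuda == wheel_cuda)   # False -> toolkit and wheel disagree
```

If the two disagree, JIT-compiled extensions (like DeepSpeed's `transformer_inference`) are built by the system nvcc against headers from a different CUDA release than torch expects, which is a common source of such build failures.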

Alternatively: the error happens when `-D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__` are passed during compilation, which is determined in `/usr/lib/python3.7/site-packages/torch/utils/cpp_extension.py`.

[Screenshot 2023-04-26 13:54:21]

When I changed those macro definitions, it also worked.
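Rather than editing `cpp_extension.py` in place, the same effect can be had by filtering the flag list at runtime before the extension is built. A minimal sketch of the filtering logic (the hardcoded list below mirrors torch's `COMMON_NVCC_FLAGS` as of torch 1.13; in a real script you would apply the same filter to `torch.utils.cpp_extension.COMMON_NVCC_FLAGS` itself, and note this monkey-patch is a workaround, not an officially supported API):

```python
# Mirrors torch.utils.cpp_extension.COMMON_NVCC_FLAGS (torch 1.13),
# hardcoded here so the sketch runs without torch installed.
COMMON_NVCC_FLAGS = [
    "-D__CUDA_NO_HALF_OPERATORS__",
    "-D__CUDA_NO_HALF_CONVERSIONS__",
    "-D__CUDA_NO_BFLOAT16_CONVERSIONS__",
    "-D__CUDA_NO_HALF2_OPERATORS__",
    "--expt-relaxed-constexpr",
]

def drop_half_macros(flags):
    """Remove the __CUDA_NO_*__ defines that conflict with the DeepSpeed kernels."""
    return [f for f in flags if "__CUDA_NO" not in f]

patched = drop_half_macros(COMMON_NVCC_FLAGS)
print(patched)  # ['--expt-relaxed-constexpr']
```

In a real run you would do `import torch.utils.cpp_extension as ce; ce.COMMON_NVCC_FLAGS[:] = drop_half_macros(ce.COMMON_NVCC_FLAGS)` before the DeepSpeed extension is JIT-compiled, so the patched list is picked up when nvcc is invoked.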

mayank31398 commented 1 year ago

Closing this.