RuntimeError: Error building extension 'lightseq_layers_new'

xiaotingyun commented 2 years ago

Running python lightseq/examples/inference/python/export/huggingface/hf_bart_export.py fails with RuntimeError: Error building extension 'lightseq_layers_new'.

Environment: PyTorch 1.12.1, CUDA 10.2, cuDNN 8.4.3.

Taka152 commented 2 years ago

The master branch is unstable. I recommend checking out a release tag and following the README to run our example; in particular, note the step about changing the working directory.

xiaotingyun commented 2 years ago

I am using release version 3.0.0.

Taka152 commented 2 years ago

3.0.0 may require CUDA 11. You could check out version 2.2.1 and try that.

dulante00 commented 2 years ago

I tried CUDA 11 and PyTorch 1.11.0 with python lightseq/examples/inference/python/export/huggingface/hf_bert_export.py (tag 3.0.1), and the problem still occurs every time. Please document clearly which environment each tag or release version depends on. Thanks.

xiaotingyun commented 2 years ago

Hello, I got acceleration working with PyTorch 1.12, CUDA 10.2, and LightSeq 2.2.0. You can try that combination.

dulante00 commented 2 years ago

Thanks, your combination of versions works. Got it.

nghuyong commented 1 year ago

I have the same problem. With CUDA 11 and LightSeq 3.0, what is the right environment for running hf_bert_export.py, including torch, tensorflow, torchvision, etc.? It would help if these examples provided a requirements.txt. Thanks a lot. @Taka152
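Since several reports in this thread differ only in their version combination, a small script that prints the relevant versions can make them easier to compare. The following is a minimal sketch, not part of LightSeq: the function name `describe_env` is hypothetical, and it assumes that the installed package is registered under the distribution name `lightseq`. It reads the LightSeq version from package metadata rather than importing the package, because importing may itself trigger the failing build.

```python
import shutil
import subprocess
from importlib import metadata


def describe_env():
    """Collect version strings relevant to the extension build.

    Hypothetical helper: it does not import lightseq, only reads its metadata.
    """
    info = {"torch": None, "torch_cuda": None, "lightseq": None, "nvcc": None}

    # PyTorch version and the CUDA version PyTorch was built against.
    try:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda
    except Exception:
        pass  # torch missing or broken; leave the fields as None

    # Installed LightSeq version (assumes the distribution is named "lightseq").
    try:
        info["lightseq"] = metadata.version("lightseq")
    except metadata.PackageNotFoundError:
        pass

    # Version line reported by the nvcc found on PATH, if any.
    nvcc = shutil.which("nvcc")
    if nvcc:
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        info["nvcc"] = next((line for line in out.splitlines() if "release" in line), None)

    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the release tag gives a report that can be checked directly against the one combination confirmed to work in this thread (PyTorch 1.12, CUDA 10.2, LightSeq 2.2.0).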

etoilestar commented 1 year ago

Hello, I have the same problem when running GPT-2. The CUDA version is 11.6.

520jefferson commented 1 year ago

worked, thx

jimmieliu commented 1 year ago

CUDA 11.6, torch 1.11: same problem.

[7/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=lightseq_layers_new -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/ops_new/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/lsflow/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/layers_new/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c /opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu -o cuda_util.cuda.o
FAILED: cuda_util.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=lightseq_layers_new -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/ops_new/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/lsflow/includes -I/opt/conda/lib/python3.8/site-packages/lightseq/csrc/layers_new/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c /opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu -o cuda_util.cuda.o
/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu(218): error: identifier "__hisnan" is undefined

1 error detected in the compilation of "/opt/conda/lib/python3.8/site-packages/lightseq/csrc/kernels/cuda_util.cu".
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1726, in _run_ninja_build
    subprocess.run(
  File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    from lightseq.training.ops.pytorch.quant_linear_layer import LSQuantLinearLayer
  File "/opt/conda/lib/python3.8/site-packages/lightseq/training/__init__.py", line 1, in <module>
    from lightseq.training.ops.pytorch.transformer_embedding_layer import (
  File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/__init__.py", line 11, in <module>
    layer_cuda_module = LayerBuilder().load()
  File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/builder/builder.py", line 203, in load
    return self.jit_load(verbose)
  File "/opt/conda/lib/python3.8/site-packages/lightseq/training/ops/pytorch/builder/builder.py", line 231, in jit_load
    op_module = load(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1130, in load
    return _jit_compile(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1343, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1455, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1742, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'lightseq_layers_new'
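The RuntimeError at the end of the traceback only says that the build failed; the actual cause is the single nvcc error buried in the ninja output. The sketch below pulls the nvcc error lines and the `-gencode` target architectures out of such a log. The function name `summarize_build_log` and both regular expressions are illustrative assumptions based on the log format shown above, not part of LightSeq or PyTorch.

```python
import re

# nvcc reports errors as: path/to/file.cu(LINE): error: MESSAGE
NVCC_ERROR = re.compile(r'(?P<file>\S+\.cuh?)\((?P<line>\d+)\): error: (?P<msg>.+)')
# Target architectures appear as: -gencode=arch=compute_80,code=sm_80
GENCODE = re.compile(r'-gencode=arch=compute_(\d+),code=(?:sm|compute)_\d+')


def summarize_build_log(log):
    """Return (errors, arches) parsed from an nvcc/ninja build log.

    errors: list of (file, line_number, message) tuples
    arches: sorted list of distinct compute capabilities, e.g. 52 for 5.2
    """
    errors = [
        (m["file"], int(m["line"]), m["msg"].strip())
        for m in NVCC_ERROR.finditer(log)
    ]
    arches = sorted({int(a) for a in GENCODE.findall(log)})
    return errors, arches


if __name__ == "__main__":
    # Shortened sample in the same shape as the log above (paths abbreviated).
    sample = (
        "nvcc -gencode=arch=compute_52,code=sm_52 "
        "-gencode=arch=compute_80,code=sm_80 "
        "-gencode=arch=compute_80,code=compute_80 "
        "-c /x/cuda_util.cu -o cuda_util.cuda.o\n"
        '/x/cuda_util.cu(218): error: identifier "__hisnan" is undefined\n'
    )
    errors, arches = summarize_build_log(sample)
    print(errors)
    print(arches)
```

One possible reading of the result, which nobody in this thread has confirmed: the build targets compute capability 5.2 among others, and in CUDA 11 headers half-precision comparison intrinsics such as `__hisnan` appear to be declared only for compute capability 5.3 and above, so the `compute_52` target may be what triggers the error. If that is the cause, narrowing `TORCH_CUDA_ARCH_LIST` (an environment variable read by `torch.utils.cpp_extension`) to the architecture of the GPU actually in use before the extension is built might help. Whether LightSeq's builder honors that variable in every code path is an assumption.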

bytedance / lightseq

RuntimeError: Error building extension 'lightseq_layers_new' #406