AutoGPTQ / AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
MIT License
4.48k stars 484 forks

Failure to install from source when using Docker #283

Closed jackaihfia2334 closed 7 months ago

jackaihfia2334 commented 1 year ago

```
root@docker-desktop:/data1/llm/code/AutoGPTQ# pip install .
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing /data1/llm/code/AutoGPTQ
  Preparing metadata (setup.py) ... done
Requirement already satisfied: accelerate>=0.19.0 in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (0.21.0)
Requirement already satisfied: datasets in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (2.14.4)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (1.24.4)
Collecting rouge (from auto-gptq==0.4.1+cu1211009)
  Downloading rouge-1.0.1-py3-none-any.whl (13 kB)
Requirement already satisfied: torch>=1.13.0 in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (2.1.0a0+b5021ba)
Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (0.3.1)
Requirement already satisfied: transformers>=4.31.0 in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (4.31.0)
Requirement already satisfied: peft in /usr/local/lib/python3.10/dist-packages (from auto-gptq==0.4.1+cu1211009) (0.4.0)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.19.0->auto-gptq==0.4.1+cu1211009) (23.1)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.19.0->auto-gptq==0.4.1+cu1211009) (5.9.4)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.19.0->auto-gptq==0.4.1+cu1211009) (6.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (3.12.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (4.7.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (2.6.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (3.1.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (2023.6.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.14.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (0.16.4)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (2023.6.3)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (2.31.0)
Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (0.13.3)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (4.65.0)
Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (11.0.0)
Requirement already satisfied: dill<0.3.8,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (0.3.7)
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (2.0.3)
Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (3.3.0)
Requirement already satisfied: multiprocess in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (0.70.15)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets->auto-gptq==0.4.1+cu1211009) (3.8.4)
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from rouge->auto-gptq==0.4.1+cu1211009) (1.16.0)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (23.1.0)
Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (3.1.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (4.0.2)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (1.3.3)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets->auto-gptq==0.4.1+cu1211009) (1.3.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers>=4.31.0->auto-gptq==0.4.1+cu1211009) (2023.5.7)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (2.1.3)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets->auto-gptq==0.4.1+cu1211009) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets->auto-gptq==0.4.1+cu1211009) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets->auto-gptq==0.4.1+cu1211009) (2023.3)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.13.0->auto-gptq==0.4.1+cu1211009) (1.3.0)
Building wheels for collected packages: auto-gptq
  Building wheel for auto-gptq (setup.py) ... error
  error: subprocess-exited-with-error
```

```
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [219 lines of output]
    conda_cuda_include_dir /usr/lib/python3/dist-packages/nvidia/cuda_runtime/include
    running bdist_wheel
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.10
    creating build/lib.linux-x86_64-3.10/auto_gptq
    copying auto_gptq/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq
    creating build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    copying auto_gptq/eval_tasks/language_modeling_task.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    copying auto_gptq/eval_tasks/sequence_classification_task.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    copying auto_gptq/eval_tasks/text_summarization_task.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    copying auto_gptq/eval_tasks/_base.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    copying auto_gptq/eval_tasks/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks
    creating build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/auto.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/baichuan.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/bloom.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/codegen.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/gpt2.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/gptj.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/gpt_bigcode.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/gpt_neox.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/internlm.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/llama.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/moss.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/opt.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/qwen.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/rw.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/_base.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/_const.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    copying auto_gptq/modeling/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/modeling
    creating build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    copying auto_gptq/nn_modules/fused_gptj_attn.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    copying auto_gptq/nn_modules/fused_llama_attn.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    copying auto_gptq/nn_modules/fused_llama_mlp.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    copying auto_gptq/nn_modules/_fused_base.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    copying auto_gptq/nn_modules/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules
    creating build/lib.linux-x86_64-3.10/auto_gptq/quantization
    copying auto_gptq/quantization/gptq.py -> build/lib.linux-x86_64-3.10/auto_gptq/quantization
    copying auto_gptq/quantization/quantizer.py -> build/lib.linux-x86_64-3.10/auto_gptq/quantization
    copying auto_gptq/quantization/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/quantization
    creating build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/data_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/exllama_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/import_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/peft_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/perplexity_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    copying auto_gptq/utils/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/utils
    creating build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks/_utils
    copying auto_gptq/eval_tasks/_utils/classification_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks/_utils
    copying auto_gptq/eval_tasks/_utils/generation_utils.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks/_utils
    copying auto_gptq/eval_tasks/_utils/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/eval_tasks/_utils
    creating build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    copying auto_gptq/nn_modules/qlinear/qlinear_cuda.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    copying auto_gptq/nn_modules/qlinear/qlinear_cuda_old.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    copying auto_gptq/nn_modules/qlinear/qlinear_exllama.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    copying auto_gptq/nn_modules/qlinear/qlinear_triton.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    copying auto_gptq/nn_modules/qlinear/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/qlinear
    creating build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/triton_utils
    copying auto_gptq/nn_modules/triton_utils/custom_autotune.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/triton_utils
    copying auto_gptq/nn_modules/triton_utils/kernels.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/triton_utils
    copying auto_gptq/nn_modules/triton_utils/mixin.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/triton_utils
    copying auto_gptq/nn_modules/triton_utils/__init__.py -> build/lib.linux-x86_64-3.10/auto_gptq/nn_modules/triton_utils
    running build_ext
    building 'autogptq_cuda_64' extension
    creating /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10
    creating /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda
    Emitting ninja build file /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/build.ninja...
    Compiling objects...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    [1/2] c++ -MMD -MF /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda/autogptq_cuda_64.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/data1/llm/code/AutoGPTQ/autogptq_cuda -I/usr/include/python3.10 -c -c /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_64.cpp -o /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda/autogptq_cuda_64.o -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=autogptq_cuda_64 -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
    [2/2] /usr/local/cuda/bin/nvcc -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/data1/llm/code/AutoGPTQ/autogptq_cuda -I/usr/include/python3.10 -c -c /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu -o /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda/autogptq_cuda_kernel_64.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=autogptq_cuda_64 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17
    FAILED: /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda/autogptq_cuda_kernel_64.o
    /usr/local/cuda/bin/nvcc -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/data1/llm/code/AutoGPTQ/autogptq_cuda -I/usr/include/python3.10 -c -c /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu -o /data1/llm/code/AutoGPTQ/build/temp.linux-x86_64-3.10/autogptq_cuda/autogptq_cuda_kernel_64.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=autogptq_cuda_64 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -std=c++17
    /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(62): error: no suitable conversion function from "__half_raw" to "int" exists
          half tmpres = __hadd(hsum, val);
                        ^
```

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1167): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xf][off], scale, zero), blockvec[k + 0], res2);
                       ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1167): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xf][off], scale, zero), blockvec[k + 0], res2);
               ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1301): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp1 >> 0) & 0x3f][off], scale, zero), blockvec[k + 0], res2);
                       ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1301): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp1 >> 0) & 0x3f][off], scale, zero), blockvec[k + 0], res2);
               ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1419): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xff][off], scale, zero), blockvec[k + 0], res2);
                       ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1419): error: identifier "__hfma2" is undefined
        res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xff][off], scale, zero), blockvec[k + 0], res2);
               ^

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(332): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
        atomicAdd(&mul[b * width + w], res);
        ^
            detected during instantiation of "void VecQuant2MatMulKernel(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=double]" at line 270

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(477): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
        atomicAdd(&mul[b * width + w], res);
        ^
            detected during instantiation of "void VecQuant3MatMulKernel(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=double]" at line 357

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(565): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
        atomicAdd(&mul[b * width + w], res);
        ^
            detected during instantiation of "void VecQuant4MatMulKernel(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=double]" at line 502

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(652): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
        atomicAdd(&mul[b * width + w], res);
        ^
            detected during instantiation of "void VecQuant8MatMulKernel(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=double]" at line 590

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(750): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
      atomicAdd(&mul[b * width + w], res);
      ^
            detected during instantiation of "void VecQuant2MatMulKernel_old(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, int, int, int, int, int, int) [with scalar_t=double]" at line 679

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(909): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
      atomicAdd(&mul[b * width + w], res);
      ^
            detected during instantiation of "void VecQuant3MatMulKernel_old(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, int, int, int, int, int, int) [with scalar_t=double]" at line 774

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(996): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
      atomicAdd(&mul[b * width + w], res);
      ^
            detected during instantiation of "void VecQuant4MatMulKernel_old(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, int, int, int, int, int, int) [with scalar_t=double]" at line 933

  /data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu(1079): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (double *, double)
      atomicAdd(&mul[b * width + w], res);
      ^
            detected during instantiation of "void VecQuant8MatMulKernel_old(const scalar_t *, const int *, scalar_t *, const scalar_t *, const int *, int, int, int, int, int, int) [with scalar_t=double]" at line 1020

  15 errors detected in the compilation of "/data1/llm/code/AutoGPTQ/autogptq_cuda/autogptq_cuda_kernel_64.cu".
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1902, in _run_ninja_build
      subprocess.run(
    File "/usr/lib/python3.10/subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/data1/llm/code/AutoGPTQ/setup.py", line 147, in <module>
      setup(
    File "/usr/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 107, in setup
      return distutils.core.setup(**attrs)
    File "/usr/lib/python3.10/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/usr/lib/python3.10/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/wheel/bdist_wheel.py", line 343, in run
      self.run_command("build")
    File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/usr/lib/python3.10/distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/usr/lib/python3.10/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1234, in run_command
      super().run_command(command)
    File "/usr/lib/python3.10/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 84, in run
      _build_ext.run(self)
    File "/usr/local/lib/python3.10/dist-packages/Cython/Distutils/old_build_ext.py", line 186, in run
      _build_ext.build_ext.run(self)
    File "/usr/lib/python3.10/distutils/command/build_ext.py", line 340, in run
      self.build_extensions()
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 848, in build_extensions
      build_ext.build_extensions(self)
    File "/usr/local/lib/python3.10/dist-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
      _build_ext.build_ext.build_extensions(self)
    File "/usr/lib/python3.10/distutils/command/build_ext.py", line 449, in build_extensions
      self._build_extensions_serial()
    File "/usr/lib/python3.10/distutils/command/build_ext.py", line 474, in _build_extensions_serial
      self.build_extension(ext)
    File "/usr/local/lib/python3.10/dist-packages/setuptools/command/build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "/usr/lib/python3.10/distutils/command/build_ext.py", line 529, in build_extension
      objects = self.compiler.compile(sources,
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 661, in unix_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1575, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "/usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py", line 1918, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for auto-gptq
Running setup.py clean for auto-gptq
Failed to build auto-gptq
ERROR: Could not build wheels for auto-gptq, which is required to install pyproject.toml-based projects
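Both error families in the log above are consistent with a single root cause: the kernels are being compiled for GPU architectures older than the intrinsics they use. As a rough sketch of that check (the minimum compute capabilities below are my reading of the CUDA documentation, not something stated in this thread: half2 math such as `__hfma2` needs 5.3+, and the `atomicAdd(double*, double)` overload needs 6.0+, so a `compute_52` target fails on both):

```python
# Minimum compute capability (major, minor) required by the intrinsics
# that fail in the build log. These thresholds are an assumption drawn
# from CUDA documentation, not from this issue thread.
MIN_CC = {
    "__hfma2": (5, 3),                      # half2 fused multiply-add
    "atomicAdd(double*, double)": (6, 0),   # double-precision atomicAdd
}

def supported(intrinsic: str, cc: tuple[int, int]) -> bool:
    """Return True if a device of compute capability `cc` provides `intrinsic`."""
    return cc >= MIN_CC[intrinsic]

# sm_52 (Maxwell) lacks both intrinsics, matching the compiler errors.
print(supported("__hfma2", (5, 2)))                     # False
print(supported("atomicAdd(double*, double)", (5, 2)))  # False
print(supported("atomicAdd(double*, double)", (8, 0)))  # True
```

This is why the same source compiles fine when the oldest `-gencode` targets are dropped from the build.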

PanQiWei commented 1 year ago

Hi, do you have a C++ compiler installed on your computer?

fxmarty commented 1 year ago

Looks similar: https://github.com/PanQiWei/AutoGPTQ/issues/194

jackaihfia2334 commented 1 year ago

> Hi, do you have a C++ compiler installed on your computer?

Of course. I think it is a problem with Docker.


pranjali97 commented 1 year ago

Hi! I am getting the same error since yesterday as well. Worked fine until then.

jackaihfia2334 commented 1 year ago

> Hi! I am getting the same error since yesterday as well. Worked fine until then.

Can you tell me how to fix this error?

howardgriffin commented 1 year ago

Same problem here; I am using CUDA 12.0. How can I solve it?

fxmarty commented 1 year ago

Hi @howardgriffin, are you using Docker? What is your base image?

@jackaihfia2334 Could you share your dockerfile?

ZZBoom commented 9 months ago

Docker image: nvcr.io/nvidia/pytorch:23.04-py3

The install failed while running this command:

```
/usr/local/cuda/bin/nvcc -I/usr/local/lib/python3.8/dist-packages/torch/include -I/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.8/dist-packages/torch/include/TH -I/usr/local/lib/python3.8/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/tmp/pip-req-build-6a2ov6bj/autogptq_cuda -I/usr/include/python3.8 -c -c autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu -o /tmp/pip-req-build-6a2ov6bj/build/temp.linux-x86_64-3.8/autogptq_extension/cuda_64/autogptq_cuda_kernel_64.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1013"' -DTORCH_EXTENSION_NAME=autogptq_cuda_64 -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -ccbin g++ -std=c++17
```

ERROR INFO:

```
autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(62): error: no suitable conversion function from "__half_raw" to "int" exists
      half tmpres = __hadd(hsum, val);
                    ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1169): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xf][off], scale, zero), blockvec[k + 0], res2);
             ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1169): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xf][off], scale, zero), blockvec[k + 0], res2);
             ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1303): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp1 >> 0) & 0x3f][off], scale, zero), blockvec[k + 0], res2);
             ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1303): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp1 >> 0) & 0x3f][off], scale, zero), blockvec[k + 0], res2);
             ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1421): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xff][off], scale, zero), blockvec[k + 0], res2);
             ^

autogptq_extension/cuda_64/autogptq_cuda_kernel_64.cu(1421): error: identifier "__hfma2" is undefined
      res2 = __hfma2(__hfma2(deq2[(tmp >> 0) & 0xff][off], scale, zero), blockvec[k + 0], res2);
             ^
...
```

1100111GTH commented 7 months ago

Same

fxmarty commented 7 months ago

@jackaihfia2334 @pranjali97 @howardgriffin @ZZBoom `-gencode=arch=compute_52,code=sm_52` (Maxwell) is not supported.

I merged https://github.com/AutoGPTQ/AutoGPTQ/pull/622, which hints that you should set the environment variable `TORCH_CUDA_ARCH_LIST` to fix the error (see https://github.com/pytorch/pytorch/blob/v2.2.2/setup.py#L135-L139).
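To illustrate how that environment variable drives the build, here is a rough sketch (my simplified approximation, not PyTorch's actual implementation) of how a `TORCH_CUDA_ARCH_LIST` string is turned into the `-gencode` flags seen in the failing command; restricting the list to modern architectures keeps `compute_52` out of the compile:

```python
import os

def arch_flags(arch_list: str) -> list[str]:
    """Sketch of how torch.utils.cpp_extension maps a TORCH_CUDA_ARCH_LIST
    string (e.g. "8.0;8.6;9.0+PTX") to nvcc -gencode flags. Simplified:
    the real logic also handles named architectures and auto-detection."""
    flags = []
    for arch in arch_list.replace(";", " ").split():
        wants_ptx = arch.endswith("+PTX")
        num = arch.removesuffix("+PTX").replace(".", "")
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if wants_ptx:
            # Also embed PTX for forward compatibility with newer GPUs.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags

# e.g. before running `pip install .`:
os.environ["TORCH_CUDA_ARCH_LIST"] = "8.0;8.6;9.0+PTX"
print(arch_flags(os.environ["TORCH_CUDA_ARCH_LIST"]))
```

With a list like `"8.0;8.6;9.0+PTX"`, no `compute_52` target is emitted, so the Maxwell-incompatible intrinsics are never compiled for an architecture that lacks them.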