3 errors detected in the compilation of "src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu"

kkaarrss commented 11 months ago

I am getting the following errors:

  /usr/local/cuda/bin/nvcc -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu -o build/temp.linux-x86_64-3.10/src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=quant_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -std=c++17
  /usr/local/lib/python3.10/dist-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
            detected during:
              instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=size_t, one_sided=false, <unnamed>=0]"
  (61): here
              instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=size_t, one_sided=false, <unnamed>=0]"
  /usr/local/lib/python3.10/dist-packages/torch/include/c10/core/TensorImpl.h(77): here

  /usr/local/lib/python3.10/dist-packages/torch/include/c10/util/irange.h(54): warning #186-D: pointless comparison of unsigned integer with zero
            detected during:
              instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator==(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
  (61): here
              instantiation of "__nv_bool c10::detail::integer_iterator<I, one_sided, <unnamed>>::operator!=(const c10::detail::integer_iterator<I, one_sided, <unnamed>> &) const [with I=std::size_t, one_sided=true, <unnamed>=0]"
  /usr/local/lib/python3.10/dist-packages/torch/include/ATen/core/qualified_name.h(73): here

  src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu(974): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (__half *, c10::Half)
            detected during instantiation of "void VecQuant4MatMulKernelFaster(const half2 *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=c10::Half]"
  (895): here

  src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu(1301): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (__half *, c10::Half)
            detected during instantiation of "void VecQuant4MatMulV1KernelFaster(const half2 *, const int *, scalar_t *, const scalar_t *, const scalar_t *, int, int, int, int) [with scalar_t=c10::Half]"
  (1323): here

  src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu(1566): error: no instance of overloaded function "atomicAdd" matches the argument list
              argument types are: (__half *, c10::Half)
            detected during instantiation of "void VecQuant4MatMulKernel_G(const half2 *, const int *, scalar_t *, const scalar_t *, const int *, const int *, int, int, int, int, int) [with scalar_t=c10::Half]"
  (1473): here

  3 errors detected in the compilation of "src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu".
  error: command '/usr/local/cuda/bin/nvcc' failed with exit code 1
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

I am using:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

Before this, I had a newer CUDA version installed, but that failed with a message saying I needed to install this version.

johnsmith0031 commented 11 months ago

For older cards, you can run:

pip uninstall alpaca_lora_4bit
pip install git+https://github.com/johnsmith0031/alpaca_lora_4bit.git@old_compatible

kkaarrss commented 11 months ago

Thanks for the quick reply, that resolved it! I am indeed running an old Tesla P40 and Tesla P4.
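For anyone landing here with the same error: the nvcc command above targets compute_61, and as far as I know CUDA only provides `atomicAdd` on a single `__half` for compute capability 7.0 and higher, which would explain why these kernels fail to compile for Pascal cards like the P40 and P4. Below is a rough sketch that reads the `-gencode` flags out of a build command and tells you whether you are likely to hit this. The function names and the 7.0 threshold constant are my own, not something from the repo.

```python
import re

# Assumption: native atomicAdd(__half*, __half) requires compute capability >= 7.0.
HALF_ATOMIC_ADD_MIN_CC = (7, 0)


def target_capabilities(nvcc_command: str) -> set:
    """Return the (major, minor) capabilities named in sm_XY / compute_XY flags."""
    digits = re.findall(r"\b(?:sm|compute)_(\d+)", nvcc_command)
    # The last digit is the minor version, everything before it is the major.
    return {(int(d[:-1]), int(d[-1])) for d in digits}


def lacks_half_atomic_add(nvcc_command: str) -> bool:
    """True if any targeted architecture is below the assumed 7.0 threshold."""
    return any(cc < HALF_ATOMIC_ADD_MIN_CC for cc in target_capabilities(nvcc_command))


if __name__ == "__main__":
    cmd = (
        "nvcc -gencode=arch=compute_61,code=compute_61 "
        "-gencode=arch=compute_61,code=sm_61 -std=c++17"
    )
    if lacks_half_atomic_add(cmd):
        print("Pre-7.0 target found: try the old_compatible branch.")
```

If this reports a pre-7.0 target, installing from the `old_compatible` branch as suggested above is the thing to try.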

johnsmith0031 / alpaca_lora_4bit

3 errors detected in the compilation of "src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu" #150