turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

File not found when compiling exllama_ext #145

Closed. Flameish closed this issue 1 year ago.

Flameish commented 1 year ago

OS: Fedora 38
GPU: RX 6700 XT (ROCm)

When trying to run the webui, compiling exllama_ext fails with multiple "file not found" errors. The second error is repeated 10 times, so I left the repetitions out of the log.

gcc, g++, cmake, libstdc++ etc. are installed.


(base) [myuser@cruiser exllama]$ python test_benchmark_inference.py -d /home/myuser/Downloads/models/TheBloke_WizardLM-7B-uncensored-GPTQ -p -ppl
Successfully preprocessed all matching files.
Traceback (most recent call last):
  File "/home/myuser/miniconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/home/myuser/miniconda3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/myuser/Projects/llm/exllama/test_benchmark_inference.py", line 1, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "/home/myuser/Projects/llm/exllama/model.py", line 12, in <module>
    import cuda_ext
  File "/home/myuser/Projects/llm/exllama/cuda_ext.py", line 43, in <module>
    exllama_ext = load(
  File "/home/myuser/miniconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/home/myuser/miniconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1509, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/myuser/miniconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1624, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/myuser/miniconda3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1909, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'exllama_ext': [1/12] c++ -MMD -MF rep_penalty.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/myuser/Projects/llm/exllama/exllama_ext -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THC -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THH -isystem /usr/include -isystem /usr/miopen/include -isystem /usr/hip/include -isystem /home/myuser/miniconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/myuser/Projects/llm/exllama/exllama_ext/cpu_func/rep_penalty.cpp -o rep_penalty.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
FAILED: rep_penalty.o 
c++ -MMD -MF rep_penalty.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/myuser/Projects/llm/exllama/exllama_ext -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THC -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THH -isystem /usr/include -isystem /usr/miopen/include -isystem /usr/hip/include -isystem /home/myuser/miniconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c /home/myuser/Projects/llm/exllama/exllama_ext/cpu_func/rep_penalty.cpp -o rep_penalty.o -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1
In file included from /home/myuser/Projects/llm/exllama/exllama_ext/cpu_func/rep_penalty.cpp:2:
/usr/include/c++/13/cstdlib:79:15: fatal error: stdlib.h: No such file or directory
   79 | #include_next <stdlib.h>
      |               ^~~~~~~~~~
compilation terminated.

[3/12] /usr/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/myuser/Projects/llm/exllama/exllama_ext -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THC -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THH -isystem /usr/include -isystem /usr/miopen/include -isystem /usr/hip/include -isystem /home/myuser/miniconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /home/myuser/Projects/llm/exllama/exllama_ext/hip_func/q4_matrix.hip -o q4_matrix.cuda.o 
FAILED: q4_matrix.cuda.o 
/usr/bin/hipcc  -DWITH_HIP -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/myuser/Projects/llm/exllama/exllama_ext -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/TH -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THC -isystem /home/myuser/miniconda3/lib/python3.10/site-packages/torch/include/THH -isystem /usr/include -isystem /usr/miopen/include -isystem /usr/hip/include -isystem /home/myuser/miniconda3/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -fPIC -D__HIP_PLATFORM_HCC__=1 -DUSE_ROCM=1 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -lineinfo -U__HIP_NO_HALF_CONVERSIONS__ -O3 --amdgpu-target=gfx900 --amdgpu-target=gfx906 --amdgpu-target=gfx908 --amdgpu-target=gfx90a --amdgpu-target=gfx1030 -fno-gpu-rdc -c /home/myuser/Projects/llm/exllama/exllama_ext/hip_func/q4_matrix.hip -o q4_matrix.cuda.o 
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang-16: warning: -lineinfo: 'linker' input unused [-Wunused-command-line-argument]
In file included from <built-in>:1:
In file included from /usr/lib64/clang/16/include/__clang_hip_runtime_wrapper.h:50:
In file included from /usr/lib64/clang/16/include/cuda_wrappers/cmath:27:
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/cmath:47:15: fatal error: 'math.h' file not found
#include_next <math.h>
              ^~~~~~~~
1 error generated when compiling for gfx1030.
nivibilla commented 1 year ago

You could try the pip install version: !pip install git+https://github.com/jllllll/exllama
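
(The leading ! is a notebook convention; from a regular shell the same install would presumably just be:)

pip install git+https://github.com/jllllll/exllama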

turboderp commented 1 year ago

Well, you're missing C headers, or the include path is configured wrong. I'm not really sure where they're supposed to go in Fedora?
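
For reference, a rough way to check both possibilities on Fedora might look like the following. The package names and compiler invocation below are just the standard Fedora/GCC ones, not anything taken from this log:

# Confirm the header packages (not just the runtime libraries) are installed
rpm -q glibc-devel libstdc++-devel || sudo dnf install glibc-devel libstdc++-devel

# Confirm the headers the build is failing on actually exist
ls /usr/include/stdlib.h /usr/include/math.h

# Dump the include search order the compiler actually uses;
# stdlib.h and math.h should live under one of the listed directories
echo | c++ -x c++ -E -Wp,-v - 2>&1 | grep -A 20 'search starts here'

If the headers exist but the search order printed above doesn't reach them, that would point at the include path rather than missing packages.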

turboderp commented 1 year ago

Has this been resolved?