turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License
2.67k stars 214 forks

Tried to build setup exllama but encountering ninja related errors, can someone please help me? #258

Open BwandoWando opened 10 months ago

BwandoWando commented 10 months ago

Hello everyone

I'm trying to set up exllama on an Azure ML compute. I followed the instructions at https://github.com/turboderp/exllama, but I get an error when I run this command from the setup instructions:

python test_benchmark_inference.py -d <path_to_model_files> -p -ppl

I've been trying to fix the error, but I haven't been able to. I hope someone can point me in the right direction.

Here are some parts of the error message; the complete error is much, much longer.

Thank you. I'm looking forward to fixing this issue.

Traceback (most recent call last):
  File "/anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1893, in _run_ninja_build
    subprocess.run(
  File "/anaconda/envs/exllamav2/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

...

RuntimeError: Error building extension 'exllama_ext': [1/12] c++ -MMD -MF exllama_ext.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/batch/tasks/shared/LS_root/mounts/clusters/vm-nc48ads-a100-v4/code/Users/xxxxxxx.xxxxxxx/Sprint 114/exllama/exllama_ext -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/TH -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /anaconda/envs/exllamav2/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c '/mnt/batch/tasks/shared/LS_root/mounts/clusters/vm-nc48ads-a100-v4/code/Users/xxxxxxx.xxxxxxx/Sprint 114/exllama/exllama_ext/exllama_ext.cpp' -o exllama_ext.o 
FAILED: exllama_ext.o 
c++ -MMD -MF exllama_ext.o.d -DTORCH_EXTENSION_NAME=exllama_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/mnt/batch/tasks/shared/LS_root/mounts/clusters/vm-nc48ads-a100-v4/code/Users/xxxxxxx.xxxxxxx/Sprint 114/exllama/exllama_ext -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/TH -isystem /anaconda/envs/exllamav2/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /anaconda/envs/exllamav2/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -c '/mnt/batch/tasks/shared/LS_root/mounts/clusters/vm-nc48ads-a100-v4/code/Users/xxxxxxx.xxxxxxx/Sprint 114/exllama/exllama_ext/exllama_ext.cpp' -o exllama_ext.o 
c++: error: 114/exllama/exllama_ext: No such file or directory

Here are the compute's details: https://learn.microsoft.com/en-us/azure/virtual-machines/nc-a100-v4-series

Notes:

evg-tyurin commented 10 months ago

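There is a space character in the path to exllama_ext (the "Sprint 114" directory). The -I include path in the compile command is not quoted, so it gets split at the space and c++ treats "114/exllama/exllama_ext" as a separate input file, which is the "No such file or directory" error at the end of the log.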
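To illustrate the point about the space, here is a minimal sketch that tokenizes a compile command the way a POSIX shell would, using Python's standard shlex module. The shortened path and the variable names are made up for the example, and this only mimics shell word splitting; it is not how torch's cpp_extension actually builds its command line.

```python
import shlex

# Hypothetical, shortened stand-in for the real path; only the space matters.
include_dir = "/mnt/code/Users/someone/Sprint 114/exllama/exllama_ext"

# Unquoted include path, as in the failing command from the log.
unquoted_cmd = f"c++ -I{include_dir} -c exllama_ext.cpp"

# Same command with the path shell-quoted.
quoted_cmd = f"c++ -I{shlex.quote(include_dir)} -c exllama_ext.cpp"

# shlex.split applies POSIX shell word-splitting rules.
unquoted_args = shlex.split(unquoted_cmd)
quoted_args = shlex.split(quoted_cmd)

print(unquoted_args)  # the path is broken into two arguments at the space
print(quoted_args)    # the path stays in a single -I argument


def has_whitespace(path: str) -> bool:
    """Return True if any character in the path is whitespace."""
    return any(ch.isspace() for ch in path)


print(has_whitespace(include_dir))
```

If that is what is happening here, the simplest workaround is probably to clone or move the exllama folder to a path without spaces (for example, renaming "Sprint 114" to "Sprint_114") and run the benchmark again.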

guialfaro053 commented 9 months ago

Try fixing the header files for Python. This helped me: here

qcapista commented 9 months ago

@BwandoWando did you find a workaround?