turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

undefined symbol: _ZN3c104cuda9SetDeviceEi #457

Open icivi opened 1 month ago

icivi commented 1 month ago

```
----> 4 gptq_model = exllama_set_max_input_length(gptq_model, max_input_length=7504)

/usr/local/lib/python3.10/dist-packages/auto_gptq/utils/exllama_utils.py in exllama_set_max_input_length(model, max_input_length)
     15
     16     # The import is set here to avoid a global import. Arguably this is quite ugly, it would be better to have lazy loading.
---> 17     from exllama_kernels import cleanup_buffers_cuda, prepare_buffers
     18
     19     if not model.quantize_config.desc_act:

ImportError: /usr/local/lib/python3.10/dist-packages/exllama_kernels.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
```
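The missing symbol is a mangled C++ name from PyTorch's own C++ library (`c10`), which is why a wheel built against a different PyTorch fails to load. A minimal sketch of decoding it yourself with libstdc++'s `__cxa_demangle`, assuming a Linux system where `libstdc++` can be loaded via ctypes:

```python
import ctypes
import ctypes.util

# Load libstdc++ and bind __cxa_demangle, which turns a mangled C++
# symbol name back into human-readable form.
libstdcxx = ctypes.CDLL(ctypes.util.find_library("stdc++"))
demangle = getattr(libstdcxx, "__cxa_demangle")
demangle.restype = ctypes.c_void_p  # returns a malloc'd char*
demangle.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                     ctypes.c_void_p, ctypes.POINTER(ctypes.c_int)]

status = ctypes.c_int()
addr = demangle(b"_ZN3c104cuda9SetDeviceEi", None, None, ctypes.byref(status))
demangled = ctypes.cast(addr, ctypes.c_char_p).value.decode()
print(demangled)  # c10::cuda::SetDevice(int)
```

So the extension expects `c10::cuda::SetDevice(int)` to exist in the PyTorch libraries it links against; if the installed PyTorch doesn't export that exact symbol, the import fails as above.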

turboderp commented 1 month ago

You're using a prebuilt wheel that doesn't match your PyTorch version. The 0.0.21 wheels are built for PyTorch 2.3.0, and I've included some 2.2.0 versions for Windows. If you're on a different version, you'll have to install the JIT version or build from source (either option requires the CUDA Toolkit to be installed).