Subarasheese opened this issue 1 year ago
Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!
But... I can also be funny or helpful :smilecat: and I can generally give good tips, or point to places to look in the documentation or in the code, based on what you wrote in the issue.
Don't engage in conversation with me, I don't support (yet) replying!
Sources:
Did this ever get solved? I really want to use LocalAI for GPU models (AWQ and GPTQ) but am having zero luck.
Same
same here
same
same
Same issue here. Any resolution?
Has this been resolved?
Please add logs with LocalAI running with the debug flag ( --debug )
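For reference, debug logging can be enabled either via the CLI flag or the environment variable mentioned in this thread; a simple way to capture the output to a file (the log file name is just an example) could look like this:
```bash
# Local binary: enable debug logging and keep a copy of the output
DEBUG=true local-ai --debug 2>&1 | tee localai-debug.log

# Docker (with -e DEBUG=true set on the container): follow and save its logs
docker logs -f local-ai 2>&1 | tee localai-debug.log
```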
Thank you!
I had this issue as well until now.
I ran
DEBUG=true local-ai
after install and found that the gRPC call fails with:
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
which hinted that I was missing apt install cuda-toolkit (even though I had already run apt install nvidia-cuda-toolkit, which installs Ubuntu's CUDA 11 packages, and nvidia-smi and everything else was already working).
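For anyone hitting the same ImportError, a quick sanity check for the CUDA 12 runtime library looks roughly like this (note: the cuda-toolkit package comes from NVIDIA's apt repository on Ubuntu; other distributions differ):
```bash
# Check whether the dynamic linker can find the CUDA 12 runtime
ldconfig -p | grep libcudart.so.12 || echo "libcudart.so.12 not found"

# On Ubuntu with NVIDIA's CUDA apt repository configured, the CUDA 12
# toolkit (which ships libcudart.so.12) can be installed with:
sudo apt install cuda-toolkit

# Re-run with debug output to confirm the gRPC backend now loads
DEBUG=true local-ai
```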
I now get the following error on an NVIDIA A100:
tensor 'token_embd.weight' (q4_0) (and 0 others) cannot be used with preferred buffer type CUDA_Host, using CPU instead
which seems to be fairly harmless:
https://github.com/LostRuins/koboldcpp/issues/1223
But when running in docker, I still get the error:
user@server:/home/user$ docker run -e DEBUG=True -p 8080:8080 --name local-ai -ti -v /raid/localai/models:/build/models localai/localai:latest-aio-gpu-nvidia-cuda-12
DBG GRPC(Meta Llama 3.1 70B Instruct-127.0.0.1:38389): stderr llama-cpp-fallback: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
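libcuda.so.1 is provided by the NVIDIA driver and is normally mounted into the container by the NVIDIA Container Toolkit, so one thing to check is whether the container was started with GPU access at all; the docker run above has no --gpus flag. A sketch of the same invocation with GPU access enabled, assuming the NVIDIA Container Toolkit is installed on the host:
```bash
# Same command as above, but with --gpus all so the NVIDIA runtime
# injects the driver libraries (including libcuda.so.1) into the container
docker run -e DEBUG=true --gpus all -p 8080:8080 --name local-ai -ti \
  -v /raid/localai/models:/build/models \
  localai/localai:latest-aio-gpu-nvidia-cuda-12
```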
LocalAI version: Commit 2bacd0180d409b2b8f5c6f1b1ef13ccfda108c48
Environment, CPU architecture, OS, and Version:
CPU Architecture: x86_64, OS: Arch Linux, Version: 6.3.8-arch1-1
Describe the bug
I followed the instructions here:
https://localai.io/model-compatibility/exllama/
And got this as the output of the curl:
To Reproduce
Fresh install, then following the instructions from either of these guides (an example request is sketched after the links):
https://localai.io/model-compatibility/exllama/
https://localai.io/model-compatibility/autogptq/
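For context, the request in those guides goes against LocalAI's OpenAI-compatible chat completions endpoint; roughly like this (the model name here is a placeholder and must match the name in the model's YAML config):
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "exllama",
        "messages": [{"role": "user", "content": "How are you?"}]
      }'
```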
Expected behavior
The LLM output.
Logs
Additional context
My models dir looks like this:
exllama.yaml looks like this:
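The original directory listing and YAML contents are not included above. For anyone landing here, a minimal exllama model config along the lines of the linked guide looks roughly like the sketch below; the model folder name is a placeholder and the exact fields may differ between LocalAI versions:
```bash
# Hypothetical sketch based on the linked exllama guide; names are placeholders.
mkdir -p models
cat > models/exllama.yaml <<'EOF'
name: exllama
backend: exllama
parameters:
  model: WizardLM-7B-uncensored-GPTQ   # folder with the GPTQ model files inside models/
EOF
```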