mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

llama-cpp CPU at 1500%, very slow... What to do? #2066

Open fairyqb opened 5 months ago

fairyqb commented 5 months ago

llama-cpp CPU usage is at 1500%, and inference is very slow...

[Two WeChat Work screenshots attached, dated 2024-04-18, showing the CPU usage.]

My server: CentOS, 20 cores, 32 GB memory.
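For context: 1500% CPU on a 20-core box corresponds to roughly 15 busy threads, so one common culprit is llama.cpp spinning up more threads than the workload benefits from (llama.cpp worker threads also busy-wait during synchronization, which inflates CPU usage even when little useful work is done). Below is a minimal sketch of a LocalAI model definition that pins the thread count. The model name and GGUF file are hypothetical, and the values are illustrative only; `threads` and `context_size` are standard fields in LocalAI's YAML model config:

```yaml
# models/my-model.yaml -- illustrative sketch, not a verified config
name: my-model            # hypothetical model name
backend: llama-cpp
parameters:
  model: my-model.gguf    # hypothetical GGUF file in the models directory
# Pin threads at or below the number of physical cores;
# oversubscribing often raises CPU usage while slowing generation.
threads: 10
context_size: 2048
```

Lowering `threads` and re-testing token throughput is a cheap first experiment before digging into anything else.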

localai-bot commented 5 months ago

Hello fairyqb,

Thank you for bringing this to our attention. It sounds like you are experiencing high CPU usage with llama-cpp on your server. We understand this can be a concern and would like to help. Please provide more information about your LocalAI configuration and the commands used to run llama-cpp. It would also be helpful to see the server load and the output of `top`.

atljoseph commented 5 months ago

Do you have a GPU? If so, I’m seeing the same issue, at least on Docker. https://github.com/mudler/LocalAI/discussions/2191
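If a GPU is present but not visible inside the container, llama.cpp silently falls back to the CPU, which would match the symptoms above. Here is a hedged docker-compose sketch for exposing an NVIDIA GPU to the LocalAI container; the image tag and mount path are assumptions (check the LocalAI docs for the current CUDA build), and it presumes the NVIDIA Container Toolkit is installed on the host:

```yaml
# docker-compose.yaml -- sketch under the assumptions above
services:
  localai:
    image: localai/localai:latest-gpu-nvidia-cuda-12  # tag is an assumption
    ports:
      - "8080:8080"
    volumes:
      - ./models:/build/models   # mount path is an assumption
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Once the device is visible in the container, setting `gpu_layers` in the model YAML tells the llama-cpp backend to offload layers to the GPU instead of running everything on CPU threads.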