c0sogi / llama-api

An OpenAI-like LLaMA inference API
MIT License

High RAM and CPU usage #27

Open delta-whiplash opened 8 months ago

delta-whiplash commented 8 months ago

[image] When I run a model on my GPU, my CPU and RAM usage is insanely high.