amaiya / onprem

A tool for running on-premises large language models with non-public data
https://amaiya.github.io/onprem
Apache License 2.0
684 stars 32 forks

CPU limits #48

Closed lystrata closed 9 months ago

lystrata commented 9 months ago

On an 8-core system, I see that onprem pegs 4 cores at 100% while the other 4 cores are relatively inactive. Is this an intentional design limit? Or can we enable the code to make use of more cores?

amaiya commented 9 months ago

The default is to use half of your CPU cores; this default comes from llama-cpp-python. You can change it by supplying the n_threads parameter to LLM:

from onprem import LLM
llm = LLM(n_threads=desired_number_of_cores)
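If you want to use every available core, one option is to query the core count at runtime with the standard library. A minimal sketch (the `LLM` call itself is commented out here, since it assumes onprem is installed and a model is available):

```python
import os

# os.cpu_count() returns the number of logical CPUs, or None if undetermined
n_cores = os.cpu_count() or 1

# from onprem import LLM
# llm = LLM(n_threads=n_cores)  # use all detected cores instead of the default half
print(n_cores)
```

Note that on CPUs with hyper-threading, `os.cpu_count()` counts logical cores, so pinning threads to all of them may not scale linearly; benchmarking a few values of `n_threads` is worthwhile.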