Open kroryan opened 1 week ago
I do have an idea. Indeed, I am using Tokio and do not use worker threads. I never needed more cores for the few servers this bot is on, as the heavy lifting is done by Ollama.
You may need to enable a feature in Tokio (I guess multi-thread) and replace the main macro with:
#[tokio::main(flavor = "multi_thread", worker_threads = 8)]
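For reference, here is a minimal sketch of what the full setup could look like. The dependency line is an assumption (it matches Tokio 1.x, where the multi-threaded runtime lives behind the `rt-multi-thread` feature); check it against the Tokio version the bot actually pins.

```rust
// Cargo.toml (assumption: Tokio 1.x):
// [dependencies]
// tokio = { version = "1", features = ["rt-multi-thread", "macros"] }

// With the feature enabled, the runtime starts 8 worker threads,
// so spawned async tasks can run on multiple cores in parallel.
#[tokio::main(flavor = "multi_thread", worker_threads = 8)]
async fn main() {
    // bot startup code goes here
}
```

Note that this only parallelizes the bot's own async work; if the time is spent inside Ollama, more Tokio workers will not speed it up.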
unfortunately it doesn't seem to work :(
thanks so much for answering me,
I'll seize the opportunity to ask you something I'm not really sure how to do: how can I change the model the bot is using? Sorry about this last question.
You just have to change the model in the modelfile ;)
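As an illustration, a minimal Ollama Modelfile looks something like this (the model name and system prompt here are placeholders, not what this bot actually ships with):

```
# Modelfile
FROM mistral

# optional: give the bot a persona
SYSTEM "You are a helpful chat bot."
```

You would then rebuild the model with `ollama create <name> -f Modelfile` and point the bot at the new name; the exact flow depends on how this bot loads its model.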
unfortunately it doesn't seem to work :(
Do you experience performance issues?
It's not that it isn't working properly; it's that it only uses half of the cores, so it takes longer to produce the answer. I don't use very powerful hardware (Orange Pi 5, 32 GB), so instead of taking 100 seconds it takes 200.
This is purely a front-end to Ollama. It is not resource intensive.
The part you need to optimize (more core, accelerator, whatever) is Ollama.
But as I said before, it only happens with this bot; I have other bots and they work fine with Ollama.
Which model do they use? Try something smaller than a 7B q4 model. That is what causes the difference.
So when I use this bot it only uses half of the cores, while another bot I have uses all of them. Any idea why?
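One quick way to narrow this down is to check how many cores the bot's process can actually see. This std-only sketch (not part of the bot itself) prints the parallelism the OS reports; if it shows all your cores, the limit is in the runtime or in Ollama's configuration rather than in the system:

```rust
use std::thread;

fn main() {
    // Cores the OS makes available to this process; a single-threaded
    // runtime will not use them all regardless of this number.
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    println!("available cores: {}", cores);
}
```

Running this on the Orange Pi inside the same environment as the bot (same container or service unit, if any) rules out a cgroup or affinity limit.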
thanks for the bot