Closed rihp closed 4 months ago
I'm using something like

from langchain.llms import Ollama

def ask_mistral(question, num_predict=2768, k=25, timeout=20):
    # note: k and timeout are currently unused inside the function
    ollama = Ollama(base_url='http://localhost:11434', model="mistral", num_predict=num_predict)
    response = ollama(question)
    return response
I've also set it up so that it retries after 20 seconds by killing the ollama process to restart the server (a new ollama instance automatically spawns about 2 seconds later), but that workaround only succeeds very rarely.
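The retry-after-timeout workaround described above could be sketched roughly like this. This is a minimal, hypothetical sketch, not the poster's actual code: `ask_with_retry`, its parameters, and the use of `pkill` to restart the server are all assumptions, and it presumes something (e.g. systemd) respawns ollama after it is killed.

```python
# Hedged sketch: run a query with a timeout; if it hangs, kill the local
# ollama process (assuming a supervisor respawns it) and retry.
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def ask_with_retry(ask_fn, question, timeout=20, retries=3, respawn_wait=2):
    """Call ask_fn(question); on timeout, kill ollama and try again."""
    for attempt in range(retries):
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(ask_fn, question)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            # Kill the server process; the hung HTTP call should then fail,
            # and a fresh ollama instance is assumed to come back up.
            subprocess.run(["pkill", "ollama"], check=False)
            time.sleep(respawn_wait)
        finally:
            # Don't block on the (possibly still hung) worker thread.
            pool.shutdown(wait=False)
    raise RuntimeError(f"no response after {retries} attempts")
```

You would then call it as, e.g., `ask_with_retry(ask_mistral, "hello")`.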
Updating Ollama seems to decrease the frequency of this issue. I updated with:
curl https://ollama.ai/install.sh | sh
@rihp this should have fixed the issue: https://github.com/ollama/ollama/pull/2459
Are you using v0.1.25 of ollama?
Upgrading to 0.1.25 lowered the frequency of the issue. I'll share logs later this week!
This repo is for the python client library. For server issues, please see ollama/ollama
My ollama server hangs constantly: it takes in queries and my GPU makes noise, but nothing comes back in the Jupyter environment unless I restart the ollama process a couple of times. Any idea on how to debug what might be making it just hang "thinking"? I'm on Linux, using the VS Code Insiders version.
I've set it up so that it restarts the ollama server after 20 seconds, but that only works about 10% of the time and is very time-consuming.
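One cheap way to tell whether the server process is hung outright, or is alive but slow at generation, is to probe it with a request that should always answer quickly. A minimal sketch, assuming the Ollama REST API's /api/tags endpoint (which just lists installed models) and the default base URL from the snippet above; the helper name is hypothetical:

```python
# Hedged sketch: liveness probe against the local Ollama HTTP API.
# /api/tags only lists models, so it should respond fast even when the
# model itself is busy; if this times out, the server is likely hung.
import urllib.request

def ollama_alive(base_url="http://localhost:11434", timeout=3):
    """Return True if the server answers a cheap request within timeout seconds."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure, etc.
        return False
```

Polling this before (or during) a long query would at least distinguish "server dead/hung" from "generation is just slow".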
Here is the output of
systemctl status ollama.service