Closed · bshor closed this 1 month ago
No, I don't think so. Looking through the API docs, I don't see a way to kill a running process. Even if you send a new request, I believe the previous one will simply complete before Ollama starts working on the new one. I want to have a look at load balancing soon (#17), so it might become possible to run several instances of Ollama (this would probably only make sense if you have several computers). But since several requests would then be sent to Ollama at once, it might speed up the process anyway.
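In the meantime, a partial client-side workaround may be possible with base R's `setTimeLimit()`. This is only a sketch (the wrapper name `run_with_timeout` is illustrative, not part of rollama), and it abandons the call on the R side only — Ollama will keep generating server-side until it finishes on its own:

```r
# Sketch of a client-side timeout wrapper (illustrative, not part of
# rollama). setTimeLimit() makes R raise an error once the elapsed
# limit is hit; tryCatch() turns that error into a NULL result.
run_with_timeout <- function(expr, timeout_s) {
  setTimeLimit(elapsed = timeout_s, transient = TRUE)
  on.exit(setTimeLimit(elapsed = Inf), add = TRUE)  # always clear the limit
  tryCatch(expr, error = function(e) NULL)          # NULL means "gave up"
}

# Hypothetical usage with a rollama call:
# run_with_timeout(query("annotate this text"), timeout_s = 60)
```

One caveat: time limits are only checked at points where R can process an interrupt, so a call stuck deep in non-interruptible C-level I/O may not be stopped exactly on time.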
I'm running the latest version of rollama, using "llama3:instruct" and "mistral3:instruct" models (fully updated). Ollama itself is fully updated and running on Fedora 39 Linux. R is 4.3.3. Machine is a Ryzen 7950X3D with an Nvidia 4070 GPU.
Rollama is awesome and works very well. Thank you so much for creating it.
I'm using rollama to create and run a query that returns a 30-row CSV file with four columns, and the task is repeated 5, 10, 200, n times. It usually runs fine, and a single query takes roughly 15 seconds to complete. But every once in a while, the task goes haywire and might take several minutes to complete. Here's an example. Notice the extreme discrepancy for run 33, which takes 254 s instead of roughly 15 s.
The example uses mistral but the problem occurs in llama3 as well.
The problem appears to be pretty random. I turned the temperature down to 0, which helped somewhat, but I can't make it go away entirely.
So: is there a way to issue a "kill" command to a rollama query once it exceeds a particular time threshold?