shafiqalibhai opened this issue 5 months ago
Seconded. With big models the initial load into Ollama sometimes times out and you have to re-submit your prompt; once the model is "warmed up" it's fine.
Maybe the program could check whether Ollama is still running and just taking a long time to respond, instead of giving up on a fixed timeout.
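For example (a rough sketch, not this project's code), the public Ollama HTTP API already exposes enough to pre-warm a model and to tell "still loading" apart from "not responding". The base URL and model name below are placeholders:

```python
# Hedged sketch using Ollama's documented HTTP API: /api/generate with no
# prompt loads the model into memory, and /api/ps lists currently loaded
# models. Base URL and model name are example values.
import requests

OLLAMA = "http://localhost:11434"
MODEL = "mixtral:8x22b"  # substitute your own model

# A generate request with no prompt asks Ollama to load the model and keep it
# resident for the keep_alive duration, so the first real prompt is fast.
requests.post(
    f"{OLLAMA}/api/generate",
    json={"model": MODEL, "keep_alive": "30m"},
    timeout=600,  # generous: loading a very large model can take minutes
)

# Polling /api/ps lets a client distinguish "Ollama is still loading the
# model" from "Ollama is down" before declaring a timeout.
loaded = requests.get(f"{OLLAMA}/api/ps", timeout=5).json()
print([m["name"] for m in loaded.get("models", [])])
```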
Totally agree with this. The timeout needs to be increased.
This is important for folks with low-end hardware. I agree, it should be in the settings.
It's important even on high-end hardware if you're using a giant model. Sometimes the initial model load times out and you have to resubmit, after which it works.
100%. I'm running a 16-core EPYC as my LLM machine, and it really chugs trying to load Mixtral 8x22B, even when loading from NVMe into RAM.
+1
+1. Running on a Xeon 2690 v4 in an Alpine Linux VM on Proxmox.
Can confirm that this is still a problem. The first attempt to use llama3.1:70b on an M1 Max laptop times out waiting for a response. If I "edit" the question and resubmit, it works fine.
I have the same problem with larger models on my machine. An adjustable timeout setting would be awesome.
Second this. I am getting cut-off responses with my remote Ollama setup. Since other clients work fine, I suspect it's caused by the timeout.
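If those cut-offs really are a client-side read timeout, streaming is one workaround: with `"stream": true` the timeout applies to each chunk rather than the whole response. A rough sketch against the standard Ollama /api/generate endpoint (host and model are placeholders):

```python
# Hedged sketch: stream the response so a slow generation never has to fit
# inside a single long read timeout. The newline-delimited JSON chunks with
# "response" and "done" fields follow the public Ollama API.
import json
import requests

with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1:70b", "prompt": "Why is the sky blue?", "stream": True},
    stream=True,
    timeout=(5, 60),  # 5 s to connect, 60 s allowed between chunks
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
        if chunk.get("done"):
            break
```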
Maybe add a setting to override the default timeout.
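Something along these lines is what I have in mind, as a minimal sketch rather than the app's actual code. The environment variable name `OLLAMA_REQUEST_TIMEOUT` and the 120-second default are assumptions, not an existing feature:

```python
# Minimal sketch of a user-configurable request timeout. OLLAMA_REQUEST_TIMEOUT
# and its default value are hypothetical names for illustration only.
import os
import requests

OLLAMA = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
TIMEOUT = float(os.environ.get("OLLAMA_REQUEST_TIMEOUT", "120"))

def ask(model: str, prompt: str) -> str:
    """Send a non-streaming generate request, honoring the configured timeout."""
    resp = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=TIMEOUT,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("llama3.1:70b", "Say hello."))
```

That way people on slow hardware or with huge models could just raise the value instead of re-submitting the first prompt.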