Closed dezoito closed 7 months ago
A quick local implementation of the idea above works:
However, implementing it as is changes the function signature and makes the timeout
parameter required (even if set to None
).
That would introduce a breaking change, hence why I'm not submitting a PR at the moment... suggestions?
I'm not really sure how to integrate that, as I don't really want to change the functions signatures. It would be much easier if ollama had a parameter for that, this way we could just add it to the request. That might be something to suggest ollama add.
Yeah, I gave it some more thought and think it would require coding some magic function overloading, and also that since this is not natively implemented by Ollama, there is a point for not having it in the client crate.
I might try to implement a wrapper in my Tauri app sometime.
Thank you again for your consideration!
That might be something to suggest ollama add.
Done
https://github.com/ollama/ollama/issues/2764
Curiously, ollama-py implements this, but the timeout happens at their http client and is not passed to the ollama host either.
class BaseClient:
def __init__(
self,
client,
host: Optional[str] = None,
follow_redirects: bool = True,
timeout: Any = None,
**kwargs,
) -> None:
"""
Creates a httpx client. Default parameters are the same as those defined in httpx
except for the following:
- `follow_redirects`: True
- `timeout`: None
`kwargs` are passed to the httpx client.
"""
Personally I think this is a lot cleaner to do on the client, like ollama-py does. It prevents a bunch of failure modes, such as a misconfigured reverse proxy, from causing ollama-rs to hang indefinitely.
I've since implemented it on my client application's end.
I was thinking about adding the possibility of adding a
timeout
param for completion (and possibly chat) calls.Something like:
I believe the
timeout
arg would have to be added to thegenerate()
call, and not to therequest
struct (like with the other parameters and options) because this is not an Ollama parameter, but an option for thereqwest
call.The motivation is to allow clients to give up on requests that Ollama is taking to long to handle or has a request hung due to unexpected conditions.
I'm not sure if
.timeout(Duration::from_secs(timeout.unwrap_or(u64::MAX)))
is the correct way to handle the case when the timeout is NOT passed, either... this is just an idea.Would appreciate some thoughts on this.