Open maxwell-bland opened 8 months ago
Update: this also has a back-end issue, since Ollama's HTTP server is stateful and does not handle cancelled curl requests very well. The Ollama backend should potentially be reworked to pipe to `ollama run` directly.
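The pipe-based approach could look roughly like this. This is a minimal Python sketch, not the plugin's actual Lua code; the class name is hypothetical, and it assumes `ollama run <model>` accepts a prompt on stdin and streams text on stdout:

```python
import subprocess

class PipedModel:
    """Sketch: drive a model process over pipes, so cancelling an in-flight
    request is just killing the child process (no stateful HTTP server)."""

    def __init__(self, cmd):
        self.cmd = cmd  # e.g. ["ollama", "run", "codellama"] (assumed invocation)
        self.proc = None

    def request(self, prompt):
        # Kill any in-flight generation before starting a new one, so
        # requests never pile up.
        self.cancel()
        self.proc = subprocess.Popen(
            self.cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        self.proc.stdin.write(prompt)
        self.proc.stdin.close()
        return self.proc.stdout  # caller streams the output incrementally

    def cancel(self):
        if self.proc is not None and self.proc.poll() is None:
            self.proc.kill()
            self.proc.wait()

# Example with a stand-in command (`cat` echoes its input) instead of ollama:
m = PipedModel(["cat"])
print(m.request("def add(a, b):").read())
m.cancel()
```

The key property is that cancellation becomes a single `kill()` on the child, instead of trying to abort an HTTP request mid-flight.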
@maxwell-bland waiting for you to address my comments. 10x
@tzachar I'm not sure they went through; shouldn't the pull request (below) highlight the review changes?
I need to fix Ollama's HTTP server setup to add proper request cancellation so it doesn't get flooded, but I don't have much time just yet; hopefully this weekend.
@maxwell-bland You did not push any more changes to your branch.
@tzachar sorry if it was not clear: your comments did not get posted, as far as I can see. What is your opinion on the change? Were there some comments in the code that I missed?
I think it was fine to use curl in the short term, but it would probably be better to switch to a subprocess that communicates via pipes, since that would reduce latency and make requests easier to manage. I will update this commit once I have some more time.
I looked at adding timeout/session management to ollama yesterday but it will be a bit of a slog.
@maxwell-bland you can see my comments above in this thread, with a pending badge.
@tzachar apologies, I see no "View reviewed changes" similar to my own comment above; it is potentially just GitHub being unintuitive. Maybe you could leave your comments directly on the pull request? I checked the inbox on the site and cannot find the comments anywhere.
All my comments are included inside GitHub's review process. You can access it from the top of the pull request (https://github.com/tzachar/cmp-ai/pull/14/files/f4877c51e2c4354dea5eb3ffa95530b481b7c8d3). Search for "View reviewed changes".
@tzachar if the comments show as pending, that means they haven't been sent yet. If you write comments as part of a PR review (and not just as individual comments), you need to submit the review before they're posted: you choose Comment, Approve, or Request changes.
There's a guide about the process here: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/reviewing-proposed-changes-in-a-pull-request
Thanks @alexandradeas My bad.
Supports proper streaming of inputs via curl and incremental building of autocomplete suggestions for Ollama. This is necessary on lower-end or non-GPU machines. Additionally, kills any previous curl call to ensure that multiple curls are not fired in a row during autocomplete, which overwhelms the Ollama server.
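The cancel-then-stream behaviour described above can be sketched as follows. This is a hedged Python sketch rather than the plugin's actual Lua implementation; it assumes Ollama's `/api/generate` endpoint streaming newline-delimited JSON with `response`/`done` fields, and the model name is only an example:

```python
import json
import subprocess

_current = None  # handle to the in-flight curl, if any

def cancel_inflight():
    """Kill the previous curl so requests don't pile up on the Ollama server."""
    global _current
    if _current is not None and _current.poll() is None:
        _current.kill()
        _current.wait()

def collect_stream(lines):
    """Incrementally build one suggestion from an NDJSON stream of chunks."""
    pieces = []
    for line in lines:
        chunk = json.loads(line)          # one JSON object per line
        pieces.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(pieces)

def complete(prompt):
    """Kill any previous request, then stream a new completion via curl."""
    global _current
    cancel_inflight()
    _current = subprocess.Popen(
        ["curl", "-s", "--no-buffer", "http://localhost:11434/api/generate",
         "-d", json.dumps({"model": "codellama", "prompt": prompt})],
        stdout=subprocess.PIPE, text=True)
    return collect_stream(_current.stdout)
```

Parsing chunk by chunk is what makes the incremental suggestion building possible on slow machines, since partial output is usable before generation finishes.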
Needs testing on non-ollama services (I have no access).
Thanks! Maxwell