Closed r0man closed 6 months ago
Copying my comments from the commit:
@r0man A few thoughts: ideally the vendored plz would eventually be replaced with something else, and changes to plz should be kept to those strictly necessary. As much as possible, the functionality should be implemented on top of plz's existing functionality.

Hi @alphapapa,
this is a PR against the plz branch to get the ball rolling, not the main branch. The idea is to use this branch until all providers have been changed to use curl, possibly making changes to the plz-xxx files along the way, until we know better how to use plz from the llm library. Once this is ready to be merged into the main branch, we can get rid of the files and/or rename them.
Understood. Thanks.
Thanks for doing this. I agree about the concern, but the purpose here is to vet ideas, not have this code ever see the light of day. I'll merge this in.
About the continuous integration, we already have one set up, so no problems there.
If you want to help convert, that's great, but I think it's good that I convert a bunch of them, which I can start today, so I can provide feedback on your API and let you know about problems with a client perspective.
Don't worry about plz testing; the only tests I think it makes sense to run are the llm ones. You should test OpenAI with (llm-tester-all your-openai-provider) and inspect the buffer to make sure there are no instances of "ERROR".
@ahyatt Nice, I didn't know about llm-tester-all!
Ok, sounds like a plan. I'll wait for your feedback.
Maybe we can use SSE for Vertex as well. I recently found this one here: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/streaming#rest-sse
If you have a curl response dump for the other "streaming" format(s), I could look into adding support for those formats.
The SSE thing is interesting, they must have added that recently. But it doesn't support Gemini, so I'd prefer to wait until it does.
One question: do you want me to ask questions and have discussion here, or should I just make assumptions and check in fixes? For example, I just got llm-vertex embeddings to work, but had to change llm-request-plz-async to return (plz-response-body response) instead of response in the success callback. I can check in my change, but perhaps we should discuss these issues first. Let me know what makes the most sense to you.
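For concreteness, the change amounts to something like this in the success path (a sketch using plz's documented API, not the exact code from the branch; `url`, `headers`, `body`, and `on-success` are placeholders):

```elisp
;; Sketch: llm-request-plz-async success handler.  With `:as 'response',
;; plz hands the handler a `plz-response' struct; before the change the
;; whole struct was passed on, afterwards only its body.
(plz 'post url
  :headers headers
  :body body
  :as 'response
  :then (lambda (response)
          ;; pass only the decoded body to the caller's callback
          (funcall on-success (plz-response-body response))))
```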
Yes, I also had the impression SSE on Vertex does not work in all situations. I'm fine with waiting.
Please feel free to check in fixes. I'm also happy to answer questions; whichever you prefer. I'll comment on the other PR.
Hi @ahyatt
This adds plz to the LLM library, at least temporarily.
plz.el is a slightly modified version of upstream.
It adds a slot for the process object to the response struct, so we can access properties of the process in various places. In our discussion about the plz streaming PR, alphapapa suggested this at some point.
The plz function has an additional option called :process-filter that allows setting the process filter when the curl process is created with make-process. This is needed to install the filter in both the synchronous and asynchronous cases: in the asynchronous case you could do this immediately after the process is created, but in the synchronous case you can't get at the process object to install a filter.
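A minimal sketch of how the new option might look from the caller's side (the URL and filter body are placeholders; only :process-filter is the addition described here, everything else is plz's existing API):

```elisp
;; Hypothetical use of the `:process-filter' option added in this
;; branch: the lambda is installed as the curl process's filter via
;; `make-process', so each chunk can be observed as it arrives, even
;; in the synchronous case.
(plz 'get "https://example.com/stream"
  :as 'string
  :process-filter (lambda (proc chunk)
                    (message "received %d bytes" (length chunk))))
```

Presumably a custom filter installed this way still has to cooperate with plz's own output handling so that normal response parsing continues.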
Ideally we could upstream the above.
plz-media-type.el contains code that can be used to implement and customize response decoding for various media types. It ships with default media types for application/json, application/html, application/xml, and the default application/octet-stream. A media type can support "normal" and "streaming" formats.
plz-event-source.el contains a media type implementation for text/event-stream, a.k.a. server-sent events, along with an event source class and parser that follow the HTML living standard.
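For context, the wire format this parser handles is plain text: each event is a group of `field: value` lines (most importantly `event:` and `data:`) terminated by a blank line, per the HTML living standard. An illustrative, made-up stream:

```
event: message
data: {"text": "Hello"}

event: message
data: {"text": " world"}
```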
llm-request-plz.el is a copy of the llm-request module with the same functionality, but using plz instead of url-retrieve. Once we are done adding curl support to all providers, we could rename llm-request-plz to llm-request.
Some open questions and suggestions:
The plz files are copied into the llm repository from this branch: https://github.com/r0man/plz.el/tree/plz-media-type. The branch I copied the files from has tests that I did not copy over yet (should I?), because they require a running httpbin container. We can either keep iterating on those files over there (maybe cumbersome), or do it here (but then let's include the tests). Which brings me to my next suggestion:
Should we set up continuous integration? We could use https://github.com/alphapapa/makem.sh
I would suggest merging this into the plz branch; we could then both use those files to work on converting providers to use plz. I have already started on the OpenAI provider and will open a draft PR as well (it lacks streaming function calling, which I still need to understand).
If we can convince @alphapapa to accept the changes made to plz.el, I could then create a separate repository for the media type and event source extensions.
Wdyt?