Error handling & JSON parsing

r0man commented 3 months ago

Hi @ahyatt,

I tried out the Vertex and OpenAI providers and ran into two issues:

The first one was that json-read-from-string was called on a plz-response object on success. On success the body of the response is already decoded, according to the media type that handled it, so we don't need this.
The other issue was with handling error objects in Vertex and Gemini. The :else callback of the json-array media type doesn't contain the error messages. The media type always streams back the objects in the JSON array via the :handlers. In case of errors, the Vertex and Gemini (I think, I don't have access to it) response is also an array, with the elements being the error objects.
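To illustrate the second issue, here is a minimal, language-neutral sketch in Python (not the Emacs Lisp code from this PR). It assumes that an error element is an object with a top-level "error" key, which is an assumption about the response shape and not something confirmed in this thread. The function name split_stream_elements is hypothetical.

```python
import json


def split_stream_elements(elements):
    """Separate error objects from regular elements of a decoded JSON array."""
    results, errors = [], []
    for element in elements:
        # Assumption: an error element is a dict with a top-level "error" key.
        if isinstance(element, dict) and "error" in element:
            errors.append(element["error"])
        else:
            results.append(element)
    return results, errors


# A failed request still yields an array; its elements are the error objects.
body = json.loads(
    '[{"error": {"code": 400, "message": "Please ensure that multiturn '
    'requests alternate between user and model."}}]'
)
results, errors = split_stream_elements(body)
```

Because the errors arrive as ordinary array elements, a per-element handler is the only place that sees them; a callback that only fires on transport or HTTP failures never would.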

The errors are now handled in the llm library, but as a user you see this message:

error in process filter: Error calling the LLM: Problem calling GCloud Vertex AI: status: 400 message: Please ensure that multiturn requests alternate between user and model.

The error itself is not important here. It came from a request failing, me trying again, and Vertex getting upset that something was out of order. But the "error in process filter:" prefix is kind of bad. I'm wondering what I should do when a function running in the process filter raises an error, and whether this is even something the plz-media-type request function can or should handle. From the point of view of the plz-media-type function everything is OK: it got an error response, parsed it correctly, and delivered the events to the handler. The handler then decided to raise an exception, which plz-media-type is unlikely to be able to handle. So maybe the handler should not raise a user-error here? Or should we catch it in the llm library and just inform the user via a message?

WDYT?

ahyatt commented 3 months ago

Thank you for catching the error on on-success. The fact that the tests didn't catch this is a bit disturbing. I'll look into why that was. For these media-types, it isn't clear to me as a client what I should be expecting on the :then call when using streaming media-types.

About errors, I think ideally you could have a new param to plz-media-type-request, something like :with-media-signal-handler, in which you can supply a function to handle a signal and its error message (probably just using the standard conventions for signal handling). It can print a message. Or it can rethrow the error, which is the default. If it rethrows the error, the connection should be closed to avoid a situation I ran into repeatedly with this, where I get hundreds of error messages causing Emacs to beep repeatedly.
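A rough sketch of that idea, written in Python purely for illustration (it is not plz-media-type code). The names Connection, run_filter, on_signal and reraise are all hypothetical. The point is only the control flow: the handler for signals defaults to rethrowing, and a rethrow closes the connection first so no further chunks produce more errors.

```python
class Connection:
    """Stand-in for a streaming connection; hypothetical, for illustration."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.closed = False

    def close(self):
        self.closed = True


def reraise(error):
    """Default signal handler: rethrow the error unchanged."""
    raise error


def run_filter(connection, handler, on_signal=reraise):
    """Feed each chunk to handler; route handler errors through on_signal."""
    processed = 0
    for chunk in connection.chunks:
        try:
            handler(chunk)
            processed += 1
        except Exception as error:
            try:
                on_signal(error)
            except Exception:
                # The signal handler rethrew: close the connection before
                # propagating, so the remaining chunks cannot trigger a
                # flood of repeated errors.
                connection.close()
                raise
    return processed


def failing(chunk):
    raise ValueError(chunk)
```

With the default, run_filter(Connection(["a", "b"]), failing) raises once and leaves the connection closed. Passing on_signal=print instead reports each error and keeps the stream open.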

This is probably worth checking with @alphapapa as well, since it should match the design philosophy of plz.

r0man commented 3 months ago

Your suggestion sounds good. I will give it a try.

r0man commented 3 months ago

@ahyatt I just added a bit more documentation. I hope it clarifies some of your doubts. Please see: https://github.com/ahyatt/llm/pull/35/commits/ec7d38ae1e677519094de8679038df32d330c993

r0man commented 3 months ago

The :then and :else callbacks will always be passed the following:

:then the plz-response struct (the body slot might be nil when the media type class uses streaming)
:else a plz-error object (which can be of kind plz-error, plz-curl-error or plz-http-error)

Everything else is done with media-type-specific handlers.

r0man commented 3 months ago

Hi @ahyatt,

I updated this PR. The OpenAI, Ollama and Vertex providers run fine with the tester.

I had to change the Ollama provider a bit, because:

The llm-chat-async method raised an error. It looks like it assumes an OpenAI error response, and it crashes because the Ollama response is different.
I saw this error: "json: cannot unmarshal object into Go struct field Message.messages.content of type string" This happened because an object, rather than the message string, was added to the prompt. The next request then failed because of this.
I increased the timeout to 2 mins. Otherwise I could not get the Ollama provider to pass the tests. I also noticed responses timing out with the Vertex provider sometimes (I'm using the Gemini 1.5 model which is a bit slow).

I left the error handling for errors that happen in the process buffer as a todo, since I'm not sure how you want to deal with this. I think it would be the same for all providers.

I did this slightly differently from what you suggested. I catch the error in the process filter, and it gets passed to the ELSE callback instead of adding an additional callback to handle errors. So the :else callback now has 3 error cases:

A curl error (timeout, DNS error, no internet, etc.). This is a plz-error that has the curl-error slot set; the response slot is nil.
An HTTP error. This is a plz-error where the curl-error slot is nil, and the response slot has a plz-response struct.
A process filter error. This is a plz-media-type-filter-error which inherits from plz-error. It has the response slot set, and a cause slot that contains the error that was raised in the process filter.
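
The three cases can be told apart by checking the slots in a fixed order. Below is a minimal sketch in Python, for illustration only; the real objects are Emacs Lisp structs, and the class names PlzError and FilterError and the function classify are hypothetical stand-ins that merely mirror the slots described above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PlzError:
    """Hypothetical mirror of a plz-error with its two slots."""
    curl_error: Optional[Any] = None
    response: Optional[Any] = None


@dataclass
class FilterError(PlzError):
    """Hypothetical mirror of the filter error: adds a cause slot."""
    cause: Optional[BaseException] = None


def classify(error):
    """Return which of the three :else cases an error object represents."""
    # Check the subtype first: a filter error also has its response slot
    # set, so it would otherwise be mistaken for an HTTP error.
    if isinstance(error, FilterError):
        return "filter"
    if error.curl_error is not None:
        return "curl"
    if error.response is not None:
        return "http"
    return "unknown"
```

The order matters: the filter error inherits from the base error and carries a response, so it has to be tested before the HTTP case.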

ahyatt commented 3 months ago

Thanks for the work on this, I'll merge this and test it out on my side.

r0man commented 3 months ago

Ok cool. Thank you!

ahyatt commented 3 months ago

I've changed how errors work, but otherwise all this looks great, thank you!

ahyatt / llm

Error handling & JSON parsing #35