thmsmlr / instructor_ex

Structured outputs for LLMs in Elixir
https://hexdocs.pm/instructor

json response error: gpt-4-vision-preview #36

Closed petrus-jvrensburg closed 2 months ago

petrus-jvrensburg commented 3 months ago

Currently I'm getting the following error when trying to run the GPT4-Vision livebook example:

** (MatchError) no match of right hand side value: {:error, "Invalid JSON returned from LLM: %Jason.DecodeError{position: 0, token: nil, data: \"\"}"}

Any idea what might be causing it?

petrus-jvrensburg commented 3 months ago

It looks like the request was tripping on the stop sequence "```": the model now starts its response with that sequence, so the output is cut off before any JSON is produced. This is probably due to a recent model update, because I don't see an obvious change in the code since it last worked.
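To illustrate the failure mode described above, here is a minimal sketch of how cutting a response at a stop sequence can leave an empty string, which matches the `data: ""` in the reported error. `StopSequenceDemo` and `truncate_at_stop/2` are hypothetical names used only for this illustration; they are not part of instructor_ex, and the real truncation happens on the API side.

```elixir
defmodule StopSequenceDemo do
  # Mimics an API that cuts the completion at the first stop sequence:
  # everything from the stop sequence onward is dropped.
  def truncate_at_stop(text, stop) do
    text
    |> String.split(stop, parts: 2)
    |> hd()
  end
end

# Three backticks, built programmatically to keep this snippet easy to embed.
stop = String.duplicate("`", 3)

# The model opens with a code fence, so nothing survives the cut.
fenced = stop <> "json\n{\"name\": \"cat\"}\n" <> stop
IO.inspect(StopSequenceDemo.truncate_at_stop(fenced, stop))
# prints ""

# The model starts with the JSON itself, so the JSON survives.
plain = "{\"name\": \"cat\"}\n" <> stop
IO.inspect(StopSequenceDemo.truncate_at_stop(plain, stop))
# prints "{\"name\": \"cat\"}\n"
```

Decoding the empty string in the first case is what would fail at position 0.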

zvlasic commented 3 months ago

Yeah, I have the same problem. Interesting thing is that it sometimes works, but very rarely.

TwistingTwists commented 3 months ago

@petrus-jvrensburg I've run into that a lot. One thing that worked surprisingly well for me was the prompt that the Marvin library uses when it interacts with OpenAI's vision API.

marvin prompt link

The actual prompt I ended up using, which works well (failure rate of about 1 in 10):

"""
  You are partnering with another AI to complete an objective. The
  objective is written below EXACTLY as it will be shown to the other AI.
  However, the other AI can not process images. YOUR job is to examine
  these images and produce a succinct response that contains any
  image-based information relevant to the objective. You should take all
  other aspects of the objective into account, but your only
  responsibility is to translate the images into relevant data. 

  Do not tell the other AI what to do or return, as it will get confused.
  Just return a description of the image that contains any detail the
  other AI can use to generate its own response. You may be as succinct as
  possible.

    DO NOT add any detail to the above prompt, just use it AS-IS.

  Here is the objective, verbatim:

  """

Make it the system prompt, or put it before the image in the messages.
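As a rough sketch of the "system prompt" option, the message list could be assembled like this. `VisionMessages.build/3` is a hypothetical helper, not part of instructor_ex; the map shapes assume the OpenAI chat format for image inputs (`type: "image_url"`), and appending the objective to the prompt is an assumption based on the prompt ending with "Here is the objective, verbatim:".

```elixir
defmodule VisionMessages do
  # Hypothetical helper: puts the image-description prompt (plus the
  # objective it introduces) in the system message, then the image in
  # the user message.
  def build(system_prompt, objective, image_url) do
    [
      %{role: "system", content: system_prompt <> "\n" <> objective},
      %{
        role: "user",
        content: [
          %{type: "image_url", image_url: %{url: image_url}}
        ]
      }
    ]
  end
end

messages =
  VisionMessages.build(
    "You are partnering with another AI to complete an objective. ...",
    "Extract the product name from the label.",
    "https://example.com/label.png"
  )

IO.inspect(messages)
```

The resulting list would be passed as the `messages` of the request; the objective and URL above are placeholders.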

Besides using the above prompt, retrying 2-3 times reliably produced valid output in my runs. So I wrote a little custom retry logic to handle

** (MatchError) no match of right hand side value: {:error, "Invalid JSON returned from LLM: %Jason.DecodeError{position: 0, token: nil, data: \"\"}"}

This way, I completed my task on all 150 images (100%), so a combination of the two is what I would recommend.
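The custom retry logic itself isn't shown in this thread; a minimal sketch of the idea, assuming the call returns `{:ok, result}` or `{:error, reason}`, might look like the following. `Retry.with_retries/2` is a hypothetical name.

```elixir
defmodule Retry do
  # Calls `fun` up to `attempts` times and returns the first {:ok, _}.
  # If every attempt fails, the last {:error, _} is returned.
  def with_retries(fun, attempts) when is_function(fun, 0) and attempts > 0 do
    case fun.() do
      {:ok, _} = ok -> ok
      {:error, _} = error when attempts == 1 -> error
      {:error, _} -> with_retries(fun, attempts - 1)
    end
  end
end

# Stand-in for an LLM call that fails twice before succeeding.
{:ok, counter} = Agent.start_link(fn -> 0 end)

flaky_call = fn ->
  n = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
  if n < 3, do: {:error, "Invalid JSON returned from LLM"}, else: {:ok, %{attempt: n}}
end

IO.inspect(Retry.with_retries(flaky_call, 3))
# prints {:ok, %{attempt: 3}}
```

In practice the anonymous function would wrap the real completion call, and matching on `{:error, _}` in a `case` avoids the `MatchError` raised by a bare `{:ok, result} = ...` match.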

TwistingTwists commented 3 months ago

cc @thmsmlr: Do you think we should add the above prompt to the docs? Original source of the prompt: Marvin AI

TwistingTwists commented 3 months ago

Surprisingly, one more very important thing to note: the following line is an important addition to the given prompt:

   DO NOT add any detail to the above prompt, just use it AS-IS.

TwistingTwists commented 3 months ago

One more fine addition to this was:

  1. Change instructor.ex to return the Jason.DecodeError.
  2. Pattern match on that and make a second API call asking GPT to reformat the output as JSON.

It works reasonably well.

This is the approach that Marvin also uses.

I'll raise a relevant PR sometime soon.
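A rough sketch of that two-step flow, with the decoder and the second API call injected as functions so the snippet stays self-contained. `JsonRepair.decode_with_repair/3` is a hypothetical name; in real use `decode` might be `&Jason.decode/1` and `reformat` a function that makes the second LLM request. Neither reflects what instructor.ex currently does.

```elixir
defmodule JsonRepair do
  # `decode` stands in for a JSON decoder returning {:ok, _} | {:error, _}.
  # `reformat` stands in for a second LLM call that rewrites raw text as JSON.
  def decode_with_repair(raw, decode, reformat)
      when is_function(decode, 1) and is_function(reformat, 1) do
    case decode.(raw) do
      {:ok, _} = ok ->
        ok

      {:error, _decode_error} ->
        # Single repair pass: ask for reformatted text, then decode once more.
        raw |> reformat.() |> decode.()
    end
  end
end

# Stubs for illustration only: "valid JSON" here just means "starts with {".
decode = fn text ->
  if String.starts_with?(text, "{"), do: {:ok, text}, else: {:error, :invalid}
end

reformat = fn _raw -> "{\"animal\": \"cat\"}" end

IO.inspect(JsonRepair.decode_with_repair("The image shows a cat.", decode, reformat))
# prints {:ok, "{\"animal\": \"cat\"}"}
```

The repair is attempted only once, so a second decode failure is returned to the caller as an error tuple.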

petrus-jvrensburg commented 3 months ago

I'd be keen on this in a PR :)

TwistingTwists commented 3 months ago

@petrus-jvrensburg : Very soon!

petrus-jvrensburg commented 2 months ago

FYI - example is fixed in 75f1a23