restlessronin / openai_ex

Community maintained Elixir library for OpenAI API
https://hexdocs.pm/openai_ex
Apache License 2.0

Bug: Unhandled streaming API Error when max tokens are exceeded. #50

Closed: watsy0007 closed this issue 1 year ago

watsy0007 commented 1 year ago

Describe the bug

"{\n    \"message\": \"This model's maximum context length is 4097 tokens. However, your messages resulted in 58478 tokens. Please reduce the length of the messages.\",\n    \"type\": \"invalid_request_error\",\n    \"p
aram\": \"messages\",\n    \"code\": \"context_length_exceeded\"\n  }\n}\n"

To Reproduce

  1. Paste a long text (one far over the 4K-token context window)
  2. Use gpt-3.5-turbo
  3. Run the request in stream mode

Code snippets

No response
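
For reference, a minimal sketch of the steps above; the calls OpenaiEx.new/1, ChatCompletion.Req.new/1 and ChatMessage.user/1 are assumed from the v0.2.x docs, not taken from this report:

alias OpenaiEx.{ChatCompletion, ChatMessage}

openai = OpenaiEx.new(System.fetch_env!("OPENAI_API_KEY"))

# step 1: a text far over the 4K-token context window
long_text = String.duplicate("lorem ipsum dolor sit amet ", 10_000)

# step 2: gpt-3.5-turbo
cc_req =
  ChatCompletion.Req.new(
    model: "gpt-3.5-turbo",
    messages: [ChatMessage.user(long_text)]
  )

# step 3: run in stream mode; on v0.2.1 this crashes with a CaseClauseError
openai
|> ChatCompletion.create(cc_req, stream: true)
|> Enum.to_list()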

OS

MacOS

Elixir version

1.15.2

Library version

v0.2.1

restlessronin commented 1 year ago

@watsy0007 This is expected behaviour with the gpt-3.5-turbo model. The context window (the amount of input you can give it) for that model is limited to 4K tokens. There are later models with longer context windows.

If it's important, you can check the approximate number of tokens in your input before calling the API, using one of the GPT tokenizer libraries, e.g. https://github.com/LiboShen/gpt3-tokenizer-elixir.
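
A small pre-flight check along these lines would do it (a sketch only, assuming Gpt3Tokenizer.encode/1 from that library returns a list of token ids; the 4097 limit is the one from the error above):

# true when `text` fits the gpt-3.5-turbo context window, keeping
# `reserve` tokens of headroom for the completion itself
def fits_context?(text, reserve \\ 256) do
  token_count = text |> Gpt3Tokenizer.encode() |> length()
  token_count + reserve <= 4097
end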

watsy0007 commented 1 year ago

Thanks for your reply.

I followed the instructions in the notebook and used the following code:

def get_completion_stream(openai = %OpenaiEx{}, cc_req = %{}) do
  openai
  |> ChatCompletion.create(cc_req, stream: true)
  # each element is a list of SSE events; flatten them into one stream
  |> Stream.flat_map(& &1)
  # pull the delta out of the first choice of each event
  |> Stream.map(fn %{data: d} -> d |> Map.get("choices") |> Enum.at(0) |> Map.get("delta") end)
  # keep only deltas that carry content (role-only deltas are dropped)
  |> Stream.filter(fn map -> map |> Map.has_key?("content") end)
  |> Stream.map(fn map -> map |> Map.get("content") end)
end

and then I get the following error:

 #PID<0.601.0> terminating
** (CaseClauseError) no case clause matching: "{\n  \"error\""
    (openai_ex 0.2.1) lib/openai_ex/http_sse.ex:44: anonymous fn/1 in OpenaiEx.HttpSse.to_sse_data/1
    (elixir 1.15.2) lib/enum.ex:1693: Enum."-map/2-lists^map/1-1-"/2
    (openai_ex 0.2.1) lib/openai_ex/http_sse.ex:40: OpenaiEx.HttpSse.to_sse_data/1
    (openai_ex 0.2.1) lib/openai_ex/http_sse.ex:32: OpenaiEx.HttpSse.next_sse/1
    (elixir 1.15.2) lib/stream.ex:1626: Stream.do_resource/5
    (elixir 1.15.2) lib/stream.ex:943: Stream.do_transform/5
    (elixir 1.15.2) lib/stream.ex:1828: Enumerable.Stream.do_each/4
    (elixir 1.15.2) lib/enum.ex:4387: Enum.reduce/3
    (qna3_server 0.1.0) lib/qna_server_web/live/qna_demo_live.ex:177: anonymous fn/4 in QnaServerWeb.QnaDemoLive.stream_response/3
    (elixir 1.15.2) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.15.2) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<1.43800414/0 in QnaServerWeb.QnaDemoLive.stream_response/3>

I would expect the code either to behave normally or to raise some kind of RuntimeError. Is that appropriate? If it is, I can spend time submitting a pull request.

restlessronin commented 1 year ago

@watsy0007 you are correct. I misunderstood the nature of the problem.

I'm a little busy at the moment, so I'd be happy if you go ahead and try to fix it with a PR.

So far, I have tried to avoid hard-wiring any error handling into the library, preferring to leave it to user code. I would prefer to continue that approach of minimal wrapping around the bare HTTP API, so please keep that in mind.
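
In the meantime, user code can surface the failure itself, e.g. by rescuing the CaseClauseError seen above while consuming the stream. A sketch built around the get_completion_stream/2 helper from earlier in this thread; the wrapper name and error message are made up:

def get_completion_or_raise(openai, cc_req) do
  try do
    openai
    |> get_completion_stream(cc_req)
    |> Enum.to_list()
  rescue
    e in CaseClauseError ->
      raise "OpenAI streaming request failed, possibly with an API error body: #{Exception.message(e)}"
  end
end

Note that Enum.to_list/1 forces the whole stream, so this trades incremental streaming for a clean failure path.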

watsy0007 commented 1 year ago

So far, I have tried to avoid hard-wiring any error handling into the library, preferring to leave it to user code. I would prefer to continue that approach of minimal wrapping around the bare HTTP API, so please keep that in mind.

I completely agree with you. I will close this issue for now.

restlessronin commented 1 year ago

@watsy0007 did you make any progress on this bug? I have some time to work on it at the moment, and will proceed on my own if you're busy.

watsy0007 commented 1 year ago

@restlessronin not yet, I'm really looking forward to your solution. 😄

restlessronin commented 1 year ago

@watsy0007 I have published v0.2.3 to Hex with what seemed a reasonable fix for this issue. When a streaming endpoint returns an error message, openai_ex now returns an empty stream and logs the message as a warning. Since the intended use case is Livebook, the log output is visible in the debug pane, so the user gets visual feedback that there's a problem.
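
For reference, the shape of that behaviour is roughly this (a sketch, not the actual v0.2.3 source; parse_sse_events/1 is a made-up helper):

require Logger

# normal path: parse SSE event data
defp to_sse_data("data: " <> _ = chunk), do: parse_sse_events(chunk)

# error path: the endpoint sent a JSON error body instead of SSE events;
# log it as a warning and contribute no events, so callers see an empty stream
defp to_sse_data(error_body) do
  Logger.warning("OpenAI streaming error: #{error_body}")
  []
end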

What do you think? Does this seem like a reasonable fix to you?

Valian commented 10 months ago

Just FYI, it helped me. I was supplying one invalid parameter to a streaming request and couldn't see the response using 0.2.1; with your fix it was clearly visible 😉

restlessronin commented 10 months ago

@Valian thanks for taking the time to let me know. Good to learn that it was useful.