HemulGM / DelphiOpenAI

OpenAI API wrapper for Delphi. Use ChatGPT, DALL-E, Whisper and other products.
MIT License

Issues with Streaming #4

Closed · gspears00 closed 1 year ago

gspears00 commented 1 year ago

Thank you so much for this incredible library. I am trying to use this in a console based, streaming example. I can create a Chat, and get all data back in one return message. However when I try to use streaming, I get an error. The following console code works fine. I submit my chat, and I get the entire answer back in one "event". I would like the same behavior as the ChatGPT website, so the tokens would be displayed as they are generated. My code is as follows...

var
  Buf: TStringList;
begin
  ...
  var Chat := OpenAI.Chat.Create(
    procedure(Params: TChatParams)
    begin
      Params.Messages([TChatMessageBuild.Create(TMessageRole.User, Buf.Text)]);
      Params.MaxTokens(1024);
      // Params.Stream(True);
    end);
  try
    for var Choice in Chat.Choices do
    begin
      Buf.Add(Choice.Message.Content);
      Writeln(Choice.Message.Content);
    end;
  finally
    Chat.Free;
  end;

This code works. But when I turn on streaming (by uncommenting the Params.Stream(True) line), I get an EConversionError 'The input value is not a valid Object', and the library returns 'Empty or Invalid Response'. Any ideas appreciated.

HemulGM commented 1 year ago

Because in this case the API responds not with a single JSON object, but in its own special format: a stream of server-sent events, one JSON payload per data: line.

Example

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": "\r", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": "\n", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": "1", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 2", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 3", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 4", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 5", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 6", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 7", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 8", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 9", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": ",", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: {"id": "cmpl-6wsVxtkU0TZrRAm4xPf5iTxyw9CTf", "object": "text_completion", "created": 1679490597, "choices": [{"text": " 10", "index": 0, "logprobs": null, "finish_reason": null}], "model": "text-davinci-003"}

data: [DONE]
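For reference, here is a minimal sketch of how such a stream could be consumed manually. PrintStreamTexts and the System.JSON path lookup are my own illustration, not part of this library:

uses
  System.SysUtils, System.JSON;

// Illustrative helper (not library code): extracts the "text" fields
// from a buffer of server-sent-event lines like the ones shown above.
procedure PrintStreamTexts(const Raw: string);
var
  Line, Payload: string;
  Json: TJSONObject;
begin
  for Line in Raw.Split([#10]) do
  begin
    if not Line.Trim.StartsWith('data: ') then
      Continue;
    Payload := Line.Trim.Substring(6); // drop the 'data: ' prefix
    if Payload = '[DONE]' then
      Break; // end-of-stream marker
    Json := TJSONObject.ParseJSONValue(Payload) as TJSONObject;
    if Assigned(Json) then
      try
        // Each event carries one fragment of the answer in choices[0].text
        Write(Json.GetValue<string>('choices[0].text'));
      finally
        Json.Free;
      end;
  end;
end;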
HemulGM commented 1 year ago

You can also choose not to use this mode at all: request the full answer and simply output it gradually by splitting it up manually.

gspears00 commented 1 year ago

Thank you for your response. In my sample code above, when I turn on streaming (by uncommenting the line), the OpenAI.Chat.Create call fails after a few moments. The request does reach the API, but the response apparently isn't in the format the call expects, so I never get to the code inside the try block to parse it.

HemulGM commented 1 year ago
OpenAI.Completion.CreateStream(
  procedure(Params: TCompletionParams)
  begin
    Params.Prompt(Buf.Text);
    Params.MaxTokens(1024);
    Params.Stream;
  end,
  procedure(Response: TStringStream)
  begin
    // Fired each time a chunk of raw stream data arrives
    Writeln(Response.DataString);
    Writeln('-------');
    Sleep(100);
  end);

I experimentally did this:

[screenshot of the console output]

HemulGM commented 1 year ago

The callback is fired multiple times as data arrives in the stream. Each block separated by ------- is a new event in the stream response.

gspears00 commented 1 year ago

When I try to use the OpenAI.Completion.CreateStream, I am getting an error that CreateStream is an undeclared identifier. For my Uses clause, I am using:
System.SysUtils, System.Classes, OpenAI.Completions, OpenAI.Chat, OpenAI;

Do I need to add something else? I am using Delphi 11.1

HemulGM commented 1 year ago

I haven't released it yet

HemulGM commented 1 year ago

Pushed a new method for working in stream mode

HemulGM commented 1 year ago

Example

OpenAI.Chat.CreateStream(
  procedure(Params: TChatParams)
  begin
    Params.Messages([TChatMessageBuild.User(Buf.Text)]);
    Params.MaxTokens(1024);
    Params.Stream;
  end,
  procedure(Chat: TChat; IsDone: Boolean; var Cancel: Boolean)
  begin
    // Chat holds the parsed event; Delta.Content is the new text fragment
    if (not IsDone) and Assigned(Chat) then
      Writeln(Chat.Choices[0].Delta.Content)
    else if IsDone then
      Writeln('DONE!');
    Writeln('-------');
    Sleep(100);
  end);
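
If you also need the complete text at the end, one option (my own sketch on top of the example above, assuming CreateStream blocks until the stream finishes, as the console examples suggest) is to accumulate the deltas in the same callback:

var FullAnswer := TStringBuilder.Create;
try
  OpenAI.Chat.CreateStream(
    procedure(Params: TChatParams)
    begin
      Params.Messages([TChatMessageBuild.User(Buf.Text)]);
      Params.MaxTokens(1024);
      Params.Stream;
    end,
    procedure(Chat: TChat; IsDone: Boolean; var Cancel: Boolean)
    begin
      if (not IsDone) and Assigned(Chat) then
      begin
        Write(Chat.Choices[0].Delta.Content);             // live output
        FullAnswer.Append(Chat.Choices[0].Delta.Content); // keep the whole answer
      end;
    end);
  Writeln;
  Writeln('Full answer: ', FullAnswer.ToString);
finally
  FullAnswer.Free;
end;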
HemulGM commented 1 year ago

But for the same effect of printing the response out gradually, I would still use a regular request that returns the full answer and output it word by word manually. That is much easier to do than streaming.
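For example, a minimal sketch of that approach, reusing the non-streaming call from the first post (the 50 ms delay is arbitrary):

var Chat := OpenAI.Chat.Create(
  procedure(Params: TChatParams)
  begin
    Params.Messages([TChatMessageBuild.User(Buf.Text)]);
    Params.MaxTokens(1024);
  end);
try
  // Reveal the finished answer gradually, imitating a stream
  for var Part in Chat.Choices[0].Message.Content.Split([' ']) do
  begin
    Write(Part, ' ');
    Sleep(50); // arbitrary pacing delay
  end;
  Writeln;
finally
  Chat.Free;
end;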

gspears00 commented 1 year ago

Thank you so much. This is incredible. The reason I want streaming is that for complex answers, ChatGPT can take over 60 seconds. Within my app, the user would submit a request and see nothing happen for 60 seconds. If I stream the answer, like the ChatGPT website does, the user can start reading within a few seconds, as soon as ChatGPT starts generating it.