OvidijusParsiunas / deep-chat

Fully customizable AI chatbot component for your website
https://deepchat.dev
MIT License
1.26k stars 170 forks

Better handling of streaming responses #147

Open chirstius opened 3 months ago

chirstius commented 3 months ago

When using streaming responses the chat window does not scroll to keep the latest response within the chat window (see attached gif: deep-chat-streaming).

Is there a setting for this I am missing? Or should it be scrolling and it's just not working on my instance?

Further, for custom models it would be nice to be able to define the message response format. Our model uses SSE to match OpenAI, which you support, but we have to convert all the SSE messages to your bare-bones format. It would be much more flexible to let the user specify, via a configuration object, which field to use for the text content, rather than forcing a custom handler. In other words, I feel this should be "configuration", not "code".

Thank you!
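(For anyone following along, a workaround sketch: until such a configuration option exists, a small mapping function can translate a custom payload into the `{text}` shape Deep Chat expects. The `data.output` field name below is a hypothetical placeholder, not anything the library or OpenAI defines.)

```typescript
// Hypothetical custom SSE payload shape - adjust to your model's actual format.
interface CustomChunk {
  data: {output: string};
}

// Maps a custom streaming chunk into the bare-bones {text} shape Deep Chat
// expects. The 'output' field name is an assumption for illustration only.
function toDeepChatFormat(chunk: CustomChunk): {text: string} {
  return {text: chunk.data.output};
}
```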

chirstius commented 3 months ago

Also... should streamed responses be coming in "one character at a time"?

I was looking at stream.ts and it seems like you're using a mechanism to add characters one at a time, but the responses appear to be arriving in chunks.

OvidijusParsiunas commented 3 months ago

Hi @chirstius.

Thank you for providing an example gif to illustrate the problem. The chat should be auto-scrolling to the bottom, so there definitely is a problem! (Note: if you manually scroll up while the stream message is being populated, auto-scroll is intentionally disabled, similar to how ChatGPT works.)

I have attempted to reproduce the problem but have had no luck. Is there any chance you could share the response text you use to reproduce it, so that I can try to re-emulate it on my end and see what is causing the problem?

OvidijusParsiunas commented 3 months ago

Also, if you are trying to create an API similar to ChatGPT's, you can define openAI in directConnection and then define your url in the request property. This will essentially override the target url endpoint. React example:

<DeepChat
  request={{
    url: 'custom-url',
  }}
  stream={true}
  directConnection={{
    openAI: {
      chat: true,
      key: 'mock-key',
    },
  }}
/>

You can also add extra headers and additionalBodyProps to help call your API with anything else that may be required.
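(A sketch of what that could look like, extending the React example above. The header and body property values here are placeholders for illustration, not values the library requires.)

```typescript
// Sketch only: placeholder auth header and an extra body property,
// merged into the request that Deep Chat sends to your endpoint.
<DeepChat
  request={{
    url: 'custom-url',
    headers: {Authorization: 'Bearer my-token'},
    additionalBodyProps: {temperature: 0.7},
  }}
  stream={true}
/>
```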

I also recommend trying to use the request and response interceptors to help augment the intercepted bodies. Though the response interceptor will be a little trickier when streaming due to multiple events being sent back.

For your final question on whether the responses should come "one character at a time": definitely not, and you should be able to respond with as many characters/words as you like. The code you looked at was likely related to streaming simulation here, and the reason we use text.split('') is to recognise each Chinese character as a unique word and consecutively populate the message with each character. This is related to the following issue.
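(As a quick illustration of why split('') works for this: JavaScript's String.prototype.split('') yields one array entry per UTF-16 code unit, which for common Chinese characters means one entry per character.)

```typescript
// split('') produces one entry per UTF-16 code unit; for characters in the
// Basic Multilingual Plane (including common Chinese) that is one entry per
// character. (Astral characters such as many emoji would be split into
// surrogate halves - a caveat, not relevant to this use case.)
const english = 'hi'.split('');   // two entries: 'h', 'i'
const chinese = '你好'.split(''); // two entries: '你', '好'

console.log(english.length, chinese.length);
```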

Let me know if you need help with the information above.

Thanks!

chirstius commented 3 months ago

I was wondering if that would work - let me give that a try and see how things behave before worrying about the rest. Great tip, thank you!

For the scrolling, I didn't touch anything - it starts for a moment, it DOES move for about a line, and then just stops. If the openAI tweak above does not sort things out, I will capture the output stream and send it over to you. Thank you for the quick response!

chirstius commented 3 months ago

I have tried using the OpenAI + request definition suggested above, but it does not seem to be working as expected. I am using Vue.

<deep-chat
    :request="{
      url: 'https://somehost/streaming?tokenize=false',
      method: 'POST',
    }"
    :responseInterceptor="
      (response) => {
        console.log(response);
        return response;
      }
    "
    :stream="true"
    :directConnection="{
      openAI: {
        key: 'not-a-real-key',
        chat: { system_prompt: 'Assist me with anything you can' },
      },
    }"
    style="border-radius: 8px"
  ></deep-chat>

The response streams but still doesn't seem to be auto-scrolling as hoped (see attached gif: no-scroll).

I added an interceptor just to see what is getting back to the browser, and despite a slight tweak I am making to "up-level" the response content, the response is a standard OpenAI SSE stream response (see attached screenshot).

Here are two sample responses for the same prompt - one from OpenAI directly, one from my OpenAI "proxy" that is re-streaming the response:

sample-response.txt openai-response.txt

Again, I am pulling the "content" field up out of the choices array for ease of consumption in some post-processing I am doing, but I don't otherwise tamper with the raw response - anything expecting fields from an OpenAI SSE response would find them where it expects.
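(For reference, each event of an OpenAI streamed chat response carries its text under choices[0].delta.content. A minimal extraction helper, a sketch rather than Deep Chat's actual code, looks like this:)

```typescript
// Simplified shape of a single OpenAI chat-completion stream event.
interface OpenAIStreamChunk {
  choices: {delta: {content?: string}}[];
}

// Pulls the incremental text out of one SSE event, or '' if the event
// carries no content (e.g. the initial role-only delta).
function extractDelta(chunk: OpenAIStreamChunk): string {
  return chunk.choices[0]?.delta?.content ?? '';
}
```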

Hopefully you can review, and I am open to any thoughts you might have on making this work. I am really impressed with what you've put together, and I was hoping it would drop right in - mostly it does, but the scrolling is a hard requirement. I'm truly not sure why it isn't working.

Also, you are correct that I was looking at the code around "simulate". If you're NOT doing that for SSE/streamed responses, do you think perhaps you could? For example, have the responses accumulate into a buffer as they arrive, allow the "per character delay" to be set via a parameter, and produce a fluid, smoothed "typing" response from the model.

Or, if you're not interested in that, can you at least point me to the right spot in the code to implement it myself, as clearly I was looking in the wrong place, haha.

Thanks!

OvidijusParsiunas commented 3 months ago

Hey @chirstius. Thank you for providing the exact responses. I will spend some time replicating the issue and hopefully solve it.

The suggestion to accumulate the stream messages and then simulate them is unfortunately too much of an edge case for the standard Deep Chat API, so I don't think I can implement it. However, devs who do want that behaviour can use the handler for streaming.
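(For anyone who wants to build the accumulate-and-simulate behaviour themselves, here is a rough sketch of the buffering side. It is deliberately independent of Deep Chat's handler signature: how the emitted characters reach the chat, e.g. via the streaming handler's callbacks, is left out.)

```typescript
// Accumulates incoming stream chunks and drains them one character per
// tick, producing a smoothed "typing" effect regardless of chunk size.
class SmoothedStream {
  private buffer = '';

  // `emit` receives each character as it is drained, e.g. a callback that
  // appends text to the chat message.
  constructor(private emit: (char: string) => void) {}

  // Call whenever a new SSE chunk arrives.
  push(chunk: string): void {
    this.buffer += chunk;
  }

  // Call on a timer, e.g. setInterval(() => stream.tick(), delayMs).
  // Emits one buffered character per call; returns false when empty.
  tick(): boolean {
    if (this.buffer.length === 0) return false;
    this.emit(this.buffer[0]);
    this.buffer = this.buffer.slice(1);
    return true;
  }
}
```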

I would like to apologise for my late responses, but it is currently a holiday where I live (UK) and I have not had much time to look at the code. Rest assured, this is one of the bigger issues and I am very keen to solve it; I will update you on this tomorrow.

chirstius commented 3 months ago

Do not apologize! Enjoy your time off! I appreciate the update. I posted a new issue asking for help with dynamic configuration that I hope you get a chance to look at, but it's not crucial! I also have a viable workaround for the scrolling for the time being - it's not great, but it works well enough.

Thanks for the pointer to the streaming handler, I may take a look at it and see if I can smooth things out a little.

Curious to see what you discover, if anything, as far as the scrolling - take care!

OvidijusParsiunas commented 3 months ago

Hi @chirstius. I have attempted to replicate the issue you had on your end using the data provided and everything appears to work correctly for me. I have also attempted to use the standard OpenAI Streamed Chat API, and everything worked for me just fine.

I have a feeling it could be something to do with your browser/machine, but before we jump to that conclusion, may I ask you to try the OpenAI Chat API with stream set to true, which enables the streamed experience? I am wondering whether scrolling will work as expected for you in that instance.

The code that controls this is on the following line. It checks whether the user has manually moved the scrollbar up; if they have, isScrollbarAtBottomOfElement will be set to false and hence prevent any further scrolling:

const isScrollbarAtBottomOfElement = ElementUtils.isScrollbarAtBottomOfElement(this._messages.elementRef);

The isScrollbarAtBottomOfElement method (here) might be doing something wrong, but I cannot tinker with it as I am unable to reproduce your problem. If all else fails, is there any chance you could fork/clone the repo, start up the component, and see if any of its code is causing the problem? (Note: you may get a few warnings in console/ts as the new version contains quite a few updates to our API, but everything should still work fine right out of the box.)
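(For context, an at-bottom check of this kind typically compares scrollTop against scrollHeight minus clientHeight, usually with a small tolerance. The sketch below illustrates the idea; it is not the library's actual implementation, and the tolerance value is an assumption.)

```typescript
// Minimal shape of the element properties the check needs.
interface ScrollMetrics {
  scrollTop: number;    // current scroll offset from the top
  scrollHeight: number; // total scrollable content height
  clientHeight: number; // visible viewport height
}

// True when the scrollbar is at (or within `tolerance` px of) the bottom.
// Without a small tolerance, sub-pixel rounding can make the check report
// "not at bottom" and silently disable auto-scroll - one plausible cause
// of the behaviour described in this issue.
function isAtBottom(el: ScrollMetrics, tolerance = 2): boolean {
  return el.scrollHeight - el.scrollTop - el.clientHeight <= tolerance;
}
```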

Much appreciated! Thanks!

chirstius commented 3 months ago

I'll take a look. To your point it's possible there's some odd interaction happening specifically in my environment. Thank you for looking into it and providing the relevant code resources. I'll be doing more work this weekend and see what I find.