OvidijusParsiunas / deep-chat

Fully customizable AI chatbot component for your website
https://deepchat.dev
MIT License
1.43k stars 218 forks

Assume partial responses with websocket connection mode #112

Closed: alexeyeryshev closed this issue 7 months ago

alexeyeryshev commented 7 months ago

Hey @OvidijusParsiunas, thank you for doing this; it looks very cool!

While playing with some examples, I noticed some unexpected behavior (issue?) with the websocket connection mode. It doesn't look like the websocket handler can assume "partial responses" the way streaming mode does. I have the following server code in FastAPI:

from fastapi import WebSocket, WebSocketDisconnect

# app, chat, DeepChatRequest, DeepChatResponse and to_langchain_message
# are defined elsewhere in the application
@app.websocket("/api/deepchat/v1/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            body_dict = await websocket.receive_json()
            deep_chat_request = DeepChatRequest(**body_dict)
            messages = list(map(to_langchain_message, deep_chat_request.messages))
            text = ""
            async for chunk in chat.astream(messages):
                text = chunk.content
                await websocket.send_text(
                    DeepChatResponse(text=text).json(exclude_unset=True)
                )
    except WebSocketDisconnect:
        pass

and the following client

import { DeepChat } from "deep-chat-react";

<DeepChat
  request={{
    url: "ws://localhost:8000/api/deepchat/v1/ws",
    websocket: true,
  }}
  onNewMessage={(body) => {
    console.log(body);
  }}
/>

Example of messages that server is sending

[image attachment]

When sending partial websocket messages, I would expect them to simply be appended to the same message bubble; instead, every chunk seems to be rendered as a new message. Please see a demo below:

https://github.com/OvidijusParsiunas/deep-chat/assets/7483130/8c691a37-4b43-4bfd-9c23-d8330f7b41d0

alexeyeryshev commented 7 months ago

Btw, I could help implement / fix this with some pointers on where to start!

OvidijusParsiunas commented 7 months ago

Hi @alexeyeryshev. By default, the websocket functionality assumes that each message that is sent back to Deep Chat is complete. In your case - it appears you are looking for a stream-like experience with websockets.

There are a number of ways you can tackle this:

  1. The simplest approach is, instead of sending each individual chunk, to send the total text so far and overwrite the previous message. So instead of doing text = chunk.content, you can do text += chunk.content, and when you send a message, include the overwrite: true property in the message body. See here.

  2. Another way would be to return complete messages and simulate a stream. Having looked at your server code, it appears you are using the following code to split messages into chunks:

    async for chunk in chat.astream(messages):
        text = chunk.content
        await websocket.send_text(
            DeepChatResponse(text=text).json(exclude_unset=True)
        )

    Could you change the code to avoid the chunking, or build the complete text using text += chunk.content and send it to Deep Chat as a whole? If this can be achieved, all you will need to do in the Deep Chat component is add the following:

    stream={{simulation: true}}

    You can change the simulation intervals by assigning it with a number e.g. simulation: 10. See more in stream.

  3. The third approach is a little more complicated and involves the use of a handler function to control the connection yourself. Similar to the first approach, the text from each chunk is aggregated and the overwrite property is used to overwrite the previous message. Additionally, you will need some sort of special stop keyword sent from the server to indicate that the stream has ended, so that the text can be reset. The following is a simplified example; please refer to the Websocket section of the handler documentation:

    request={{
      websocket: true,
      handler: (_, signals) => {
        const websocket = new WebSocket('custom-url');
        let text = '';
        websocket.onopen = () => {
          signals.onOpen(); // enables the user to send messages
        };
        websocket.onmessage = (message) => {
          const response = JSON.parse(message.data);
          if (response.text === 'stop keyword') {
            text = ''; // reset for the next message
          } else {
            text += response.text;
            signals.onResponse({text, overwrite: true}); // overwrites the previous message bubble
          }
        };
      },
    }}
  4. Instead of sending each chunk in the above example and overwriting the last message, you can also wait for the full message to be aggregated and send it all in one response. To simulate a stream, use the stream={{simulation: true}} property mentioned previously in point 2.
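The overwrite flow from approaches 1 and 3 can be sketched independently of any web framework. The helper below is hypothetical (not part of Deep Chat or FastAPI); it shows the sequence of JSON frames a server would produce, each carrying the full text so far plus the overwrite: true property described above. In the FastAPI endpoint, each frame would go out via await websocket.send_text(frame):

```python
import json

def overwrite_frames(chunks):
    """Aggregate streamed chunks into one growing text and emit a JSON
    frame per chunk. Every frame carries the full text so far plus
    overwrite=True, so the client replaces the previous message bubble
    instead of appending a new one."""
    text = ""
    for chunk in chunks:
        text += chunk
        yield json.dumps({"text": text, "overwrite": True})

# hypothetical chunks standing in for `async for chunk in chat.astream(...)`
frames = list(overwrite_frames(["Hel", "lo ", "world"]))
```

Note the bandwidth trade-off: because each frame repeats the aggregated text, the payload size grows with the length of the response.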

Let me know if you need any further assistance. Thanks!

alexeyeryshev commented 7 months ago

Hey, @OvidijusParsiunas, thanks for your swift response!

  1. I tried this approach, and while it works, it's not optimal bandwidth-wise, as we keep re-sending the same content over the wire, over and over again.
  2. (4) While I could do this, it doesn't seem right to me: what I have on the backend is a true stream, and waiting for it to complete before sending it over the wire and then simulating a streaming response doesn't seem like a good solution!
  3. It seems like the best option for the moment, as it allows us to keep the stream all the way to the front end.

I believe that, conceptually, it should not matter which protocol your events arrive over (SSE, websockets, Socket.IO, ...); deep-chat could treat them as complete or partial either way, allowing a stream-like experience.

Thanks again for your guidance!

OvidijusParsiunas commented 7 months ago

Hi @alexeyeryshev. I thought about this a little further, and whilst there are a couple of workarounds to achieve streaming via websockets, I guess this can be a rather common use-case for some apps. In light of this, I have made a slight change to the API to facilitate it.

You would first connect to the websocket API as normal; additionally, to let Deep Chat know that this is a stream, I have made a change to the simulation property, which now accepts a string value. This value is the stop key that signals to Deep Chat that the stream for a particular message has finished.

Example:

<DeepChat
  request={{url: "ws://localhost:8000/api/deepchat/v1/ws", websocket: true}}
  stream={{simulation: "stop-key"}}
/>

In the example above - the stream for each message will end when the server sends {text: "stop-key"}.
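To make the new contract concrete, here is a framework-agnostic sketch (stream_frames is a hypothetical helper) of the frames a server would send for one message under this API: one frame per chunk, followed by the stop-key frame. In the FastAPI endpoint from earlier, each frame would be sent with await websocket.send_text(frame):

```python
import json

STOP_KEY = "stop-key"  # must match stream={{simulation: "stop-key"}} on the client

def stream_frames(chunks, stop_key=STOP_KEY):
    """Yield one JSON frame per chunk, followed by the stop-key frame.
    Deep Chat appends each text frame to the same message bubble and
    ends the stream for that message when it receives the stop key."""
    for chunk in chunks:
        yield json.dumps({"text": chunk})
    yield json.dumps({"text": stop_key})

# hypothetical chunks standing in for `async for chunk in chat.astream(...)`
frames = list(stream_frames(["Hel", "lo ", "world"]))
```

Unlike the overwrite workaround, each chunk is sent only once, so the bandwidth usage matches a true stream.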

I have also made a change to allow files in streams, so you will be able to respond with any content you need.

All of these changes are available in the deep-chat-dev and deep-chat-react-dev packages, version 9.0.124. These packages are exactly the same as the main ones; only their names differ.

Let me know if this works for you. Thanks!

OvidijusParsiunas commented 7 months ago

This has now been deployed in Deep Chat version 1.4.11. See stream for the updated documentation.