robtaussig / react-use-websocket

React Hook for WebSocket communication
MIT License
1.59k stars 135 forks source link

React is jumbling up a websocket stream of data #247

Open Bryson14 opened 1 week ago

Bryson14 commented 1 week ago

Hello, I'm seeking help on why data is being jumbled, missed, and dropped when rendering from a websocket. I want to try making a ai chatbot for fun and learn about how websockets and SSE work. I have been banging my head for a few hours now because data is not appearing at it should.

Example

I have a fast api and is streaming data from openai api and sending it to a website. i have confirmed with backend logs and with the firefox websocket network tab that the data is getting to the website in order and all intact.

For example:

Q: what is your favorite color?

Expected A: "I don't have personal preferences, but I can help you find information about colors or suggest color schemes if you need assistance with that!"

Rendered A: "I have preferences but I help you find about or color schemes you need with"

Q: Tell me a joke

Expected A: "Sure, here's a joke for you:\n\nWhy couldn't the bicycle stand up by itself?\n\nBecause it was two tired!"

Rendered A: "Sure's for couldn stand by it was"

Code

import useWebSocket, { ReadyState } from "react-use-websocket";

export const ChatApiProvider: React.FC<ChatApiProviderProps> = ({
  children,
}) => { 
 const [messageHistory, setMessageHistory] = useState<ChatMessage[]>([]);
  const [isGenerating, setIsGenerating] = useState(false);
  const [input, setInput] = useState("");
  const [firstTokenReceived, setFirstTokenReceived] = useState(false);
  const messageBufferRef = useRef<string>('');
  const [editingMessage, setEditingMessage] = useState<ChatMessage | undefined>(
    undefined
  );
  const { toast } = useToast();

  const {
    sendMessage: sendWebSocketMessage,
    lastMessage,
    readyState,
  } = useWebSocket(WEBSOCKET_URL);

  /**
   * Responds to new data on the socket
   */
  useEffect(() => {
    if (lastMessage !== null) {
      try {
        const res = JSON.parse(lastMessage.data);
        handleWebSocketMessage(res);
      } catch (error) {
        console.error("Error parsing message:", error);
        setIsGenerating(false);
        toast({
          title: "Error",
          description: "Error parsing message from server",
        });
      }
    }
  }, [lastMessage]);

  /**
   * Handles each type of message on the socket
   * u/param res json message from the backend
   */
  const handleWebSocketMessage = (res: any): void => {
    switch (res?.type) {
      case "content":
        handleContentMessage(res.data);
        break;
      case "summary":
        handleSummaryMessage(res.data);
        break;
      case "finished":
        handleFinishedMessage();
        break;
      case "error":
        handleErrorMessage(res.data);
        break;
      default:
        console.warn("Unknown message type:", res?.type);
    }
  };

  const handleContentMessage = (content: string) => {
    setFirstTokenReceived(true);
    messageBufferRef.current += content;
    updateMessageHistory(messageBufferRef.current, "assistant", false);
  };

  const handleSummaryMessage = (summary: string) => {
    console.log("Getting summary");
    updateMessageHistory(summary, "assistant", true);
  };

  const handleFinishedMessage = () => {
    setFirstTokenReceived(false);
    setIsGenerating(false);
    messageBufferRef.current = "";
  };

  const handleErrorMessage = (errorMessage: string) => {
    console.error("Error from server:", errorMessage);
    setIsGenerating(false);
    toast({
      title: "Error",
      description: errorMessage,
    });
  };

  const updateMessageHistory = (
    message: string,
    role: "assistant" | "user",
    isNewMessage: boolean
  ) => {
    setMessageHistory((prev) => {
      const newHistory = [...prev];
      if (!isNewMessage && newHistory.length > 0) {
        const lastMessage = newHistory[newHistory.length - 1];
        if (lastMessage.role === role) {
          lastMessage.message = message;
          return newHistory;
        }
      }
      newHistory.push({
        id: uid(),
        message: message,
        role: role,
        date: new Date(),
      });
      return newHistory;
    });
  };

I'm just streaming data token by token from the open api chat in a content payload. then i tried to fix it by adding a summary payload which sends the entire message once it's all done. But that is not helping.

robtaussig commented 1 week ago

Can you give me an example of your top level application specifically how/where you are using ChatApiProvider? Based just off of your example:

Q: what is your favorite color?

Expected A: "I don't have personal preferences, but I can help you find information about colors or suggest color schemes if you need assistance with that!"

Rendered A: "I have preferences but I help you find about or color schemes you need with"

It looks to me like you are asking the bot the same question twice? Though it's also strange that in both of your examples, the "actual" rendered answer has typos/incorrect grammar...

There's also a few things that stand out to me in your code:

1) In your useEffect, you call handleWebSocketMessage (which itself calls/accesses a lot of functions/state) without including it and them in your dependency array. This introduces a lot of possible bugs involving closures and stale references.

2) I see where you are coming from with needing something like a messageBufferRef.current, but this is a bit of a code smell to me. When you mutate messageBufferRef.current, you are not necessarily triggering a component update, so if you are also rendering messageBufferRef.current, then it is entirely possible that your rendered text is one message behind your chat history. Anything that is used to render JSX should be from useState, as your component is forced to update whenever it's changed (correctly, via the state setter).

3) This looks questionable to me:

        const lastMessage = newHistory[newHistory.length - 1];
        if (lastMessage.role === role) {
          lastMessage.message = message;
          return newHistory;
        }

While you created a shallow clone of your messageHistory state before mutating it, lastMessage is an object in your current state that you are mutating with lastMessage.message = message; -- again, depending on how you are using messageHistory (passing it down the app through context? Are there message components that are memoized on the message prop that you are mutating (but will not register as a change to them?), this can cause a variety of bugs.

4) This won't necessarily fix your problem, but the hook returns a lastJsonMessage in addition to a lastMessage, so you don't need to parse it yourself.


My suggestions to debug this would be to start from the basics -- instead of going straight for a Provider/Consumer pattern, call useWebsocket directly by your components. A big point of the library is that it allows you to "share" a websocket connection (if you set share: true) across your app, without having to worry about context. I do understand that you might need a Provider in order to retain a source of truth formessageHistory, but for purposes of debugging, let the component that renders your messages retain the chat history (which you will obviously lose if/when that component unmounts). But the first thing that came to mind when I read your code and thought about your bug, is that you might be creating multiple chat instances, and/or web socket connections, possibly due to having multiple Providers (based on the fact that you are receiving multiple messages from the bot that seemingly address the same question, but worded differently).

I'd also add a console.log in your component body directly:

  const {
    sendMessage: sendWebSocketMessage,
    lastMessage,
    readyState,
  } = useWebSocket(WEBSOCKET_URL);
console.log(lastMessage);

This will allow you to see the order in which the messages are being passed in, without worrying about bugs that might be introduced when you add complexity like useEffect and useRef. I'd also suggest using share: true for the reasons I mentioned above.

Also, instead of your useEffect, try:

  const {
    sendMessage: sendWebSocketMessage,
    lastMessage,
    readyState,
  } = useWebSocket(WEBSOCKET_URL, {
    onMessage: (message) => {
       console.log({ message });
       handleWebSocketMessage(res);
    }
  });

That callback is invoked directly from the websocket message callback, and so it's closer to the source than lastMessage, which comes through react state. While I've never seen lastMessage be out of sync with what you've received through the websocket, it would at least eliminate one avenue of bugs.