deiucanta / chatpad

Not just another ChatGPT user-interface!
https://chatpad.ai
GNU Affero General Public License v3.0
1.33k stars, 252 forks

Streaming doesn't work when there are more than a few conversations #79

Open avelican opened 11 months ago

avelican commented 11 months ago

Edit: The issue affects all browsers. I thought it was Brave-only because Brave is my main browser, so its database had the most messages and therefore the most lag.


Hello. This might be a bug in Brave, but I haven't had this issue with any other streaming LLM web interface, so I thought it was worth reporting. Streaming doesn't work in Brave: it prints either the first token or nothing, freezes for the next 20 seconds, and then prints the whole result, using 100% CPU the whole time. I'm on the latest Brave: Version 1.57.47, Chromium 116.0.5845.96 (Official Build) (64-bit).

The same issue occurs with Brave Shields down and all extensions disabled.

I think I've seen streaming work in Brave a few times, so the issue may not occur every time. (I installed it locally to see if that would help, and it apparently did, but then it went back to being extremely laggy.)

avelican commented 11 months ago

Update: Streaming works when there are very few conversations. Clearing data fixes the problem. Will investigate further.

avelican commented 11 months ago

Update: Exported data and tested in other browsers. Issue is not specific to Brave but occurs everywhere.

avelican commented 11 months ago

Update: The issue seems to be due to IndexedDB being slow. (In the profiler, most of the CPU time is spent on "System".) I tested updating the DB only on every 3rd message, which "solves" the problem in the sense that it still lags horribly (it updates once per paragraph), but at least it's "kind of" streaming again now.

(What confuses me is that 50% of the CPU time is still Idle, so I don't understand why it already lags so hard.)

I don't see an easy fix here. These are the options that I see:

  1. Use a different DB backend. If IndexedDB is performing so poorly, perhaps a better option exists. (I don't know enough about this area to comment further, and it seems like a lot of trouble to change the whole DB backend...)
  2. Commit messages to DB less frequently?

However, as far as I can tell, the message components display messages by loading them from the database. Therefore, saving less frequently == rendering less frequently, i.e. it creates artificial lag! Bad UX!

How often should we save to the DB? Even if we alter the app to use some local cache for the message and save only when a message finishes streaming, this complicates knowing when the message is "finished". The naive solution is to wait for getStreamContent to get isFinal == true. But if the network transmission fails, that final message will never arrive, and we still want to save what was received...
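One way around the "when is it finished?" problem is to treat a dropped connection as a second way of finishing, and to save whatever has arrived in both cases. Below is a minimal sketch under that assumption. StreamingMessage, onChunk, onError and the save callback are hypothetical names, not chatpad's actual code; only isFinal is borrowed from the flag mentioned above.

```typescript
// Hypothetical sketch: save exactly once, whether the stream ends
// normally (isFinal) or is cut short by a network error.
type SaveFn = (content: string, complete: boolean) => void;

class StreamingMessage {
  private content = "";
  private settled = false;
  private readonly save: SaveFn;

  constructor(save: SaveFn) {
    this.save = save;
  }

  // Call for every chunk received from the stream.
  onChunk(chunk: string, isFinal: boolean): void {
    if (this.settled) return; // ignore anything after we have saved
    this.content += chunk;
    if (isFinal) this.settle(true);
  }

  // Call when the connection drops or the request is aborted.
  onError(): void {
    this.settle(false);
  }

  // Text received so far; a component could render from this
  // instead of reading the database.
  get text(): string {
    return this.content;
  }

  private settle(complete: boolean): void {
    if (this.settled) return;
    this.settled = true;
    this.save(this.content, complete);
  }
}
```

The complete flag lets the caller mark a message as truncated, so the UI could offer a retry instead of silently showing a partial answer.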

To my mind now (though it is late here), the best compromise seems to be:

  1. Use some local cache for rendering the message (so that saving less frequently does not break streaming).
  2. Save on every token, but track a lastSave timestamp so that it saves at most once every 2 seconds or so and doesn't overload IndexedDB? Edit: A network interruption would still lose the last few tokens...
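The two points above can be sketched as a small throttled writer. This is only an illustration: ThrottledSaver, update, flush and the injected clock are hypothetical names, and the save callback stands in for whatever actually writes to IndexedDB. The explicit flush is what covers the lost-tail case, provided it is called both when the stream finishes and when it fails.

```typescript
// Hypothetical sketch: keep the latest text in memory for rendering,
// and write it to the database at most once per interval.
type SaveFn = (content: string) => void;

class ThrottledSaver {
  private latest = "";
  private dirty = false;
  private lastSave = -Infinity;
  private readonly save: SaveFn;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(
    save: SaveFn,
    minIntervalMs: number = 2000,
    now: () => number = () => Date.now()
  ) {
    this.save = save;
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Call on every token with the full text received so far.
  update(content: string): void {
    this.latest = content;
    this.dirty = true;
    if (this.now() - this.lastSave >= this.minIntervalMs) {
      this.flush();
    }
  }

  // Call when the stream ends or fails, so the last tokens are not lost.
  flush(): void {
    if (!this.dirty) return; // nothing new since the last write
    this.save(this.latest);
    this.lastSave = this.now();
    this.dirty = false;
  }
}
```

The clock is injected only to make the behavior easy to check without waiting. If the stream can stall without ever reporting an error, a trailing timer that calls flush would be needed as well; that is left out here.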

Again, this whole theory is based on the assumption that IndexedDB itself is the culprit. The browser being 50% Idle seems to disprove that, but maybe that figure counts all CPU cores, and IndexedDB is running on one thread and jamming up? I don't know.

Will take another look tomorrow...

Belphemur commented 10 months ago

The performance issues aren't related to IndexedDB, but to the way the app is built. Every time you type something in the message box, every message is re-rendered. Every time a message is updated (while streaming), the whole conversation is re-rendered.

I fixed all that in my fork: https://github.com/belphemur/chatpad
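The fork's actual changes are not shown in this thread. As a framework-free illustration of the general idea (skip re-rendering messages whose content has not changed), here is a minimal sketch. ChatMessage, MessageListRenderer and renderOne are hypothetical names; in a React app the analogous tool would be React.memo, which skips re-rendering a component when its props are unchanged.

```typescript
// Hypothetical sketch: re-render only the messages whose content changed.
interface ChatMessage {
  id: string;
  content: string;
}

class MessageListRenderer {
  // Number of times renderOne actually ran; useful for measuring.
  renderCount = 0;
  private readonly cache = new Map<string, { content: string; html: string }>();
  private readonly renderOne: (m: ChatMessage) => string;

  constructor(renderOne: (m: ChatMessage) => string) {
    this.renderOne = renderOne;
  }

  render(messages: ChatMessage[]): string[] {
    return messages.map((m) => {
      const hit = this.cache.get(m.id);
      if (hit !== undefined && hit.content === m.content) {
        return hit.html; // unchanged message: reuse the previous output
      }
      this.renderCount++;
      const html = this.renderOne(m);
      // Note: entries for deleted messages are never evicted in this sketch.
      this.cache.set(m.id, { content: m.content, html });
      return html;
    });
  }
}
```

With this shape, a streaming update to the last message costs one render instead of one per message in the conversation.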

Supernova3339 commented 10 months ago

You can PR your changes, but I can't provide an ETA for when he will review everything.
