Open haroldwinstob opened 1 year ago
Yeah, it would be nice if certain key messages, like the initial system message, could be protected from being truncated when the token limit pushes older messages out.
Here, you can copy the standard config into a new variable and set the new config's max_tokens to the original max_tokens minus the token count of the messages.
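A minimal sketch of that idea in Python. The config shape and the `count_tokens` helper are assumptions for illustration, not names from BetterChatGPT (a real UI would count tokens with something like tiktoken):

```python
def count_tokens(messages):
    # Placeholder tokenizer: counts whitespace-separated words.
    # A real implementation would use tiktoken or similar.
    return sum(len(m["content"].split()) for m in messages)

def config_for_request(base_config, messages):
    # Copy the standard config so the original stays untouched...
    config = dict(base_config)
    # ...and budget the completion to whatever the context has left.
    config["max_tokens"] = base_config["max_tokens"] - count_tokens(messages)
    return config

base = {"model": "gpt-3.5-turbo", "max_tokens": 4096}
msgs = [{"role": "user", "content": "hello there"}]
print(config_for_request(base, msgs)["max_tokens"])  # 4096 - 2 = 4094
```

The original `base` dict is never mutated, so the same standard config can be reused for every request.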
In some of my attempts to create a UI what I did was monitor the token usage and when it was 512 tokens from the limit, send a separate API request asking the model to summarize the conversation. Then I cleared the context and began a new conversation that only contained the summary as if the assistant had said it, beginning with "Conversation summary:". This was done before appending the latest user message and its response, so that these two were always kept separate from the summary.
I kept two message histories. The one I showed to the user was the complete one. Internally, there was the one that fit into the model's token limit, which was truncated using the method I explained above as needed.
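The two-history flow above can be sketched roughly like this. Everything here is hypothetical (`count_tokens`, `summarize_stub`, the 512-token margin as a parameter); in the real flow the summary would come from a separate API request to the model:

```python
def count_tokens(messages):
    # Placeholder tokenizer: counts whitespace-separated words.
    return sum(len(m["content"].split()) for m in messages)

def summarize_stub(messages):
    # Stand-in for the separate API request that asks the model
    # to summarize the conversation so far.
    return "Conversation summary: (stub)"

def append_turn(full_history, model_history, user_msg, assistant_msg,
                limit=4096, margin=512):
    # The full history shown to the user always grows.
    full_history = full_history + [user_msg, assistant_msg]
    # Compact the internal history BEFORE appending the new turn,
    # so the latest exchange is always kept separate from the summary.
    if count_tokens(model_history) >= limit - margin:
        model_history = [{"role": "assistant",
                          "content": summarize_stub(model_history)}]
    model_history = model_history + [user_msg, assistant_msg]
    return full_history, model_history
```

Because compaction happens before the append, the summary message replaces the old context while the latest user message and response ride along verbatim.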
I know this is overly complicated, but maybe there are good ideas in it. Maybe we could do something similar but summarize only the first half of the context: it would need to run more often, but less contextual information would be lost each time.
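The first-half variant could look something like this sketch (the split point and the `summarize` callback are illustrative assumptions):

```python
def compact_first_half(history, summarize):
    # Summarize only the older half of the context and keep the
    # recent half verbatim, so less detail is lost per compaction.
    half = len(history) // 2
    older, recent = history[:half], history[half:]
    summary_msg = {"role": "assistant",
                   "content": "Conversation summary: " + summarize(older)}
    return [summary_msg] + recent
```

Each call roughly halves the context instead of collapsing it entirely, which is why it fires more often but discards less each time.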
I ended up ditching my UI and I keep using BetterChatGPT because it's full of useful features. The only thing I miss is having long conversations that don't suddenly stop when the context is full.
Hi, firstly I would like to thank you for creating and sharing this app for free. It really is a great, simple web-UI app that avoids common problems of the official ChatGPT app, like slowness, too many requests, and lost connections when idle.
Let me offer an idea that might help with the token limit issue. It's not a perfect solution, but you will get the idea.
If there are technical aspects that make implementing the above features less simple than expected, please let me know.