oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Long message history slows down chatbot's response time #5566

Closed by Galaxia-mk 6 months ago

Galaxia-mk commented 8 months ago

When loading a big history file, in my case roughly 600 KB, messages generate as slowly as about 1 minute per message. It isn't the context: lowering the context length doesn't change anything, and trimming the card down to nothing doesn't change the speed either. It took me hours to figure out what was causing this, and I find it strange that it happens independently of context length. Maybe the entire history is being taken into account unnecessarily?

At the beginning of a chat, responses can be near instant, but over time they get slower and slower until generation goes at a snail's pace. I'd really like to be able to keep my message history with the bot without having to restart it. At the moment, talking to my bot feels like there is a ticking clock until it becomes unusable.

Could you please fix this, Ooba? Is this considered a bug, or just a flaw in the way Oobabooga works?
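As a workaround sketch (not how text-generation-webui itself handles this), one way to cap the cost of a long history is to trim it to a budget before building the prompt, while keeping the full log on disk. This is a minimal illustration assuming the history is a list of `(user_msg, bot_msg)` pairs in chronological order; a real implementation would count tokens rather than characters:

```python
def trim_history(history, max_chars=8000):
    """Keep only the most recent (user, bot) pairs that fit in max_chars.

    history: list of (user_msg, bot_msg) tuples, oldest first.
    max_chars is a rough character-count proxy for a token budget.
    """
    kept = []
    total = 0
    # Walk newest-to-oldest and stop once the budget would be exceeded.
    for user_msg, bot_msg in reversed(history):
        size = len(user_msg) + len(bot_msg)
        if total + size > max_chars:
            break
        kept.append((user_msg, bot_msg))
        total += size
    kept.reverse()  # restore chronological order
    return kept
```

The full history file stays intact; only the slice passed to the model is bounded, so per-message latency no longer grows with the age of the chat.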

github-actions[bot] commented 6 months ago

This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.