alan-turing-institute / reginald

Reginald repository for REG Hack Week 23

Better chat history management #101

Closed rchan26 closed 10 months ago

rchan26 commented 11 months ago

With the llama-cpp model, after a few chat interactions, we may come across a ValueError('Requested tokens (...) exceed context window of 4096') error. Any messages sent after this are met with an AssertionError() (this is because llama-index's messages_to_prompt function expects the chat history to consist of alternating user and assistant messages and has assert statements to check this).
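For illustration, here is a hypothetical simplification of the kind of alternating-role assertion that starts failing once the history gets out of sync; llama-index's real messages_to_prompt implementation differs:

```python
# Hypothetical sketch of an alternating-role check, not llama-index's actual code.
def messages_to_prompt(messages: list[dict]) -> str:
    prompt = ""
    for i, message in enumerate(messages):
        # History must strictly alternate: user, assistant, user, ...
        expected_role = "user" if i % 2 == 0 else "assistant"
        assert message["role"] == expected_role  # AssertionError once history desyncs
        prompt += f"{message['role']}: {message['content']}\n"
    return prompt + "assistant: "
```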

Note that we can avoid this by clearing the chat history via a Slack shortcut (see #97), but a better approach might be to automatically drop old chat history so that there is always enough space in the context window and we never error out. Essentially, some automatic forgetting, along the lines of the sketch below.
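A rough sketch of that automatic forgetting (the function name and the count_tokens helper are hypothetical, not part of this codebase):

```python
def forget_old_messages(chat_history, count_tokens, max_tokens=4096, reserve=512):
    """Drop the oldest messages until the prompt fits in the context window.

    chat_history: list of {"role": ..., "content": ...} dicts.
    count_tokens: hypothetical callable returning a message's token count.
    reserve: tokens left free for the model's response.
    """
    budget = max_tokens - reserve
    while chat_history and sum(count_tokens(m) for m in chat_history) > budget:
        # Drop the oldest user/assistant pair so roles stay alternating
        # and messages_to_prompt's assertions still hold.
        chat_history = chat_history[2:]
    return chat_history
```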

Maybe this change belongs in llama-index rather than here, but it's something to consider.

rchan26 commented 11 months ago

Related to #97

rchan26 commented 10 months ago

I think this should be fixed now by this PR into llama-index: https://github.com/run-llama/llama_index/pull/8530. We need to bump our version of llama-index (I'll do this today).
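For reference, after the bump, llama-index's token-limited memory can be attached to a chat engine along these lines (a sketch only; the import path varies across llama-index versions, the token limit is an arbitrary choice, and index is assumed to be an existing llama-index index):

```python
from llama_index.memory import ChatMemoryBuffer  # import path varies by version

# Cap the stored history so prompts stay well under llama-cpp's
# 4096-token context window.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

# `index` is assumed to be an existing llama-index index.
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)
response = chat_engine.chat("Hello!")
```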

rchan26 commented 10 months ago

we'll keep the ability to clear chat history just in case, and also so the user can manually start a new conversation