Closed rchan26 closed 10 months ago
Related to #97
I think this should be fixed now with this PR into llama-index: https://github.com/run-llama/llama_index/pull/8530. Need to bump version of llama-index (I'll do this today)
we'll keep the ability to clear chat history just in case and also if the user wants to manually start a new conversation
With the llama-cpp model, after a few chat interactions, we may come across a `ValueError('Requested tokens (...) exceed context window of 4096')` error. Any messages after this will be responded to with an `AssertionError()` (this is because `llama-index`'s `messages_to_prompt` function expects alternating user and assistant chat messages and has `assert` statements to check this). Note that we can avoid this by clearing the chat history via a Slack shortcut (see #97), but a better approach might be to start dropping old chat history to make sure there is enough space and we don't error out. Essentially, have some automatic forgetting.
Maybe this change belongs in `llama-index` rather than here, but something to consider.
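For illustration, the automatic-forgetting idea could be sketched roughly like this. This is not llama-index's API; `Message`, `count_tokens`, and `trim_history` are hypothetical names, and the token count is a crude character-based estimate standing in for a real tokenizer:

```python
# Hypothetical sketch of "automatic forgetting": drop the oldest
# messages until the conversation fits the model's context window.
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant"
    content: str


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: assume ~4 characters per token.
    return max(1, len(text) // 4)


def trim_history(history: list[Message], max_tokens: int, reserve: int = 256) -> list[Message]:
    """Drop the oldest messages until the history fits, leaving `reserve`
    tokens free for the model's reply. Messages are dropped in pairs so
    the user/assistant alternation that messages_to_prompt asserts on
    is preserved."""
    budget = max_tokens - reserve
    trimmed = list(history)
    while trimmed and sum(count_tokens(m.content) for m in trimmed) > budget:
        # Remove the oldest user/assistant pair together.
        trimmed = trimmed[2:] if len(trimmed) >= 2 else []
    return trimmed
```

Dropping whole user/assistant pairs (rather than single messages) is what keeps the alternation intact and avoids the `AssertionError` described above.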