Closed · krassowski closed this 4 days ago
To spell it out, `default_max_chat_history` cannot be set to infinite as of today because it is defined as an `Integer` (and `math.inf` is a float), and even if it were a `Float` it would later fail on `BoundedChatHistory`, which expects an int. I think the solution here could be to treat `None` as a special value. Thoughts?
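A minimal sketch of how `None` could act as the "unbounded" sentinel. Note that `BoundedChatHistorySketch` below is a stand-in for illustration only, not jupyter-ai's actual `BoundedChatHistory`:

```python
from collections import deque
from typing import Optional

class BoundedChatHistorySketch:
    """Illustrative stand-in: keep at most `max_history` messages,
    with None meaning "keep everything"."""

    def __init__(self, max_history: Optional[int] = 2):
        # deque(maxlen=None) is unbounded, so None maps naturally to
        # "infinite" history without any float/int juggling.
        self._messages: deque = deque(maxlen=max_history)

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    @property
    def messages(self) -> list:
        return list(self._messages)

bounded = BoundedChatHistorySketch(max_history=2)
unbounded = BoundedChatHistorySketch(max_history=None)
for i in range(5):
    bounded.add_message(f"msg {i}")
    unbounded.add_message(f"msg {i}")
print(len(bounded.messages))    # 2: only the most recent two survive
print(len(unbounded.messages))  # 5: None keeps the full history
```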
> I think the solution here could be to treat `None` as a special value. Thoughts?
That makes sense to me. 👍
`None` is a good idea. For the idea to summarize chunks of the history rather than carrying all of it [...] we need to extend the default setting instructions at startup (`jupyter lab --AiExtension.default_max_chat_history=2`) to also include a parameter for the size of the memory trail to summarize (or is everything beyond the `default_max_chat_history`?).

> For the idea to summarize chunks of the history rather than carrying all of it [...] we need to extend the default setting instructions at startup [...] to also include a parameter for size of memory trail to summarize
Yes, that was my thinking too, because there are a couple of ways to implement compression, e.g.:
And each of these would have a different set of parameters. I do not want to put too much compression logic into jupyter-ai to avoid making it hard to maintain; maybe let's have some simple default and allow swapping it out for something more advanced in extensions?
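For instance, the "compress each 10 messages into one summary" variant could be sketched with the summarizer as a pluggable callable, so extensions can swap in something more advanced. The names and parameters here are illustrative, not an existing jupyter-ai API:

```python
from typing import Callable, List

def compress_history(
    messages: List[str],
    summarize: Callable[[List[str]], str],
    chunk_size: int = 10,
    keep_last: int = 10,
) -> List[str]:
    """Fold every `chunk_size` older messages into a single summary
    message produced by `summarize` (an LLM call in practice), while
    keeping the most recent `keep_last` messages verbatim."""
    older = messages[:-keep_last] if keep_last else list(messages)
    recent = messages[-keep_last:] if keep_last else []
    summaries = [
        summarize(older[i:i + chunk_size])
        for i in range(0, len(older), chunk_size)
    ]
    return summaries + recent

# Fake summarizer standing in for an LLM call:
fake_summarize = lambda chunk: f"[summary of {len(chunk)} messages]"
history = [f"message {i}" for i in range(25)]
compressed = compress_history(history, fake_summarize)
print(compressed[:2])   # two summary messages covering the 15 oldest
print(len(compressed))  # 12 = 2 summaries + 10 recent messages
```

Because `summarize` is just a callable, the "simple default" could ship in jupyter-ai while an extension swaps in a smarter strategy with its own parameters.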
Problem
Previously the chat kept only two messages, and that was hard-coded; with https://github.com/jupyterlab/jupyter-ai/pull/943 merged we now have an `AiExtension.default_max_chat_history` setting, which is great as it allows increasing the number from two to, say, 10. However, context for a longer history will still be lost altogether. It is impossible to set memory to infinite, even if the model caches tokens.

Proposed Solution
- summarize the history: messages beyond `max_chat_history` get fed into the LLM asking it to compress, say, each 10 messages into a single summary message, or
- allow unbounded `AiExtension.default_max_chat_history` (if the above solution is judged to be too complex, then the custom providers of models would be able to implement this on their side, provided that they receive all past messages).

Additional context
Langchain has a dedicated example on how to implement summarization for chat history here:
https://python.langchain.com/v0.2/docs/how_to/chatbots_memory/#summary-memory
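The pattern in that how-to (a running summary that is refreshed as messages arrive) can be sketched without any LangChain dependency; `SummaryMemory` and its parameters below are made up for illustration:

```python
from typing import Callable, List

class SummaryMemory:
    """Keep a rolling summary plus a small buffer of recent messages.
    Once the buffer reaches `buffer_size`, fold it into the summary
    via the pluggable `summarize` callable (an LLM call in practice)."""

    def __init__(self, summarize: Callable[[str], str], buffer_size: int = 4):
        self.summarize = summarize
        self.buffer_size = buffer_size
        self.summary = ""
        self.buffer: List[str] = []

    def add_message(self, message: str) -> None:
        self.buffer.append(message)
        if len(self.buffer) >= self.buffer_size:
            # Fold the previous summary and the buffered messages
            # into one new summary, then clear the buffer.
            prompt = "\n".join(
                ([f"Previous summary: {self.summary}"] if self.summary else [])
                + self.buffer
            )
            self.summary = self.summarize(prompt)
            self.buffer = []

    def context(self) -> List[str]:
        """What would be sent to the model instead of the full history."""
        head = [f"Summary so far: {self.summary}"] if self.summary else []
        return head + self.buffer

# Fake summarizer standing in for an LLM call:
fake = lambda text: f"<summary of {len(text.splitlines())} lines>"
memory = SummaryMemory(summarize=fake, buffer_size=3)
for i in range(7):
    memory.add_message(f"message {i}")
print(memory.context())  # one summary line plus the unfolded tail
```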