edwardzjl / chatbot

A simple, multi-user, multi-conversation, web-based chatbot.
https://chatbot.agi.zjuici.com

Enhance chat memory #226

Open edwardzjl opened 10 months ago

edwardzjl commented 10 months ago

Describe the solution you'd like

Enhance chat memory by combining long-term memory (a running summary), searched memory (a vector retriever), and short-term memory (a buffer window, the current solution) via CombinedMemory.
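A rough sketch of how the three memory tiers could be combined into one prompt context, in the spirit of langchain's CombinedMemory. Everything below is a pure-Python stand-in; all class names are illustrative, not langchain APIs, and the "vector retriever" is a naive keyword overlap for demonstration only.

```python
# Illustrative sketch of combining three memory tiers into one prompt
# context, mimicking what langchain's CombinedMemory does.
# All names here are hypothetical, not langchain APIs.

class BufferWindowMemory:
    """Short-term memory: keep the last k messages verbatim."""
    def __init__(self, k=4):
        self.k = k
        self.messages = []

    def add(self, role, content):
        self.messages.append((role, content))

    def load(self):
        window = self.messages[-self.k:]
        return "\n".join(f"{role}: {content}" for role, content in window)


class SummaryMemory:
    """Long-term memory: a running summary (stored as a plain string here;
    a real implementation would ask an LLM to fold new messages in)."""
    def __init__(self):
        self.summary = ""

    def load(self):
        return f"Summary: {self.summary}"


class VectorMemory:
    """Searched memory: naive keyword overlap stands in for a vector
    retriever over embedded past messages."""
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append(text)

    def load(self, query):
        q = set(query.split())
        scored = sorted(self.docs, key=lambda d: -len(set(d.split()) & q))
        return "\n".join(scored[:2])


def combined_context(query, summary_mem, vector_mem, window_mem):
    """Concatenate all three memory views into one prompt context."""
    return "\n---\n".join([
        summary_mem.load(),
        vector_mem.load(query),
        window_mem.load(),
    ])
```

The point of the sketch is the shape of the final prompt: summary, retrieved snippets, and the recent window are loaded independently and then joined.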

edwardzjl commented 10 months ago

Any token-counting memory, such as langchain.memory.ConversationSummaryBufferMemory, requires customization before use.

These memories rely on langchain_core.language_models.base.BaseLanguageModel.get_num_tokens_from_messages to calculate the input token length. However, that calculation depends on langchain_core.messages.get_buffer_string, which may not match the string actually sent to the model, for example when a chat template such as ChatML is applied.

Additionally, langchain_core.language_models.base.BaseLanguageModel.get_num_tokens_from_messages invokes langchain_core.language_models.base.BaseLanguageModel.get_token_ids, which defaults to GPT2TokenizerFast.from_pretrained("gpt2"). That default can produce inaccurate counts when using an LLM other than ChatGPT.
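A small sketch of the mismatch described above: a buffer-style string (roughly what get_buffer_string produces) versus the ChatML render the model actually receives. The tokenizer here is a hypothetical stand-in (naive whitespace split); a real fix would override get_num_tokens_from_messages to render the same template the serving stack uses and tokenize it with the model's own tokenizer.

```python
# Sketch of why token counting via get_buffer_string can drift from the
# real prompt length when the model is served with a ChatML template.
# count_tokens is a hypothetical stand-in for a real tokenizer.

def buffer_string(messages):
    """Roughly what langchain_core.messages.get_buffer_string produces."""
    return "\n".join(f"{role.capitalize()}: {content}" for role, content in messages)

def chatml_string(messages):
    """The string actually sent to a ChatML-templated model."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

def count_tokens(text):
    """Hypothetical tokenizer: naive whitespace split, for illustration only."""
    return len(text.split())

messages = [("system", "You are helpful."), ("human", "Hi there!")]
# The ChatML render carries template tokens that the buffer string misses,
# so counting the buffer string under-reports the real prompt length.
assert count_tokens(chatml_string(messages)) > count_tokens(buffer_string(messages))
```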

edwardzjl commented 10 months ago

I dug a little into vector-based memories; here are some takeaways:

edwardzjl commented 10 months ago

What about a background task (like a Kubernetes CronJob) that reads all memories and moves the "old" ones into a vectorstore?

Neither approach is simple to implement.
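The background-consolidation idea above could look roughly like this. Everything here is an in-memory stand-in: a real job would read from Redis, call a real embedding model, and write to an actual vector store.

```python
# Sketch of the background-consolidation idea (e.g. a Kubernetes CronJob):
# periodically move "old" messages out of the live buffer into a vector
# store, keeping only recent context in the buffer.

def embed(text):
    # Hypothetical embedding; a real job would call an embedding model.
    return [float(len(text))]

def consolidate(buffer, vectorstore, keep_last=4):
    """Move everything except the newest `keep_last` messages into the
    vector store; return the trimmed live buffer."""
    old, recent = buffer[:-keep_last], buffer[-keep_last:]
    for msg in old:
        vectorstore.append({"text": msg, "embedding": embed(msg)})
    return recent
```

Run on a schedule, this keeps the live buffer bounded while older messages remain searchable through the vector store.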

edwardzjl commented 10 months ago

A chatbot is not like a human being. Humans don't inherently retain the order of conversation messages; instead, we rely on short-term memory, which eventually consolidates information into long-term memory, where order becomes less crucial.

In contrast, a chatbot must persistently store all messages in list order, primarily so they can later be displayed to the user.

Additionally, Redis indexing is limited to hash or JSON structures, not lists, so I cannot simply add an embedding index to a list of chat messages.

To address the need for both long-term and short-term memory while preserving the correct order of messages for user display, one solution is to duplicate the messages: one copy stored in a Redis list, and another stored in a hash with an embedding index.

An alternative approach might involve storing only the list index in the document, as shown below:

```json
{
  "msg_embedding": [],
  "msg_idx": 0
}
```

Fetching messages with context can then be achieved by accessing the message list using LRANGE, which has an acceptable time complexity of O(S+N) (LRANGE).
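A sketch of that context fetch. The Python list stands in for the Redis list, and `lrange` mirrors Redis LRANGE's inclusive-stop semantics; `msg_idx` is the index stored alongside the embedding in the indexed document, and the function names are illustrative.

```python
# Sketch of fetching a vector-search hit together with its surrounding
# conversation context via its stored list index.

def lrange(messages, start, stop):
    """Mimic Redis LRANGE: inclusive stop, start clamped to 0."""
    start = max(start, 0)
    return messages[start:stop + 1]

def fetch_with_context(messages, msg_idx, window=2):
    """Return the matched message plus `window` neighbors on each side."""
    return lrange(messages, msg_idx - window, msg_idx + window)
```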

However, langchain's langchain_community.chat_message_histories.redis.RedisChatMessageHistory currently stores messages in reverse order using LPUSH (placing the latest message at the head of the list). That behavior must change for this scheme to work, because each new LPUSH shifts the index of every existing message.
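The index-stability problem in two lines: `list.insert(0, x)` models LPUSH and `list.append(x)` models RPUSH.

```python
# Sketch of why LPUSH breaks stored indices while RPUSH keeps them stable.

lpush_list, rpush_list = [], []
for msg in ["first", "second", "third"]:
    lpush_list.insert(0, msg)  # LPUSH: newest message goes to the head
    rpush_list.append(msg)     # RPUSH: newest message goes to the tail

# With LPUSH, "first" keeps moving as new messages arrive,
# so a stored msg_idx pointing at it goes stale.
assert lpush_list.index("first") == 2
# With RPUSH, "first" stays at index 0 forever.
assert rpush_list.index("first") == 0
```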

edwardzjl commented 8 months ago

langchain's memory system is being refactored; I may wait a few more weeks until it stabilizes.