ggerganov / llama.cpp

LLM inference in C/C++
MIT License
61.8k stars 8.86k forks

conversational memory on disk? #1222

Closed yesbroc closed 3 months ago

yesbroc commented 1 year ago

I'd like universal support for conversational memory stored on disk; a new parameter to clear that memory could be added too.

We could also let the user set a model for summarizing each conversation at a specific interval (a minute by default), with a title template like `(What is time?) - title summary` and entries that look something like this:

```
Prompt: How does time work?
Response: [response]
Tone: 0.5  (0 = bad, 0.5 = neutral, 1 = good)
Context: User wants to learn about time after having a discussion about Einstein's theory of relativity

Chat-Summary(1):
User: what is einsteins theory of relativity
LLaMa: a theory of relativity states that...
User: what is time?
LLaMa: Time is a measurement of movement in space that moves forwards; a ball moves from a to b in 1 second and it can never travel a to b in -1 seconds.
```
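Concretely, a per-conversation memory file could be as simple as JSON on disk. A minimal sketch in Python (every name here is hypothetical; nothing like this exists in llama.cpp today):

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class MemoryEntry:
    """One summarized exchange, mirroring the template above (hypothetical format)."""
    prompt: str
    response: str
    tone: float   # 0 = bad, 0.5 = neutral, 1 = good
    context: str  # one-line summary of why the user asked

def save_memory(path: Path, entries: list[MemoryEntry]) -> None:
    """Persist the whole conversation memory as a JSON array."""
    path.write_text(json.dumps([asdict(e) for e in entries], indent=2))

def load_memory(path: Path) -> list[MemoryEntry]:
    """A fresh chat simply starts with empty memory."""
    if not path.exists():
        return []
    return [MemoryEntry(**d) for d in json.loads(path.read_text())]

def clear_memory(path: Path) -> None:
    """The proposed 'clear memory' parameter would boil down to this."""
    path.unlink(missing_ok=True)
```

The on-disk format is deliberately dumb: the interesting part (what goes into `context`) would come from the summarizer model, not from this plumbing.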

I started an example chat where one model is clearly better at using memory than the other, but both are still very limited when it comes to recalling things from the conversation.

If you can't be bothered, there's a tl;dr at the bottom.

```
system_info: n_threads = 16 / 16 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0
main: interactive mode on.
Reverse prompt: 'User:'
Reverse prompt: 'Human:'
Reverse prompt: '### Instruction:

'
sampling: temp = 0.700000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.200000
generate: n_ctx = 2048, n_batch = 512, n_predict = 4048, n_keep = 2
```

**Me:** can you convince me why AI needs to have memory?

**WizardLM:** AI, or artificial intelligence, is designed to simulate human-like thought processes and decision making. Memory is a critical component of this process because it allows the AI system to store information about past events and experiences that can be used in future calculations and decisions. Without memory, an AI system would not be able to learn from its mistakes or build upon previous knowledge, which would severely limit its ability to make intelligent decisions. Additionally, memory is necessary for the development of relationships and emotional connections between the AI and other characters within the story.

**Me:** can you simplify that?

**WizardLM:** Sure! Memory is important for AI because it allows them to learn from their experiences, make better decisions, and form relationships with others in the game world. Without memory, an AI character would not be able to develop or change over time.

[conversation starts to diverge once context is lost in the ai's head]

**Me:** what if i say something bad to the ai, what would happen?

**WizardLM:** If you say something negative or offensive to a chatbot AI, it may respond in different ways depending on how it is programmed and designed. Some chatbots are equipped with filters that can detect and respond appropriately to rude language or harassment, while others may simply ignore the comment or respond in a neutral manner. However, as chatbots become more advanced and sophisticated, they may be able to recognize when someone is being intentionally disrespectful or abusive, and react accordingly by blocking the user or reporting their behavior to moderators.

[looks good on the surface, but keep in mind that I expected it to include memory in its paragraph]

**Me:** can you apply this to the paragraph you wrote on memory and ai?

**WizardLM:** Sure! Here's an example of how the information I provided earlier about chatbots could be applied in a dialogue between two people:

Person 1: "Hey, AI bot. What do you think about the latest news on climate change?"
AI Bot: "Hello Person 1. Based on my memory and data analysis, it appears that global temperatures are rising at an unprecedented rate due to human activity such as burning fossil fuels and deforestation."
Person 2: "That's just a bunch of fake news propagated by the mainstream media! Climate change is a hoax!"
AI Bot: "I understand your perspective, Person 2. However, based on my memory and analysis of scientific data, there is overwhelming evidence that climate change is real and poses significant risks to our planet."

[hallucinates, at this point i realized that it erases context every 2 messages or so]

Just to be unbiased, I tried Vicuna 13B 5_1 with the same parameters and the same prompts. It might take 30 minutes, but it proves it's not something in the models.

**Me:** can you convince me why AI needs to have memory?

**Vicuna:** Memory is an essential aspect of artificial intelligence (AI) as it allows AI systems to learn from past experiences and make informed decisions in the present. Just like humans, AI algorithms can process information, store it in their memory, and use this knowledge to improve future performance. With a strong memory foundation, AI systems can build upon prior understanding and adapt to new situations with greater ease.

In many cases, having a good memory is crucial for AI to provide accurate responses or predictions. For example, when answering questions on various topics, an AI system that has stored information from previous interactions will be better equipped

**Me:** can you simplify that?

**Vicuna:** Sure! Think of it this way - if a robot wants to learn how to play basketball, it needs to remember what it learned in the past. If it forgets everything each time it plays, it will never get better at playing basketball. Memory helps robots (and AI systems) keep track of their experiences so they can improve over time and make better decisions based on what they've learned before.

[great simplification, 13b looks strong]

**Me:** what if i say something bad to the ai, what would happen?

**Vicuna:** As an AI language model, I am programmed to respond in a professional manner and will not engage in any form of negative interaction or retaliation. It is important to remember that while AI systems can process and understand language, they do not have emotions like humans and therefore cannot take offense to what you say. However, it's always best to treat others with kindness and respect, whether human or machine!

[not exactly the way I wanted "remember" to be used in its response, but I'll take it]

**Me:** can you apply this to the paragraph you wrote on memory and ai?

**Vicuna:** Sure! Here is a revised version of the paragraph that incorporates the instruction given:

Memory is an essential aspect of AI systems, as it enables them to learn from past experiences and make better decisions in the future. Like humans, machines need memory to store information, analyze data, and recognize patterns. Memory helps AI algorithms develop a deeper understanding of complex problems and find innovative solutions by drawing on their previous knowledge base. However, managing large amounts of stored data can be challenging for AI systems, which is why companies are investing in technologies like distributed computing to improve the scalability and efficiency of memory storage.

[demonstrates that larger models handle memory well]

I'll give it one more prompt for a final judgement.

**Me:** summarize and recall our conversation as if we were two people in a cafe talking to each other

**Vicuna:** We've been discussing artificial intelligence (AI) and how it works. I explained that AI is designed to simulate human intelligence by processing information and making decisions based on data inputs, similar to the way our brains work. We talked about deep learning algorithms and neural networks as specific types of machine learning techniques used in AI development. You mentioned the importance of memory for AI systems so they can learn from past experiences and improve over time (right and wrong). I also pointed out that while we've made significant progress in developing AI technology, there are still limitations to what it can do compared to human intelligence. We talked about natural language processing (NLP) as an area where AI has shown promise but is not yet perfected, especially when dealing with ambiguous or nuanced language. Finally, we touched on the potential impact of AI on our society and workforce, including both positive and negative implications that will require careful consideration going forward.

[shows understanding of the conversation but hallucinates details: it knows the conversation was about AI and mentions the original topic, yet invents specifics we never actually discussed]

tl;dr

Honestly, I think the test chats were pretty helpful. Models are going to get better at using memory in the future, and maybe implementing this can help accelerate that. I know direct memory support for models isn't going to arrive soon, but if enough optimizations get developed it might work.

FNsi commented 1 year ago

I have to ask: how could it work with the ctx limit?

yesbroc commented 1 year ago

Much like the LTM extension on ooba. Or maybe they could add a save function that takes a temporary file containing messages and responses and summarizes them into keywords for the AI to load while conversing. It could pick which memories to load and remember, I guess.
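A rough sketch of that save-and-summarize idea, with plain keyword overlap standing in for a real summarizer model (every helper here is hypothetical, not part of llama.cpp or ooba):

```python
from collections import Counter

# Tiny stopword list just for the sketch; a real summarizer model would
# replace this whole keyword-extraction step.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "what", "it", "in", "that"}

def keywords(text: str, n: int = 5) -> list[str]:
    """Boil a message down to its n most frequent non-stopword terms."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [w for w, _ in counts.most_common(n)]

def pick_memories(prompt: str, memories: list[dict], k: int = 2) -> list[dict]:
    """'Pick which memories to load': rank saved exchanges by keyword
    overlap with the current prompt and keep only the top k."""
    prompt_kw = set(keywords(prompt))
    scored = sorted(memories,
                    key=lambda m: len(prompt_kw & set(m["keywords"])),
                    reverse=True)
    return scored[:k]
```

The point of loading only the top-k matches is exactly the ctx-limit question above: you never inject the whole memory file, just the few entries relevant to the current prompt.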

FNsi commented 1 year ago

> Much like the LTM extension on ooba. Or maybe they could add a save function that takes a temporary file containing messages and responses and summarizes them into keywords for the AI to load while conversing. It could pick which memories to load and remember, I guess.

Store before every swap and carry the summaries into the next ctx; makes sense 🤔️

DannyDaemonic commented 1 year ago

I tried something like this. Only specific models can summarize the input, and the problem is the scale changes: the summarized text used as ctx changes the scale of the generation.

For example, if there were a dialog going on, the summarization might say what people talked about, but then when you try to resume it doesn't pick up where the dialog left off; rather, it continues to generate the future summary of the conversation.

If you had a "smart" enough model, you could tell it "this is a summary of things so far, and these are the last few lines of text after that," and it could continue from there, but it would be hit or miss, and probably more miss with most of these models. Even if you start the dialog again, it tends to break back into summarizing the future of the conversation (because that's what most of its context shows).

To do it properly, I think you'd need a special LoRA with something like `### Summary:` trained into it, to help it understand that the summary is different from the rest of the context. This sort of thing is in reach now more than ever; it's becoming so easy to train your own LoRAs. The only catch is that this project is mostly model agnostic. Still, if it became a thing, a switch could be added for using summaries, just like `--ins`.
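As a sketch, the prompt layout described here could be assembled like this (the helper name is made up, and the `### Summary:` marker only helps if a LoRA were actually trained on it):

```python
def build_prompt(summary: str, recent_lines: list[str]) -> str:
    """Prefix the context with a marked summary, then the last few raw
    dialog lines, so the model resumes the dialog instead of continuing
    to 'summarize the future' of the conversation."""
    return (
        "### Summary:\n"
        f"{summary.strip()}\n\n"
        "### Dialog (continued):\n"
        + "\n".join(recent_lines)
        + "\n"
    )
```

Keeping the last few lines verbatim after the summary is what gives the model a chance to pick up the dialog's register again.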

FNsi commented 1 year ago

> I tried something like this. Only specific models can summarize the input, and the problem is the scale changes: the summarized text used as ctx changes the scale of the generation.
>
> For example, if there were a dialog going on, the summarization might say what people talked about, but then when you try to resume it doesn't pick up where the dialog left off; rather, it continues to generate the future summary of the conversation.
>
> If you had a "smart" enough model, you could tell it "this is a summary of things so far, and these are the last few lines of text after that," and it could continue from there, but it would be hit or miss, and probably more miss with most of these models. Even if you start the dialog again, it tends to break back into summarizing the future of the conversation (because that's what most of its context shows).
>
> To do it properly, I think you'd need a special LoRA with something like `### Summary:` trained into it, to help it understand that the summary is different from the rest of the context. This sort of thing is in reach now more than ever; it's becoming so easy to train your own LoRAs. The only catch is that this project is mostly model agnostic. Still, if it became a thing, a switch could be added for using summaries, just like `--ins`.

I think it could basically be another canned introduction, like transcript mode does: use "summary + last ctx/N" as the beginning prompt of the next ctx. In this case I think N could be 10 or larger.
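Token-for-token, that carryover might look like this (a hypothetical helper; real llama.cpp operates on token ids in C++, this is just the arithmetic):

```python
def carryover(summary_tokens: list[int], ctx_tokens: list[int],
              n_ctx: int = 2048, n: int = 10) -> list[int]:
    """Build the seed for the next context window: the summary followed
    by the last 1/N of the old context ('summary + last ctx/N')."""
    tail = ctx_tokens[-(n_ctx // n):]  # last ctx/N tokens kept verbatim
    seed = summary_tokens + tail
    return seed[-n_ctx:]               # never exceed the window
```

With n_ctx = 2048 and N = 10 that keeps 204 raw tokens after the summary, so a longer summary directly trades off against verbatim tail.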

extradosages commented 1 year ago

This seems like a job for a higher-level application. LangChain is a Python toolbox designed to help people compose LLMs and tools to meet these ends.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.