nebuchadneZZar01 opened this issue 1 year ago
The problem is partly related to the model's pre-trained weights, but mostly to the fact that the vector store holds the state-of-the-union embeddings and those are used directly at inference time, much like in a classification/recommendation system. If only the source_documents actually relevant to the context of the chat and to the questions were retrieved, both the generated output and the cited sources would improve; a sketch of that idea follows.
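For illustration only (everything here is an assumption, not code from this project: `db` stands for the already-built vector store and `llm` for the loaded model), retrieval can be restricted with a similarity-score threshold so that only chunks close to the question are passed on as source_documents:

```python
# Sketch only: `db` (an existing Chroma/FAISS vector store) and `llm`
# (the loaded model) are placeholders, not objects from this repo.
from langchain.chains import ConversationalRetrievalChain

retriever = db.as_retriever(
    search_type="similarity_score_threshold",
    # Threshold and k are illustrative; tune them per embedding model.
    search_kwargs={"score_threshold": 0.5, "k": 4},
)
qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    return_source_documents=True,  # expose which sources were actually cited
)

result = qa({"question": "What did the president say?", "chat_history": []})
print(result["answer"], result["source_documents"])
```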
The ConversationBufferMemory issue, on the other hand, stems from the fact that keeping the raw history is not the best way to retain state across messages. We should probably find an alternative, or a way to use it without its downsides; one possible direction is sketched below.
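As one hedged alternative (assuming the project can hand its LLM instance to the memory object; `llm` below is a placeholder), LangChain ships memory classes that bound the history by token count instead of keeping every message verbatim:

```python
# Sketch under the assumption that `llm` is the model already loaded elsewhere.
from langchain.memory import (
    ConversationSummaryBufferMemory,
    ConversationTokenBufferMemory,
)

# Option 1: evict the oldest turns once the history exceeds max_token_limit.
memory = ConversationTokenBufferMemory(
    llm=llm,
    max_token_limit=1000,          # illustrative budget, well under the window
    memory_key="chat_history",
    return_messages=True,
    output_key="answer",           # needed when return_source_documents=True
)

# Option 2: summarize old turns instead of dropping them, trading an extra
# LLM call per turn for longer-range recall.
# memory = ConversationSummaryBufferMemory(
#     llm=llm, max_token_limit=1000,
#     memory_key="chat_history", output_key="answer",
# )
```

Either memory can then be passed to ConversationalRetrievalChain.from_llm(..., memory=memory), which would also avoid the failure described below.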
Describe the bug and how to reproduce it
When the model gives long answers, or after a large number of message exchanges, the memory buffer makes the chain fail because the accumulated token count exceeds the model's context window.
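A minimal way to see the failure mode (tiktoken and the 2048-token window are my assumptions for a typical local model, not values from this project):

```python
# Repro sketch: the buffer keeps every exchange verbatim, so its token
# count grows without bound until it exceeds the model's context window.
import tiktoken
from langchain.memory import ConversationBufferMemory

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_WINDOW = 2048  # hypothetical limit of the local model

memory = ConversationBufferMemory(memory_key="chat_history")
for turn in range(100):
    memory.save_context(
        {"input": f"question {turn}"},
        {"output": "a fairly long answer " * 50},
    )
    tokens = len(enc.encode(memory.buffer))
    if tokens > CONTEXT_WINDOW:
        print(f"turn {turn}: history is {tokens} tokens -> context overflow")
        break
```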
Expected behavior
The LLM should remember the content of the conversation and answer correctly.