kingjulio8238 / Memary

Making Agents Reliable In Production.
https://www.memarylabs.com

When Knowledge expands, wisdom shrinks #17

Closed · imvetri closed this 2 months ago

imvetri commented 2 months ago

Hello, I wanted to share my opinion.

Memary is based on knowledge graph expansion. Although there is a compression step in the pipeline, compressing the knowledge graph through data-storage techniques alone will not lead anywhere, nor will it stay ahead of alternatives.

Instead, by introducing a balance in the graph between knowledge and wisdom, more efficient storage can be developed.

An example of the difference: knowing more is knowledge, whereas wisdom is knowing less yet still being convincing. In terms of LLMs, knowledge-based communication involves a huge number of words, whereas a wisdom-based communication system uses fewer but stronger words.


kingjulio8238 commented 2 months ago

Hi @imvetri

Thank you for your insights. Regardless of the scale of the knowledge graph (the larger, the better), retrieving information always results in the formation of a subgraph centered around the main query entity, facilitating low-cost computation. This is highly beneficial for multi-hop reasoning. However, the knowledge graph itself does not undergo memory compression. The more comprehensive the graph, the greater the volume of accessible information for responding to queries.
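
To make that concrete, here is a minimal sketch of the retrieval pattern using networkx (an assumption for illustration; memary's actual graph backend and retrieval code differ): extracting the k-hop neighborhood around the query entity keeps the computation local no matter how large the full graph grows.

```python
import networkx as nx

# Toy knowledge graph; in practice this is the full graph store.
kg = nx.Graph()
kg.add_edges_from([
    ("Einstein", "relativity"),
    ("relativity", "spacetime"),
    ("spacetime", "gravity"),
    ("Einstein", "Nobel Prize"),
    ("Nobel Prize", "physics"),
])

def retrieve_subgraph(graph: nx.Graph, query_entity: str, hops: int = 2) -> nx.Graph:
    """Return the k-hop neighborhood centered on the query entity.

    Cost scales with the neighborhood, not the whole graph, which is
    what makes multi-hop reasoning over a large graph cheap.
    """
    return nx.ego_graph(graph, query_entity, radius=hops)

sub = retrieve_subgraph(kg, "Einstein", hops=2)
print(sorted(sub.nodes()))
# ['Einstein', 'Nobel Prize', 'physics', 'relativity', 'spacetime']
```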

The memory stream tracks users' exposure to various concepts - their 'wisdom' or breadth of knowledge - while the entity knowledge store assesses the depth of their understanding.
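
A rough sketch of how those two structures relate (the field names here are hypothetical, not memary's real schema): the memory stream is an append-only log of entity exposures, and the entity knowledge store aggregates it into per-entity counts.

```python
from collections import Counter
from datetime import datetime, timezone

# Memory stream: append-only log of every entity the user is exposed to
# (breadth of exposure, i.e. 'wisdom' in the sense discussed above).
memory_stream: list[tuple[str, datetime]] = []

def record_exposure(entity: str) -> None:
    memory_stream.append((entity, datetime.now(timezone.utc)))

# Entity knowledge store: per-entity frequency (depth of understanding).
def build_knowledge_store(stream: list[tuple[str, datetime]]) -> Counter:
    return Counter(entity for entity, _ in stream)

for e in ["relativity", "relativity", "spacetime", "relativity"]:
    record_exposure(e)

store = build_knowledge_store(memory_stream)
print(store.most_common(2))  # [('relativity', 3), ('spacetime', 1)]
```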

Currently, the only form of memory compression applies to the entity knowledge store, where we compress the top N entities. These are then provided to the LLM's finite context window with specific instructions to avoid detailed explanations, enhancing personalization for users already familiar with these concepts.
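
In miniature, the idea looks something like this (the prompt wording and function names are illustrative, not memary's actual strings):

```python
from collections import Counter

def compress_top_entities(store: Counter, n: int = 3) -> str:
    """Fold the N most frequent entities into a context-window hint."""
    top = [entity for entity, _count in store.most_common(n)]
    return (
        "The user is already familiar with: " + ", ".join(top) + ". "
        "Avoid detailed explanations of these concepts; address them succinctly."
    )

store = Counter({"relativity": 3, "spacetime": 1, "gravity": 1})
# This hint would be prepended to the LLM's finite context window.
print(compress_top_entities(store, n=2))
```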


We hope to see improvements to the memory compression, as the current implementation is fairly basic. One future contribution listed in the README: instead of compressing the most frequent entities (according to their counts), we could see improved results by compressing the entities that are included in the current query. This would let the LLM more effectively determine which concepts require detailed explanations and which can be addressed more succinctly in its response. We invite you to contribute to this development!
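
A hedged sketch of that suggested direction (entity extraction is stubbed out here; a real implementation would reuse memary's extraction pipeline): intersect the query's entities with the knowledge store instead of taking a global top N.

```python
from collections import Counter

def compress_query_entities(store: Counter, query_entities: set[str]) -> str:
    """Compress only the known entities that appear in the current query,
    so the hint stays relevant to what the user is asking right now."""
    known = [e for e in query_entities if store[e] > 0]
    if not known:
        return ""
    return (
        "Of the concepts in this query, the user already knows: "
        + ", ".join(sorted(known))
        + ". Explain only the unfamiliar ones in depth."
    )

store = Counter({"relativity": 3, "spacetime": 1})
# In practice these would come from an entity extractor run over the query.
print(compress_query_entities(store, {"spacetime", "wormholes"}))
```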

We also have compression for the chat history using an eviction policy, but that's a bit outside the scope of this discussion. Hope this helps.
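
For completeness, a minimal sketch of one possible eviction policy (FIFO is purely an assumption here; the actual policy in the repo may differ):

```python
from collections import deque

# Keep only the most recent messages; older ones are evicted first-in-first-out.
MAX_MESSAGES = 4
chat_history: deque[dict] = deque(maxlen=MAX_MESSAGES)

for i in range(6):
    chat_history.append({"role": "user", "content": f"message {i}"})

# Messages 0 and 1 have been evicted.
print([m["content"] for m in chat_history])
# ['message 2', 'message 3', 'message 4', 'message 5']
```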
