memgpt as a coding assistant: dogfood with MemGPT code

jimlloyd commented 9 months ago

Is your feature request related to a problem? Please describe.

This may be more of an "epic" request.

My original use case for wanting to try out MemGPT was for helping me (and my teammates at my day job) to work with a very large C++ code base. I've concluded that won't be possible in the near term without first achieving some significant intermediate milestones.

So I want to propose this intermediate milestone: that we make it possible to load all of this repository (checked out to a specific commit) into archival memory and have the result be useful to current and future contributors to this project. In the industry this is often referred to by the colorful phrase "We eat our own dogfood", or simply "dogfooding our product".

Describe the solution you'd like

After cloning the repo and installing dependencies, it should be possible to execute this command:

$ memgpt load directory --name memgpt --input-dir . --recursive

This currently fails because some docs are not chunked down to fewer than 512 tokens (see #755).
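To make the chunking constraint concrete, here is a minimal sketch of splitting a document so that every chunk stays under 512 tokens. It is not MemGPT's loader: the function name `chunk_tokens` is hypothetical, and whitespace-separated words are used as a stand-in for real tokens, so counts from an actual tokenizer would differ.

```python
def chunk_tokens(text: str, max_tokens: int = 511) -> list[str]:
    """Split text into chunks of at most max_tokens "tokens".

    Assumption: a "token" here is a whitespace-separated word, which only
    approximates what a real model tokenizer would count.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in windows of max_tokens words each.
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


if __name__ == "__main__":
    doc = "word " * 1200
    for chunk in chunk_tokens(doc):
        print(len(chunk.split()))
```

The default of 511 keeps each chunk strictly below the 512-token limit. A real fix would presumably count tokens with the same tokenizer the embedding model uses.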

Once we have that working without errors, there may be additional work before we find memgpt is able to assist with understanding the architecture/code and with implementing bug fixes and new features. It's hard for me to predict how much more work would be required at that point. But once we have achieved that, it should open up the pool of potential contributors.

If this works well we should use it to add to the documentation!

Describe alternatives you've considered

I'm probably going to try using GitHub Copilot and Sourcegraph Cody with the MemGPT codebase, but I want to use this issue to move MemGPT forward.

Additional context

N/A

jhamify commented 9 months ago

@jimlloyd I'm currently looking at how this could be architected for use on extensive codebases as well. In terms of the UX, I'm thinking that MemGPT could facilitate the parsing of a document base to find documentation relevant to a query, and output an answer based on the passages found (i.e. 'walking' RAG, demoed here https://twitter.com/hrishioa/status/1734935026201239800).

I believe Hrishi uses document hierarchies and knowledge graphs with vector database retrieval as is outlined here: https://medium.com/enterprise-rag/a-first-intro-to-complex-rag-retrieval-augmented-generation-a8624d70090f.
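As a rough illustration of the "find passages relevant to a query" step, here is a minimal sketch. It is an assumption-laden stand-in, not the approach from the linked demo or MemGPT's API: the names `tokenize`, `cosine`, and `top_passages` are hypothetical, and a bag-of-words cosine similarity replaces the embedding-based vector search a real system would use.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric/underscore terms."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_passages(query: str, passages: list[str], k: int = 3) -> list[str]:
    """Return up to k passages sharing terms with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop passages that share no terms with the query at all.
    return [p for score, p in scored[:k] if score > 0]


if __name__ == "__main__":
    docs = [
        "archival memory stores passages",
        "the cli loads a directory",
        "unrelated text about cooking",
    ]
    print(top_passages("how does archival memory work", docs, k=1))
```

The retrieved passages would then be handed to the model as context for composing the answer. A hierarchy or knowledge graph, as mentioned above, would sit on top of this kind of retrieval to decide which parts of the codebase to search.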

Have you got any thoughts on the UX or on how this should be architected?

cpacker commented 7 months ago

@sarahwooders this issue got auto-closed by #1098, not sure if that was intended (will reopen, can close again if so)

cpacker / MemGPT

memgpt as a coding assistant: dogfood with MemGPT code #766