cpacker / MemGPT

Letta (fka MemGPT) is a framework for creating stateful LLM services.
https://letta.com
Apache License 2.0

WIP - Condense configurations into conventions for Database (Metadatastore) Adapters #1460

Closed norton120 closed 2 months ago

norton120 commented 3 months ago

WIP

This may be a long-running branch, since cutting the tests over to use the httpx app + FastAPI dependency injection is gonna be a bit of work.

Preamble

The database(s) that support application state, agent memory (including vector lookup), and the application itself (user/org management, permissions, settings config, etc.) interface with the rest of the codebase via a MetadataStore object.

Goals here

norton120 commented 3 months ago

@yoaquim this is the working PR we were talking about

norton120 commented 3 months ago

K - thinking through the complication that prevents us from using the ORM directly, it's really only the archive. So if we add accessors on the related objects, the adapter can probably hide that complication. Something like:

current_agent = authed_user.agents.get(agent_id)
# here's the magic
# archive_memory is not necessarily a sqlalchemy model
return current_agent.archive_memory.search(search_value)

In this case the adapter interface duck types as an ORM: with the pgvector adapter, archive_memory is just a model, while with SQLite it is a Chroma wrapper.
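A minimal sketch of that duck-typing idea, assuming nothing about the real adapters. The names `ArchiveMemory`, `ModelBackedArchive`, `WrappedStoreArchive`, and `search_archive` are hypothetical, and both backends are in-memory stand-ins that do substring matching rather than real pgvector or Chroma calls:

```python
from dataclasses import dataclass, field
from typing import Protocol


class ArchiveMemory(Protocol):
    """Hypothetical interface: archive_memory only needs a search method."""

    def search(self, search_value: str) -> list[str]: ...


@dataclass
class ModelBackedArchive:
    """Stand-in for the pgvector case, where archive_memory is just a model."""

    rows: list[str] = field(default_factory=list)

    def search(self, search_value: str) -> list[str]:
        # A real adapter would run a vector similarity query here;
        # this stand-in does a case-insensitive substring match.
        return [r for r in self.rows if search_value.lower() in r.lower()]


@dataclass
class WrappedStoreArchive:
    """Stand-in for the SQLite case, where archive_memory wraps another store."""

    store: dict[str, str] = field(default_factory=dict)

    def search(self, search_value: str) -> list[str]:
        # A real wrapper would delegate to the external store's query call.
        return [d for d in self.store.values() if search_value.lower() in d.lower()]


@dataclass
class Agent:
    archive_memory: ArchiveMemory


def search_archive(agent: Agent, search_value: str) -> list[str]:
    # The calling code is identical regardless of which backend is attached.
    return agent.archive_memory.search(search_value)
```

The point of the sketch is that callers only ever touch `agent.archive_memory.search(...)`, so which backend sits behind it becomes an adapter detail.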

norton120 commented 3 months ago

@cpacker @sarahwooders do you know if the init.sql file at the top level of the repo is for deployment? Creating the initial user/password/db for the Docker image would just be a matter of setting those env vars.

I'd like to create the test db in the Docker db init. Ideally I'd rather not add a second init file and switch them around, so that's why I'm trying to track down what it is used for at the moment.

For the moment I dumped the test db creation into that init, since overriding it without disturbing it is a bit of work. Can revisit before we start merging.

norton120 commented 3 months ago

OK. So the shortest path I can see from here is:

  1. add Alembic migrations
  2. move to migration and connection instead of create_all (because that won't work anymore)
  3. overload the MetadataStore methods to get parity - this should expose the Chroma conflict naturally
  4. solve for Chroma/pgvector as an overloaded model in the ORM
  5. get all tests passing, merge in all upstream changes
  6. delete all the dead code. There will be a lot. There already is.
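For steps 1 and 2, here is a rough sketch of what replacing create_all with a migration run could look like, using Alembic's programmatic API. The function name `run_migrations` and the `alembic.ini` location are assumptions, not anything in this branch. It also assumes an env.py based on the stock template, which reads `sqlalchemy.url` from the config:

```python
def run_migrations(database_url: str, ini_path: str = "alembic.ini") -> None:
    """Bring the schema to the latest revision instead of calling create_all."""
    # Imported lazily so this module still loads where Alembic is not installed.
    from alembic import command
    from alembic.config import Config

    cfg = Config(ini_path)
    # Point Alembic at the same database the application connects to.
    cfg.set_main_option("sqlalchemy.url", database_url)
    # Apply every pending migration up to the newest revision.
    command.upgrade(cfg, "head")
```

Calling something like this at startup (or in the test fixtures) would keep the schema under migration control rather than having it generated from the models on each run.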