This repo provides a simple example of a memory service you can build and deploy with LangGraph.
Inspired by papers like MemGPT and distilled from our own work on long-term memory, the graph extracts memories from chat interactions and persists them to a database. This information can later be read or queried semantically to provide personalized context when your bot is responding to a particular user.
The memory graph deduplicates processing across conversation threads and supports continuous updates to a single "memory schema" as well as "event-based" memories that can be queried semantically.
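To make the two memory types and the deduplication concrete, here is a minimal, self-contained sketch. The class and field names are illustrative, not the repo's actual schema: a single continuously-updated profile record alongside an append-only store of event memories deduplicated by id.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """A single "memory schema" record, updated in place as new facts arrive."""
    name: str = ""
    preferences: dict = field(default_factory=dict)

    def update(self, patch: dict) -> None:
        # Later observations overwrite earlier ones for the same key.
        self.name = patch.get("name", self.name)
        self.preferences.update(patch.get("preferences", {}))

@dataclass
class EventMemory:
    """An append-only "event-based" memory; in the real service these are
    embedded and stored in a vector index for semantic lookup."""
    event_id: str
    text: str

class MemoryStore:
    def __init__(self) -> None:
        self.profile = UserProfile()
        self.events: dict[str, EventMemory] = {}

    def add_event(self, event: EventMemory) -> bool:
        # Deduplicate: processing the same event twice is a no-op.
        if event.event_id in self.events:
            return False
        self.events[event.event_id] = event
        return True

store = MemoryStore()
store.profile.update({"name": "Ada", "preferences": {"tone": "formal"}})
store.add_event(EventMemory("t1-0", "User mentioned they live in Lisbon."))
assert not store.add_event(EventMemory("t1-0", "duplicate delivery"))
```

The key design point is that profile updates are idempotent merges while event memories are immutable and keyed by id, which is what makes reprocessing a thread safe.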
├── langgraph.json     # LangGraph Cloud configuration
├── lang_memgpt
│   ├── __init__.py
│   └── graph.py       # Define the agent w/ memory
├── poetry.lock
├── pyproject.toml     # Project dependencies
└── tests              # Testing + evaluation logic
    └── evals
        └── test_memories.py
This quick start will get your agent with long-term memory deployed on LangGraph Cloud. Once created, you can interact with it from any API.
This example defaults to using Pinecone for its memory database, and nomic-ai/nomic-embed-text-v1.5 as the text encoder (hosted on Fireworks). For the LLM, we will use accounts/fireworks/models/firefunction-v2, a fine-tuned variant of Meta's Llama 3.
Before starting, make sure your resources are created. Create a Pinecone index with a dimension of 768 to match the embedding model. Note down your Pinecone API key, index name, and namespace for the next step.
Note: (Closed Beta) LangGraph Cloud is a managed service for deploying and hosting LangGraph applications. It is currently (as of 26 June, 2024) in closed beta. If you are interested in applying for access, please fill out this form.
To deploy this example on LangGraph, fork the repo.
Next, navigate to the 🚀 deployments tab on LangSmith.
If you have not deployed to LangGraph Cloud before, there will be a button labeled Import from GitHub. Follow that flow to connect LangGraph Cloud to GitHub.
Once you have set up your GitHub connection, select + New Deployment. Fill out the required information, including the config file path (langgraph.json) and branch (main).
The default required environment variables can be found in .env.example and are copied below:
# .env
PINECONE_API_KEY=...
PINECONE_INDEX_NAME=...
PINECONE_NAMESPACE=...
FIREWORKS_API_KEY=...
# You can add other keys as appropriate, depending on
# the services you are using.
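Before pasting these into the deployment form, you can sanity-check locally that every required variable is set. The variable names come from .env.example above; the helper itself is just an illustrative sketch, not part of the repo:

```python
import os

REQUIRED_VARS = [
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "PINECONE_NAMESPACE",
    "FIREWORKS_API_KEY",
]

def missing_env_vars(env=os.environ) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```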
You can fill these out locally, copy the .env file contents, and paste them in the first Name field.
Assuming you've followed the steps above, in just a couple of minutes, you should have a working memory service deployed!
Now let's try it out.
The LangGraph Cloud deployment exposes a general-purpose stateful agent via an API. You can connect to it from a notebook, UI, or even a Slack or Discord bot.
In this repo, we've included an event_server to listen for Slack message events so you can talk with your bot from Slack.
The server is a simple FastAPI app that uses Slack Bolt to interact with Slack's API.
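For context on what such a server has to handle: when you register an events URL, Slack first sends a url_verification request and expects the challenge value echoed back, and subsequent messages arrive as event_callback payloads. The handler below is a framework-agnostic sketch of that documented handshake, not the repo's actual code:

```python
import json

def handle_slack_event(raw_body: str) -> dict:
    """Minimal dispatcher for Slack Events API payloads."""
    body = json.loads(raw_body)
    if body.get("type") == "url_verification":
        # Slack expects the challenge echoed back to confirm the endpoint.
        return {"challenge": body["challenge"]}
    if body.get("type") == "event_callback":
        event = body.get("event", {})
        if event.get("type") == "message":
            # In the real app, this is where you'd forward the message to
            # the deployed LangGraph agent and post its reply to Slack.
            return {"ok": True, "text": event.get("text", "")}
    return {"ok": True}

print(handle_slack_event(json.dumps(
    {"type": "url_verification", "challenge": "abc123"}
)))  # → {'challenge': 'abc123'}
```

Slack Bolt abstracts most of this away, but it helps to know what is happening underneath when debugging a failed endpoint registration.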
In the next step, we will show how to deploy this on GCP's Cloud Run.
Now that you've deployed the API, how do you turn it into an app? Check out the event server README for instructions on how to set up the Slack connector on Cloud Run.
Memory management can be challenging to get right. To make sure your schemas suit your application's needs, we recommend starting from an evaluation set and adding to it over time as you find and address common errors in your service.
We have provided a few example evaluation cases in the test file here. As you can see, the metrics themselves don't have to be terribly complicated, especially not at the outset.
We use LangSmith's @test decorator to sync all the evaluations to LangSmith so you can better optimize your system and identify the root cause of any issues that may arise.
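As a flavor of how simple such a metric can be, here is an illustrative recall-style check: given the memories your service extracted from a conversation, score what fraction of the expected facts were captured. The case format and scoring function are a sketch, not the repo's actual tests:

```python
def memory_recall(extracted: list, expected_facts: list) -> float:
    """Fraction of expected facts that appear (as substrings) in any extracted memory."""
    if not expected_facts:
        return 1.0
    hits = sum(
        any(fact.lower() in memory.lower() for memory in extracted)
        for fact in expected_facts
    )
    return hits / len(expected_facts)

# Example case: a conversation where the user shared two facts.
extracted = ["User's name is Ada.", "Ada prefers a formal tone."]
score = memory_recall(extracted, ["name is Ada", "formal tone"])
assert score == 1.0
```

A coarse substring match like this catches outright extraction failures; you can swap in an LLM-as-judge or embedding similarity later without changing the test harness.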