benjamincburns opened 3 weeks ago
Thanks for flagging and for the detailed writeup - we can make a minor bump with breaking changes.
Easy enough. I'll see if I can't submit a PR later today my time.
Yeah, that was wildly optimistic on my part. I'm going to get the tests into a PR tonight with exclusions for the issues that I found while writing them (just raised the last of those), and then I'll circle back and submit some more PRs for those issues. 😅
For #541 I decided the best course of action for testing `CheckpointSaver` `list` implementations would be to write a combinatorial test that spams `list` with a bunch of different argument combinations. For `MongoDBSaver`, every test that involved specifying values under `options.filter` failed.
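For reference, a rough sketch of the shape of that combinatorial exercise (names like `exerciseList` and the particular `limit`/`filter` values are placeholders, and the seeding/assertions are elided):

```ts
import type {
  BaseCheckpointSaver,
  CheckpointTuple,
} from "@langchain/langgraph-checkpoint";

// Sketch only - the real test seeds the saver first and asserts each result
// set against an in-memory reference implementation (elided here).
async function exerciseList(saver: BaseCheckpointSaver) {
  const config = { configurable: { thread_id: "test-thread" } };
  const limits = [undefined, 1, 2, 10];
  const filters = [undefined, {}, { source: "input" }, { source: "loop", step: 1 }];

  for (const limit of limits) {
    for (const filter of filters) {
      const results: CheckpointTuple[] = [];
      for await (const tuple of saver.list(config, { limit, filter })) {
        results.push(tuple);
      }
      // compare `results` to the reference output for this { limit, filter } combination
    }
  }
}
```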
The `options.filter` argument appears to be meant to constrain the result based on values in the stored `CheckpointMetadata`, and the `MongoDBSaver` attempts to do this when it builds its `list` query.
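Conceptually, applying that filter server-side means folding it into the query as dotted paths under `metadata`, roughly like this (a paraphrase of the intent, not a quote of the saver's source; `buildListQuery` is a made-up name):

```ts
import type { Document } from "mongodb";

// Paraphrased sketch, not the actual MongoDBSaver source: each filter key is
// meant to become a dotted-path match against the stored metadata,
// e.g. { source: "input" } -> { "metadata.source": "input" }.
function buildListQuery(
  threadId: string,
  filter: Record<string, unknown> | undefined
): Document {
  const query: Document = { thread_id: threadId };
  for (const [key, value] of Object.entries(filter ?? {})) {
    query[`metadata.${key}`] = value;
  }
  return query;
}
```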
Unfortunately, because of how `checkpoint` and `metadata` are serialized in the `put` method (the output of `this.serde.dumpsTyped` is of type `[string, Uint8Array]`), `metadata` is stored as a BSON Binary field, so any time `options.filter` is passed in, `list` yields no results.
At first glance you might think this would be fixable without a breaking change by applying the `filter` in the client (in the `for (const doc of result)` loop). That would work, but only if you also applied the `limit` constraint client-side. Otherwise, any time a `limit` constraint is applied, you'll likely miss items that would've matched your filter.
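Spelled out, the workaround looks roughly like this (hypothetical helpers, not proposed code):

```ts
import type { Collection, Document } from "mongodb";

// Sketch of the tempting client-side workaround. Because `limit` is still
// applied in the Mongo query, the cursor only ever yields the first `limit`
// documents; matching checkpoints past that window are silently dropped
// once the filter is applied client-side.
async function* listWithClientSideFilter(
  collection: Collection<Document>,
  query: Document,
  filter: Record<string, unknown>,
  deserializeMetadata: (doc: Document) => Promise<Record<string, unknown>>,
  limit?: number
): AsyncGenerator<Document> {
  const cursor = collection.find(query).limit(limit ?? 0); // limit spent server-side
  for await (const doc of cursor) {
    const metadata = await deserializeMetadata(doc);
    const matches = Object.entries(filter).every(
      ([key, value]) => metadata[key] === value
    );
    if (matches) yield doc;
  }
}
```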
The same problem exists in the `SqliteSaver` as well; however, that one is solvable without a breaking change by joining the query with a CTE that deserializes the stored `metadata` object (I'll PR that change tomorrow). I'm not a MongoDB aficionado, but I don't think it has facilities for transforming documents in any way prior to applying filters.

The simple fix is to persist the serialized fields as objects nested into the document, rather than as byte arrays. Unfortunately, doing this without a migration will break existing collections of checkpoint history.
At present there's no mechanism baked into either the `langgraph-checkpoint` library or the `MongoDBSaver` to check for out-of-date document structures, or to define or kick off migrations like these, so the design of this will likely require a bit of care. It would be easy enough to write an automatic migration that runs on start, but that gets ugly in production environments where there are multiple concurrent processes. As a result, I'd recommend making the updated `MongoDBSaver` fail on initialization if the migration is required for an existing db. That way the migration can be applied out of band, and there's no risk that someone will blow away their checkpoint storage as a result of an `npm update`.
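One possible shape for that check (the `schema_version` marker and `assertCheckpointSchemaUpToDate` are made up for illustration, not existing APIs):

```ts
import type { Db } from "mongodb";

// Hypothetical sketch only: the "schema_version" marker and this setup check
// are not existing APIs, just one way the fail-on-init behaviour could look.
async function assertCheckpointSchemaUpToDate(db: Db, collectionName = "checkpoints") {
  const meta = await db
    .collection(`${collectionName}_meta`)
    .findOne({ key: "schema_version" });
  const version: number = meta?.value ?? 1; // legacy collections have no marker
  if (version < 2) {
    throw new Error(
      `Checkpoint collection "${collectionName}" is at schema v${version}; ` +
        `run the out-of-band migration before constructing MongoDBSaver.`
    );
  }
}
```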
For now I've pruned the `filter` checks in my `list` tests in order to prevent this from blocking progress on #541.