Open atroyn opened 1 week ago
Description of changes
This PR creates a migration path from previous versions of Chroma to the new version, which stores collection configuration. The migration is idempotent and non-destructive.
Because every collection must now have a configuration, loading a collection created by an older version raised an error; this showed up as cross-version persistence failures.
With this migration, those failures no longer occur. This is a first step toward user-facing migration tooling. For now there is only this one script, but as more are added they can be composed more intelligently.
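To illustrate what "idempotent and non-destructive" means here, the following is a minimal sketch and not the script in this PR. It assumes a hypothetical `collections` table with a `config_json` column and a made-up default configuration; Chroma's actual schema and defaults may differ.

```python
import json
import sqlite3
from typing import Tuple

# Hypothetical default; NOT Chroma's actual default configuration.
DEFAULT_CONFIG = {"example_setting": "default"}


def migrate_collections(conn: sqlite3.Connection) -> int:
    """Give every collection that lacks a configuration a default one.

    Rows that already have a configuration are never touched
    (non-destructive), so running this a second time updates
    nothing (idempotent). Returns the number of rows updated.
    """
    cur = conn.execute(
        "UPDATE collections SET config_json = ? "
        "WHERE config_json IS NULL OR config_json = ''",
        (json.dumps(DEFAULT_CONFIG),),
    )
    conn.commit()
    return cur.rowcount


def demo() -> Tuple[int, int, str]:
    """Run the migration twice against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table layout, for illustration only.
    conn.execute(
        "CREATE TABLE collections (name TEXT PRIMARY KEY, config_json TEXT)"
    )
    conn.executemany(
        "INSERT INTO collections VALUES (?, ?)",
        [("old", None), ("new", '{"custom": true}')],
    )
    first = migrate_collections(conn)   # migrates only the "old" row
    second = migrate_collections(conn)  # nothing left to migrate
    kept = conn.execute(
        "SELECT config_json FROM collections WHERE name = 'new'"
    ).fetchone()[0]
    return first, second, kept
```

Calling `demo()` runs the migration twice: the first pass updates only the collection with no configuration, the second pass changes nothing, and the pre-existing configuration is preserved as written.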
This PR adds a new command to the `chroma` CLI, `chroma migrate`, which migrates all collections in a specified `path` (and optional tenant and database), with ./chroma being the default.
Test plan
Manual Test:
- Before migrating, `list_collections()` should fail with a JSON parsing error (since configurations don't exist).
- After migrating, `list_collections()` should work as expected.
Automated:
- `test_cross_version_persist` passes locally and in CI.
All tests should pass by this point in the stack.
Documentation Changes
The migration and the migration tool are documented at https://docs.trychroma.com/deployment/migration
Additionally, when a collection tries and fails to load a CollectionConfiguration from JSON, the error points the user to the same migration documentation.
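A rough sketch of that error path, for reviewers who want to see its shape. The function and exception names below are hypothetical and are not the identifiers used in this PR; only the documentation URL comes from the description above.

```python
import json
from typing import Any, Dict, Optional

MIGRATION_DOCS_URL = "https://docs.trychroma.com/deployment/migration"


class InvalidConfigurationError(ValueError):
    """Hypothetical error raised when a stored configuration cannot be parsed."""


def load_configuration(raw: Optional[str]) -> Dict[str, Any]:
    """Parse a stored configuration, pointing to the migration docs on failure."""
    try:
        # json.loads raises TypeError for None and JSONDecodeError for bad JSON.
        return json.loads(raw)  # type: ignore[arg-type]
    except (TypeError, json.JSONDecodeError) as exc:
        raise InvalidConfigurationError(
            "Could not load the collection configuration from JSON. "
            "The collection may predate configuration storage; "
            f"see {MIGRATION_DOCS_URL} for how to migrate."
        ) from exc
```

Chaining with `from exc` keeps the original parsing error available in the traceback while the message the user sees leads with the migration link.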
TODO: