microbiomedata / nmdc-runtime

Runtime system for NMDC data management and orchestration
https://microbiomedata.github.io/nmdc-runtime/

Migration notebooks require documents to have `id` field (note: `functional_annotation_agg` documents lack one) #431

Closed by eecavanna 5 months ago

eecavanna commented 6 months ago

The migration notebooks were written under the assumption that each document has an `id` field.

https://github.com/microbiomedata/nmdc-runtime/blame/8e06922d55db261c28a855ea5624011370804fe0/demo/metadata_migration/notebooks/migrate_9_1_0_to_9_2_0.ipynb#L472-L474

A teammate recently pointed out to me that the documents in the `functional_annotation_agg` collection do not have such a field.

Whether that is a "flaw" of that collection or a "flaw" of the migration notebook design, the migration notebooks would not work with that collection's documents as they exist today, should a migration ever involve that collection.
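To illustrate the failure mode, here is a minimal sketch in plain Python. It is not the notebook's actual code: `id_filter` and `partition_by_id` are hypothetical helpers, and the sample documents (including their field names and values) are made up for illustration. The sketch assumes a migration step that locates each stored document by its `id`, which is the assumption described above.

```python
from typing import Any, Dict, Iterable, List, Tuple

Document = Dict[str, Any]


def id_filter(doc: Document) -> Dict[str, Any]:
    """Build a filter that matches a stored document by its `id`.

    Hypothetical helper: raises KeyError when the document has no `id`,
    which is the situation this issue describes.
    """
    return {"id": doc["id"]}


def partition_by_id(docs: Iterable[Document]) -> Tuple[List[Document], List[Document]]:
    """Split documents into those with a non-empty string `id` and the rest."""
    with_id: List[Document] = []
    without_id: List[Document] = []
    for doc in docs:
        doc_id = doc.get("id")
        if isinstance(doc_id, str) and doc_id:
            with_id.append(doc)
        else:
            without_id.append(doc)
    return with_id, without_id


# Illustrative documents only; the second one has no `id` field.
sample_docs: List[Document] = [
    {"id": "nmdc:example-1", "name": "document with an id"},
    {"metagenome_annotation_id": "nmdc:example-2", "gene_function_id": "K00001", "count": 3},
]

# Guard the id-based step so that id-less documents are reported, not crashed on.
migratable, skipped = partition_by_id(sample_docs)
filters = [id_filter(doc) for doc in migratable]
```

Calling `id_filter` directly on the second sample document raises `KeyError`, whereas partitioning first makes the gap explicit and leaves the decision about the `skipped` documents to the person running the migration.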

eecavanna commented 5 months ago

This won't be an issue with the notebooks that will be used to migrate to nmdc-schema v9.4.0 and later.