microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

`berkeley-schema-fy24`: Implement super migrator that runs all partial migrators in correct order #2064

Closed by eecavanna 2 weeks ago

eecavanna commented 2 weeks ago

The task here is to implement the following (copy/pasted from Slack):

I've had an idea floating around in my head for the past couple days, which I want to share.

It is to make a migrator that instantiates and runs the various Berkeley migrators in the correct order. In that sense, this migrator would be a "meta" migrator (although its API to the outside world would be the same as any other migrator—the caller would instantiate it with an adapter and then call its upgrade method).

That way, the order would be codified in the same repo where the migrators are implemented.

On the migration notebook side (as a reminder, those notebooks live in a different repo), the notebook would invoke that "meta" migrator instead of each of its constituent migrators.

Finally, I would move the Berkeley migrators into a subdirectory. That way, each migrator located directly in the nmdc_schema/migrators directory (not in a subdirectory) would once again migrate a database from one schema with a distinct version number to another schema with a distinct version number (as opposed to the directory holding a mixture of those and migrators that migrate a database from "one PR to another PR").
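To make the idea concrete, here is a minimal sketch of what such a "meta" migrator could look like. It is not the implementation from the PR. The names `DictAdapter`, `PartialMigratorBase`, `PartialMigratorA`, `PartialMigratorB`, and `MetaMigrator` are all hypothetical, and the adapter is a plain-dict stand-in rather than a real database adapter. The only things taken from the description above are that a migrator is instantiated with an adapter and exposes an upgrade method.

```python
from typing import List, Type


class DictAdapter:
    """Hypothetical stand-in for a database adapter; wraps a plain dict."""

    def __init__(self, database: dict) -> None:
        self.database = database


class PartialMigratorBase:
    """Hypothetical base class: instantiate with an adapter, then call upgrade()."""

    def __init__(self, adapter: DictAdapter) -> None:
        self.adapter = adapter

    def upgrade(self) -> None:
        raise NotImplementedError


class PartialMigratorA(PartialMigratorBase):
    """Hypothetical partial migrator; records that it ran."""

    def upgrade(self) -> None:
        self.adapter.database.setdefault("applied", []).append("A")


class PartialMigratorB(PartialMigratorBase):
    """Hypothetical partial migrator that is meant to run after A."""

    def upgrade(self) -> None:
        self.adapter.database.setdefault("applied", []).append("B")


class MetaMigrator(PartialMigratorBase):
    """Runs the partial migrators in a fixed order, behind the same API."""

    # The order is codified here, next to the migrators themselves.
    PARTIAL_MIGRATORS: List[Type[PartialMigratorBase]] = [
        PartialMigratorA,
        PartialMigratorB,
    ]

    def upgrade(self) -> None:
        for migrator_class in self.PARTIAL_MIGRATORS:
            # Each partial migrator shares the adapter passed to the meta migrator.
            migrator_class(self.adapter).upgrade()


# The caller (e.g. a migration notebook) only deals with the meta migrator.
database: dict = {}
MetaMigrator(DictAdapter(database)).upgrade()
```

With this shape, the notebook would not need to know how many partial migrators exist or in which order they run; changing the order would be a one-line change to the list.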

eecavanna commented 2 weeks ago

The PR is ready for review/merge.

https://github.com/microbiomedata/berkeley-schema-fy24/pull/205