MI-DPLA / combine

Combine /kämˌbīn/ - Metadata Aggregator Platform
MIT License

introduce semantic version specific updates to run with update management command #357

Closed ghukill closed 5 years ago

ghukill commented 5 years ago

COMBINE_VERSION is added to combine.settings in v0.4, which introduces the possibility of comparing versions and acting on version changes when the update management command is invoked.

For example, in the changes from v0.3.3 --> v0.4, the job_details['transformation'] structure for Transform Jobs changes to support multiple transformations for a single Job. This is a great improvement, but it comes at the cost of changing the loosely defined data model of job_details, which until now only ever accounted for a single possible Transformation Scenario. When different versions are encountered, this affects front-end display, exporting/importing, and most importantly, re-running in Spark.

But with versions, particularly if these version-specific commands are run in a particular order, it can be deduced that a Combine instance without COMBINE_VERSION, or with one lower than v0.4, needs its job_details updated (the single Transform merely becomes the first item in a list of length one).
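A minimal sketch of that job_details update, for illustration only. It assumes the old value of job_details['transformation'] is a single dict and that the new shape is simply a list of such dicts; the real Combine structure may differ, and the function name and the 'id' key in the usage note are hypothetical.

```python
import copy


def upgrade_transformation_details(job_details):
    """Wrap a pre-v0.4 single transformation in a list of length one.

    Hypothetical helper: the dict-to-list shape is an assumption,
    not Combine's confirmed data model.
    """
    updated = copy.deepcopy(job_details)  # leave the caller's dict untouched
    transformation = updated.get("transformation")
    # Only wrap when a single (non-list) transformation is present,
    # so running the update twice is harmless.
    if transformation is not None and not isinstance(transformation, list):
        updated["transformation"] = [transformation]
    return updated
```

Called on {'transformation': {'id': 1}}, this returns {'transformation': [{'id': 1}]}; a second call leaves the result unchanged, which matters if the update command is run more than once.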

The packaging library provides means to compare semantic version strings, which is perfect here, e.g.:

In [1]: from packaging import version

In [2]: version.parse(settings.COMBINE_VERSION)
Out[2]: <Version('0.3.3')>

In [3]: version.parse(settings.COMBINE_VERSION) > version.parse('v0.5')
Out[3]: False

In [4]: version.parse(settings.COMBINE_VERSION) < version.parse('v0.5')
Out[4]: True
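The same comparison can be wrapped so that an instance with no COMBINE_VERSION at all is also treated as needing the update. This is a sketch under assumptions: needs_update is a hypothetical name, and in a real command the installed value would presumably come from something like getattr(settings, 'COMBINE_VERSION', None).

```python
from packaging import version


def needs_update(installed, target):
    """Return True if the installed version is missing or older than target.

    Hypothetical helper: `installed` is None when COMBINE_VERSION is not set.
    """
    if installed is None:
        return True
    # version.parse accepts an optional leading "v", as in "v0.4".
    return version.parse(installed) < version.parse(target)
```

Parsing matters here because a plain string comparison would order 'v0.10' before 'v0.4', while version.parse compares the numeric components.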

This is undoubtedly an established pattern, and it feels quite a bit like ordered db migrations, so the remaining work is to decide where these version-specific comparison snippets are stored. They are permanent, so perhaps in the management folder?

ghukill commented 5 years ago

Scaffolding will be present in v0.4, under a new VersionUpdateHelper class in the update.py management command.

The preliminary example updates all < v0.4 Transform Jobs to the multiple-transformations job_details structure.
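One way such a helper could be organized is sketched below. This is not the actual VersionUpdateHelper from update.py; the class name, method names, and registration approach are all assumptions, meant only to show ordered, version-gated update steps in the spirit of db migrations.

```python
from packaging import version


class VersionUpdateSketch:
    """Hypothetical stand-in; not Combine's actual VersionUpdateHelper."""

    def __init__(self, installed_version=None):
        # A missing COMBINE_VERSION is treated as older than every step.
        self.installed = version.parse(installed_version or "0")
        self.steps = []  # (Version, callable) pairs

    def register(self, introduced_in, func):
        """Record an update step that applies to instances older than introduced_in."""
        self.steps.append((version.parse(introduced_in), func))

    def run(self):
        """Run pending steps in ascending version order; return the versions applied."""
        applied = []
        for step_version, func in sorted(self.steps, key=lambda step: step[0]):
            if self.installed < step_version:
                func()
                applied.append(str(step_version))
        return applied
```

With an installed version of 'v0.3.3' and steps registered for 'v0.5' and 'v0.4' (in that order), run() applies the v0.4 step first and then the v0.5 step, regardless of registration order.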

Closing.