elyra-ai / elyra

Elyra extends JupyterLab with an AI centric approach.
https://elyra.readthedocs.io/en/stable/
Apache License 2.0
1.86k stars 344 forks

Move the pipeline migration code from the front end to a backend API #1923

Open ajbozarth opened 3 years ago

ajbozarth commented 3 years ago

Is your feature request related to a problem? Please describe.
Currently the CLI doesn't support migrating pipelines and instead shows an error message telling the user to use the UI to migrate.

Describe the solution you'd like
We should move the migration functionality from the pipeline-services package to a backend API call that the CLI could also leverage.

ajbozarth commented 3 years ago

I would like to tackle this myself once we decide which milestone we want to work on it in.

ajbozarth commented 2 years ago

Per the discussion on https://github.com/elyra-ai/elyra/discussions/2550#discussioncomment-2516686 and at the April 7th dev meeting we will be implementing this in 4.0 alongside a new policy detailed below:

On release of a major version of Elyra we will increment the pipeline version (if no actual migration is needed with the release, it will be a no-op). Then, starting at that major version release, migration will only be supported between the previous pipeline version and that new version. If a user wants to migrate from an older version, they will be prompted to check out the latest release of the previous major version and run migration there first.

For 4.0 in particular, we will also support migration from pipeline version 3 (the latest version used in Elyra 1.X and 2.X) in addition to the policy described above. This will help prevent users from needing to check out and run the UI in order to do migration, since starting with Elyra 4.0 there will be a CLI migration tool as well (https://github.com/elyra-ai/elyra/issues/2647). As described in https://github.com/elyra-ai/elyra/discussions/2550#discussioncomment-2516686, the v3 to latest migration is small and would not require much code on the backend.

This would also allow us to adopt using migration on the backend without the need to move all of our current migration code.

I am also open to edits in the wording of the above policy for clarity (since we will add it to the docs when it's implemented).
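
As a rough illustration of the policy above, here is a minimal Python sketch of the check a backend migration service might perform. The function names, the returned strings, and the `legacy` parameter are hypothetical and do not come from the Elyra codebase; the sketch assumes integer pipeline versions and that the "previous pipeline version" is simply the target version minus one.

```python
from typing import FrozenSet


def supported_sources(target: int, legacy: FrozenSet[int] = frozenset()) -> FrozenSet[int]:
    """Pipeline versions that may be migrated directly to `target`.

    Assumption: the previous pipeline version is `target - 1`; `legacy`
    holds any extra versions supported as a one-off (e.g. version 3 for 4.0).
    """
    return frozenset({target - 1}) | legacy


def migration_action(pipeline_version: int, target: int,
                     legacy: FrozenSet[int] = frozenset()) -> str:
    """Decide what to do with a pipeline file of the given version."""
    if pipeline_version > target:
        # The file was written by a newer release than the one running.
        raise ValueError("pipeline version is newer than this release supports")
    if pipeline_version == target:
        return "up-to-date"
    if pipeline_version in supported_sources(target, legacy):
        return "migrate"
    # Too old: the user must migrate with the previous major release first.
    return "use-previous-release-first"
```

With a target version of 8 and `legacy=frozenset({3})`, versions 7 and 3 would be migrated directly, while a version such as 5 would be sent back to the previous major release.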

kevin-bates commented 2 years ago

Thanks Alex, here's a possible edit to the policy (to which I have no affinity - so edit as necessary)...

On the release of each major version of Elyra, the pipeline version number will be unconditionally incremented (even when no migration is warranted). Then, starting at that major version release, a pipeline's migration will only be supported from the immediately-previous major version of Elyra (e.g., Elyra version 3.x to Elyra version 4.x). Users needing to migrate pipelines from a version of Elyra more than one major version prior (e.g., Elyra version 2.x to Elyra version 4.x) will be prompted to install the latest release of the immediately-previous major version (e.g., Elyra 3.x) and run migration there first.

ajbozarth commented 2 years ago

immediately-previous major version of Elyra

Not the immediately-previous version of Elyra, but of the pipeline version. For instance, Elyra 3.0 supports pipeline v4 but 3.4 supports pipeline v7. We would only support migration from v7, not v4, since v7 is the last pipeline version supported within the 3.X lifecycle.

Edit: given this difference, it may be worth introducing minor pipeline versions and incrementing the major version with each major release of Elyra, but that can be discussed when we actually go to implement this closer to the 4.0 sprint.

kevin-bates commented 2 years ago

I see - that's complicated because it's less obvious to a user whether or not a pipeline version has been incremented. So is this policy a one-time thing that only applies to the Elyra 3.x to Elyra 4.x transition? Or will it always hold? What about intra-release pipeline migrations? For example, consider the following Elyra to Pipeline version map:

Elyra 4.0::Pipeline v8
Elyra 4.1::Pipeline v9
Elyra 4.2::Pipeline v10
Elyra 4.3::Pipeline v10
Elyra 4.4::Pipeline v11
Elyra 4.5::Pipeline v11
Elyra 4.6::Pipeline v11
Elyra 4.7::Pipeline v11
Elyra 4.8::Pipeline v12
Elyra 4.9::Pipeline v12
Elyra 4.10::Pipeline v12
Elyra 5.0::Pipeline v13

Per the policy, users on Elyra < 4.8 (including those on 1.x, 2.x, and 3.x) that want to go to Elyra 5.0 must first pip install --upgrade "elyra<5" (i.e., first move to 4.10) to bring their pipelines up to v12, then pip install --upgrade "elyra>=5" to get their pipelines to v13 (even when no actual migration happens). Yet users that happen to go from Elyra 4.0 to Elyra 4.7 can take their pipelines from v8 to v11 via a single upgrade (as one would expect).

This seems like we're going to need to maintain some kind of Elyra to Pipeline version map in the code to determine when this policy must be enforced (i.e., when to require the immediately-previous pipeline version) vs. letting pipeline migrations happen purely based on the pipeline's version (independent of Elyra's version). That feels a little arbitrary to me.

I figured this was more of a policy that can be based on (and expressed by) Elyra versioning only - which is what the user sees (and understands).
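
To make the example above concrete, here is a small Python sketch of what such an Elyra to Pipeline version map and the resulting upgrade path could look like. The map values are the hypothetical ones from the example in this comment, and `upgrade_path` and `last_release_of_major` are illustrative names, not existing Elyra functions. The sketch only covers releases present in the map and jumps of at most one major version.

```python
from typing import Dict, List, Tuple

Release = Tuple[int, int]  # (major, minor) Elyra release

# Hypothetical map taken from the example above.
ELYRA_TO_PIPELINE: Dict[Release, int] = {
    (4, 0): 8, (4, 1): 9, (4, 2): 10, (4, 3): 10,
    (4, 4): 11, (4, 5): 11, (4, 6): 11, (4, 7): 11,
    (4, 8): 12, (4, 9): 12, (4, 10): 12,
    (5, 0): 13,
}


def last_release_of_major(major: int) -> Release:
    """Latest known Elyra release within the given major version."""
    return max(r for r in ELYRA_TO_PIPELINE if r[0] == major)


def upgrade_path(current: Release, target: Release) -> List[Release]:
    """Elyra releases to install, in order, so pipelines can be migrated."""
    if current[0] == target[0]:
        # Within one major version a single upgrade is enough.
        return [target]
    bridge = last_release_of_major(target[0] - 1)
    if ELYRA_TO_PIPELINE[current] == ELYRA_TO_PIPELINE[bridge]:
        # Pipelines are already at the last version of the previous major.
        return [target]
    # Otherwise migrate on the last release of the previous major first.
    return [bridge, target]
```

For instance, `upgrade_path((4, 3), (5, 0))` yields `[(4, 10), (5, 0)]`, while `upgrade_path((4, 0), (4, 7))` yields `[(4, 7)]`.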

ajbozarth commented 2 years ago

So is this policy a one-time thing that only applies to the Elyra 3.x to Elyra 4.x transition? Or will it always hold?

The intention is for this to apply to all future major version bumps, but we could support all versions released in the previous major cycle if we want to in the future. We just wanted to avoid moving all the current migration code to the backend this time around.

This seems like we're going to need to maintain some kind of Elyra to Pipeline version map in the code to determine when this policy must be enforced

This was what I was hinting towards with my edit about minor versions above. We could introduce minor versions of pipelines starting in Elyra 4.0 and only bump the major version in sync with Elyra. So in your example v9 == v8.1, v10 == v8.2, v11 == v8.3, v12 == v8.4, and v13 == v9.0.

The two ideas above (support of all previous cycle releases and using minor versions) could be adopted individually or together if we want to use them in addition to the previous policy decisions. But even if we adopt them, I believe we should stick to only supporting v3->v8 and v7->v8 migration for 4.0 as we transition to a backend migration service.
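
The minor-version renumbering suggested in this comment can be sketched in Python as follows. `MAJOR_RELEASE_VERSIONS`, `FIRST_MAJOR`, and `to_major_minor` are hypothetical names; the values 8 and 13 are the flat pipeline versions that ship with Elyra 4.0 and 5.0 in the example map earlier in the thread, and numbering the first major version as 8 is an assumption inferred from "v13 == v9.0".

```python
from bisect import bisect_right
from typing import Tuple

# Flat pipeline versions introduced by a major Elyra release
# (8 with Elyra 4.0 and 13 with Elyra 5.0 in the example map).
MAJOR_RELEASE_VERSIONS = [8, 13]
# Assumed major number given to the first entry above.
FIRST_MAJOR = 8


def to_major_minor(flat_version: int) -> Tuple[int, int]:
    """Convert a flat pipeline version (e.g. 10) to (major, minor), e.g. (8, 2)."""
    # Index of the latest major-release version that is <= flat_version.
    idx = bisect_right(MAJOR_RELEASE_VERSIONS, flat_version) - 1
    if idx < 0:
        raise ValueError("version predates the major.minor numbering scheme")
    major = FIRST_MAJOR + idx
    minor = flat_version - MAJOR_RELEASE_VERSIONS[idx]
    return major, minor
```

Under this scheme, the major component alone would tell a migration service whether a pipeline comes from the current or the previous major cycle, without needing a separate Elyra to Pipeline version map.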

kevin-bates commented 2 years ago

For the curious, could you please provide a link to the current pipeline migration code?

ajbozarth commented 2 years ago

For the curious, could you please provide a link to the current pipeline migration code?

https://github.com/elyra-ai/pipeline-editor/tree/master/packages/pipeline-services/src/migration