kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Upgrading Kedro is hard #3959

Closed astrojuanlu closed 1 week ago

astrojuanlu commented 1 week ago

Description

Our users find it difficult to upgrade Kedro in their projects. Just from a couple of recent user interviews:

User 1:

User 2:

There are also very clear signs that old Kedro versions tend to live on for a very long time:

image

People like @inigohidalgo have been reporting their long journey to upgrade from old Kedro versions in our public Slack.

And finally, we also have lots of internal evidence that big projects get stuck on old Kedro versions.

Is this a problem?

One could argue: "if it works, don't touch it". So the fact that Kedro stays pinned to an old version is not necessarily a bad thing.

However, for resource-constrained teams maintaining many projects, each on a slightly different version of Kedro, this can become a mess to maintain.

For teams that want a uniform Kedro version across their projects, we should provide a cleaner upgrade path.
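To make the "many projects, slightly different versions" situation concrete, here is a minimal sketch of how a team might audit which Kedro versions its projects pin. It is not part of Kedro: the helper names (`find_kedro_pin`, `group_by_minor`) and the project names are hypothetical, the requirements contents are inlined as strings rather than read from disk, and the regular expression only handles simple specifiers such as `kedro==0.18.14`.

```python
import re
from collections import defaultdict

# Matches lines such as "kedro==0.18.14" or "kedro[pandas]~=0.19.0".
# Simplification: only the first version specifier on the line is captured,
# and other packages like "kedro-datasets" are deliberately not matched.
_KEDRO_PIN = re.compile(
    r"^\s*kedro(?:\[[^\]]*\])?\s*(==|~=|>=|<=|<|>)\s*([0-9][0-9A-Za-z.]*)",
    re.IGNORECASE,
)


def find_kedro_pin(requirements_text):
    """Return (operator, version) for the first Kedro pin found, else None."""
    for line in requirements_text.splitlines():
        match = _KEDRO_PIN.match(line)
        if match:
            return match.group(1), match.group(2)
    return None


def group_by_minor(pins):
    """Group project names by the 'major.minor' part of their Kedro version."""
    groups = defaultdict(list)
    for project, (_, version) in pins.items():
        minor = ".".join(version.split(".")[:2])
        groups[minor].append(project)
    return dict(groups)


# Hypothetical requirements files, inlined so the sketch is self-contained.
projects = {
    "project-a": "pandas>=2.0\nkedro==0.18.14\n",
    "project-b": "kedro[pandas]~=0.19.3\n",
    "project-c": "kedro-datasets>=1.0\nkedro==0.18.4\n",
}

pins = {name: find_kedro_pin(text) for name, text in projects.items()}
by_minor = group_by_minor({name: pin for name, pin in pins.items() if pin})
print(by_minor)
```

A report like this only shows how far the projects have drifted apart; it does not by itself make any individual upgrade easier.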

What has been done

For 0.19 we went ahead and added a detailed migration guide: https://docs.kedro.org/en/latest/resources/migration.html#migrate-an-existing-project-that-uses-kedro-0-18-to-use-0-19

Where to go from here

However, that does not yet seem to be enough.

What else can we do to make these migrations easier?

datajoely commented 1 week ago

Anecdotally, our most sophisticated users are the ones who get stuck the most, since they typically venture into complex hooks, dynamism, and coupling to some of the internals. With 1.0.0 on the horizon, hopefully these internals have stabilised and won't be as incompatible or painful going forward. This is the main reason I think some sort of automated tooling may not move the needle.

I'm still very much of the opinion that we need to build user-facing superpowers that make the effort to upgrade worth it. Introducing settings.py in 0.18.x was a breaking change that mattered far more to the Kedro developers than to Kedro users. Cynically, one could argue we could do a better job by making something like OmegaConf a 0.19.x-only feature, even if it technically could be made to work in a non-breaking way.

astrojuanlu commented 1 week ago

Oops closing this in favour of #3960 - will copy your comment over @datajoely 🙏🏼