pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Revisiting Xarray's Minimum dependency versions policy #7765

Open jhamman opened 1 year ago

jhamman commented 1 year ago

What is your issue?

We have recently had a few reports expressing frustration with our minimum dependency version policy. This issue aims to discuss if changes to our policy are needed.

Background

  1. Our current minimum dependency versions policy reads:

    Minimum dependency versions

    Xarray adopts a rolling policy regarding the minimum supported version of its dependencies:

    • Python: 24 months (NEP-29)
    • numpy: 18 months (NEP-29)
    • all other libraries: 12 months

    This means the latest minor (X.Y) version from N months prior. Patch versions (x.y.Z) are not pinned, and only the latest available at the moment of publishing the xarray release is guaranteed to work.

    You can see the actual minimum tested versions:

    pydata/xarray

  2. We have a script that checks versions and dates and advises us on when to bump minimum versions.

    https://github.com/pydata/xarray/blob/main/ci/min_deps_check.py

Diagnosis

  1. Our policy and min_deps_check.py script have greatly reduced our deliberations on which versions to support and the maintenance burden of supporting outdated versions of dependencies.
  2. We likely need to update our policy and min_deps_check.py script to properly account for Python's SEMVER bugfix releases. Depending on how you interpret the policy, we may have prematurely dropped Python 3.8 (see below for a potential action item).

Discussion questions

  1. Is the policy working as designed? Are the support windows documented above still appropriate for where Xarray is today?
  2. Is this policy still in line with how our peer libraries are operating?

Action items

  1. There is likely a bug in the patch-version comparison in the minimum Python version. Moreover, we don't differentiate between bugfix and security releases. I suggest we have a special policy for our minimum supported Python version that reads something like:

    Python: 24 months from the last bugfix release (security releases are not considered).


xref: https://github.com/pydata/xarray/issues/4179, https://github.com/pydata/xarray/pull/7461

Moderator's note: I suspect a number of folks will want to comment on this issue with "Please support Python 3.8 for longer...". If that is the nature of your comment, please just give this a ❤️ reaction rather than filling up the discussion.

keewis commented 1 year ago

    Is the policy working as designed

yes, it is: we guarantee support for at least 24 months, and only drop support once there's another version of python that was released more than 24 months ago. For example, python 3.8 was initially released on Oct 14, 2019 and python 3.9 was released on Oct 5, 2020. According to our policy we were able to drop python 3.8 for releases after Oct 5, 2022, since by then python 3.9 had been out for 24 months.
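The rule described above can be sketched in a few lines. This is only an illustration of the stated policy, not the logic of min_deps_check.py: the helper names `add_months` and `minimum_supported` are hypothetical, the release dates are the ones quoted in this comment, and treating the 24-month anniversary itself as the first day on which the minimum moves is an assumption.

```python
from datetime import date
import calendar


def add_months(d, months):
    """Return the date `months` calendar months after `d`, clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def minimum_supported(releases, today, window_months):
    """Newest version that was released at least `window_months` before `today`.

    Falls back to the oldest known version if none is old enough yet.
    """
    ordered = sorted(releases.items(), key=lambda item: item[1])
    minimum = ordered[0][0]
    for version, released in ordered:
        if add_months(released, window_months) <= today:
            minimum = version
    return minimum


# Release dates quoted in this thread.
PYTHON_RELEASES = {
    "3.8": date(2019, 10, 14),
    "3.9": date(2020, 10, 5),
}

# One day before 3.9 turns 24 months old, 3.8 is still the minimum.
print(minimum_supported(PYTHON_RELEASES, date(2022, 10, 4), 24))
# On the anniversary itself the minimum moves to 3.9 (boundary is an assumption).
print(minimum_supported(PYTHON_RELEASES, date(2022, 10, 5), 24))
```

Note that the window is measured from the release of the next version, not from the release of the version being dropped, which is why 3.8 stayed supported for roughly three years in total.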

This works very well for infrequent releases, since it guarantees that we don't accidentally require a very new version immediately after its release. However, these admittedly somewhat complicated rules make the policy harder to interpret than a simple "X months from this release" rule would be for projects with frequent releases. Maybe we should add an (automatically created) support table for the core dependencies to the installation guide to make reasoning about the policy easier?

    Python: 24 months from the last bugfix release (security releases are not considered).

That would make the support window less predictable, since the python devs might consider an additional bugfix release depending on the situation (there's a reason why the release peps say, emphasis mine: "X will receive bugfix updates approximately every 2 months for approximately 18 months"). Instead, maybe we should extend the support for python versions by about 6 months, to a total of 30 months? That would effectively align us with NEP-29, which is our upper limit anyways since that's what our dependencies follow (even if their releases don't usually happen at exactly that date).

And before anyone claims we're dropping support for a python version just because our policy tells us to: I'm excited about a number of changes to python, like the dict union and removeprefix / removesuffix in 3.9, the union types in 3.10, and the exception groups in 3.11, so really there is a compelling reason to upgrade as soon as the policy allows for each release.

jhamman commented 1 year ago

@keewis - thanks for the clarifications on the version policy related to Python 3.8. Very helpful.

    Instead, maybe we should extend the support for python versions by about 6 months, to a total of 30 months? That would effectively align us with NEP-29, which is our upper limit anyways since that's what our dependencies follow (even if their releases don't usually happen at exactly that date).

This is an interesting proposal. Worth considering.

MuellerSeb commented 1 year ago

What about dependencies like pandas? Maybe it could be a good idea to synchronize policies to avoid conflicts there.

As written before, xarray is currently broken when installed with pip install xarray on Python 3.8 (for example, the default version on Ubuntu 20.04, which still has 2 years of support), since Pandas 2 will be installed as a dependency.

I now need to add this to the dependency list of my packages depending on xarray:

pandas<2; python_version=='3.8'

Maybe this should be documented somewhere. But it is still inconvenient.
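As an illustration of where such a marker goes, here is a hypothetical dependency list for a downstream package. The variable name `install_requires` follows the setuptools convention (the same strings could equally go under `dependencies` in a pyproject.toml); the package contents are an assumption, and only the marker string is taken from the comment above.

```python
# Hypothetical dependency list for a package that depends on xarray.
# The second entry is the environment marker quoted above: it only
# constrains pandas when the interpreter is Python 3.8, and is ignored
# on every other Python version.
install_requires = [
    "xarray",
    "pandas<2; python_version=='3.8'",
]
```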

MuellerSeb commented 1 year ago

Pandas seems to also look at the policies of its dependencies like Numpy: https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table

See: https://github.com/pandas-dev/pandas/issues/52513

jhamman commented 1 year ago

    Instead, maybe we should extend the support for python versions by about 6 months, to a total of 30 months? That would effectively align us with NEP-29, which is our upper limit anyways since that's what our dependencies follow (even if their releases don't usually happen at exactly that date).

This seems like a good action item to come from this. And seems to align with the thrust of #7777.

keewis commented 1 year ago

In the meeting today we decided to both change the min-versions policy for python to 30 months, and to add a release table (something like the one in NEP-29, but automatically created with each newly released version – possibly through a polling action that runs once a week or something so we don't increase the docs build time any more). The reason for the latter is that the policy, while a good policy, is not easy to understand, and its full meaning can only be inferred by reading the min_deps_check.py script.
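A support table of the kind described here could be generated along the following lines. This is a rough sketch under stated assumptions, not the implementation that was adopted: the names `add_months` and `support_table` are hypothetical, the release dates are hard-coded from this thread rather than polled from anywhere, and a version is treated as droppable once its successor is 30 months old.

```python
from datetime import date
import calendar


def add_months(d, months):
    """Return the date `months` calendar months after `d`, clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def support_table(releases, window_months):
    """Return rows of (version, released, droppable_from).

    A version becomes droppable once its successor is `window_months` old.
    The newest known version has no successor, so its last field is None.
    """
    ordered = sorted(releases.items(), key=lambda item: item[1])
    rows = []
    for (version, released), successor in zip(ordered, ordered[1:] + [None]):
        droppable = add_months(successor[1], window_months) if successor else None
        rows.append((version, released, droppable))
    return rows


# Python release dates that appear in this thread.
RELEASES = {
    "3.8": date(2019, 10, 14),
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
}

print(f"{'version':<8} {'released':<12} droppable from")
for version, released, droppable in support_table(RELEASES, 30):
    print(f"{version:<8} {released!s:<12} {droppable or 'n/a'}")
```

Changing the second argument of `support_table` from 30 to 24 reproduces the dates of the previous policy, which makes the effect of the extension easy to compare.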

jhamman commented 1 year ago

@keewis - you are probably the best person for this task. Can you take on updating our min_deps_check.py script?

keewis commented 1 year ago

reopening so I can keep track of the second task: creating and automatically updating the support table

keewis commented 1 year ago

SPEC0 has a nice script that creates a visualization of the supported versions using the versions available on PyPI; we can probably reuse some of their code.

scottyhq commented 6 months ago

I was doing some minimum version housekeeping in a few repositories and noticed that it's time to drop 3.9 according to the current script.

Package           Required             Policy               Status
----------------- -------------------- -------------------- ------
python            3.9     (2020-10-05) 3.10    (2021-10-04) <
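One way to read the row above: the required version (3.9) is older than the version the policy allows as a minimum (3.10), hence the `<`. The comparison has to be made on numeric components, because as plain strings "3.9" sorts after "3.10". The sketch below illustrates this; the function names are hypothetical, and the `=` and `>` symbols are assumptions, since only `<` appears in the output shown here.

```python
def parse_minor(version):
    """Turn an 'X.Y' string into an (X, Y) tuple of integers."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def status(required, policy):
    """Compare the required minimum version against the policy minimum."""
    req, pol = parse_minor(required), parse_minor(policy)
    if req < pol:
        return "<"  # required is older than the policy minimum: can be bumped
    if req > pol:
        return ">"  # required is newer than the policy minimum
    return "="


# Comparing strings gives the wrong order for 3.9 vs 3.10.
print("3.9" < "3.10")
# Comparing parsed tuples gives the status shown in the table row.
print(status("3.9", "3.10"))
```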

@keewis I see Xarray endorses SPEC0, which states:

    Support for Python versions be dropped 3 years after their initial release.
    Support for core package dependencies be dropped 2 years after their initial release.

As it happens I'll be participating in this upcoming scientific-python meeting in a couple of weeks and could work on porting that version checking code (and making any suggested modifications that would suit people's needs). https://scientific-python.org/summits/developer/2024/ . Thoughts?

dcherian commented 6 months ago

We're holding on dropping 3.9 at the moment so that we can make a release that's both 3.9 and numpy 2.0 compatible to prevent some user frustration.

Once that's done, we can drop 3.9.

    I'll be participating in this upcoming scientific-python meeting in a couple of weeks and could work on porting that version checking code

Sounds great!