vega / altair

Declarative statistical visualization library for Python
https://altair-viz.github.io/
BSD 3-Clause "New" or "Revised" License
9.26k stars 792 forks source link

Deprecation/breaking changes policy & process #3454

Closed dangotbanned closed 2 months ago

dangotbanned commented 3 months ago

Note

This issue is focused on how future changes are made and how they are communicated both on release and at runtime. Any specific reference is used only as an example.

Motivation

I've been thinking about this since #3426, which dealt with removing a 4-year old deprecation, but there are some other cases where it could have been helpful:

I believe this would benefit both users and contributors:

Open Questions

There are likely many topics that could be discussed, but to get the ball rolling:

PRs

Examples from other projects

polars

pandas

binste commented 3 months ago

Thanks for opening the issue! This has definitely been an issue a few times already. The pandas policy you linked to sounds well thought out to me. We could just take that one? The polars policy as well but I'd prefer if we lean towards something shorter, especially to start out with.

So far, major versions in Altair coincided with major versions in Vega-Lite. I think it would also be ok to break with this. Sure, whenever there is a major version release in VL, we probably need one in Altair as well as it might include breaking changes, but we could also think about doing a version 6 at one point with VL still being at 5 and the breaking changes would then only be about API stuff on the Altair side.

Regarding cleaning up the public API, #2918 is related to this. I'm not sure how much of the API we can change, even in a major release, without making the costs for users too high. But i think it's worth exploring if we can move some lesser-used features of Altair to submodules and not import them up all the way to the top-level package namespace. It would help with docs, autocompletion, and just the general user experience.

dangotbanned commented 3 months ago

Regarding cleaning up the public API, https://github.com/vega/altair/issues/2918 is related to this.

Thanks for linking that issue @binste, I hadn't seen it before

I'm not sure how much of the API we can change, even in a major release, without making the costs for users too high.

I think the polars Upgrade guides are a good model to follow here. It also mentions @MarcoGorelli's polars-upgrade.

So far, major versions in Altair coincided with major versions in Vega-Lite. I think it would also be ok to break with this. Sure, whenever there is a major version release in VL, we probably need one in Altair as well as it might include breaking changes, but we could also think about doing a version 6 at one point with VL still being at 5 and the breaking changes would then only be about API stuff on the Altair side.

I agree with not tying major versions to upstream. Looking at the past releases the highest has been v2.4.1, so another option for this could also be bumping the minor release differently like:

Besides maintaining the status quo of major versions aligning with vega-lite, this could open a path to handling transitional deprecations more swiftly - like in alt.param.

https://github.com/vega/altair/blob/7f05ecbf999766ae1398dcbeb6144c128413026c/altair/vegalite/v5/api.py#L448-L482

The above is proposed only as a solution for maintaining major versions with vega-lite @dangotbanned

I agree with not tying major versions to upstream.

Other related

MarcoGorelli commented 3 months ago

Thanks for the ping!

I think the stance is pretty much the same but using fewer words

The main pandas vs Polars difference that I'm aware of is about when deprecations are enforced. Suppose a deprecation is introduced in version 10.1. Then:

Having said that, the amount of time between major releases is expected to be lower for Polars

dangotbanned commented 3 months ago

Thanks for the ping!

I think the stance is pretty much the same but using fewer words

The main pandas vs Polars difference that I'm aware of is about when deprecations are enforced. Suppose a deprecation is introduced in version 10.1. Then:

* Polars would remove it in version 12

* pandas in version 11

Thanks for the clarification @MarcoGorelli, I've updated the description.

From the version tagging I've done in #3455, the only non v5.0.0 deprecation I found was api.TopLevelMixin.serve - at v4.1.0. So that point would be scoped to that deprecation alone.

https://github.com/dangotbanned/altair/blob/fe1917e3d4986dfa431f428146819622b83b6546/altair/vegalite/v5/api.py#L2659

dangotbanned commented 2 months ago

Regarding cleaning up the public API, https://github.com/vega/altair/issues/2918 is related to this. I'm not sure how much of the API we can change, even in a major release, without making the costs for users too high.

@binste this is a recent example from polars in https://github.com/pola-rs/polars/pull/17282/files, that could be a possible path forward:

https://github.com/pola-rs/polars/blob/c8e4822ac86a9af0dd79a79c7e98def4458f18e6/py-polars/polars/type_aliases.py

from typing import Any

import polars._typing as plt
from polars._utils.deprecation import issue_deprecation_warning

def __getattr__(name: str) -> Any:
    if name in dir(plt):
        issue_deprecation_warning(
            "The `polars.type_aliases` module is deprecated."
            " The type aliases have moved to the `polars._typing` module to explicitly mark them as private."
            " Please define your own type aliases, or temporarily import from the `polars._typing` module."
            " A public `polars.typing` module will be added in the future.",
            version="1.0.0",
        )
        return getattr(plt, name)

    msg = f"module {__name__!r} has no attribute {name!r}"
    raise AttributeError(msg)

AFAIK, when combined with explicit imports on the top level - this would also solve https://github.com/vega/altair/pull/2927

Existing code would continue to work (w/ a deprecation warning) but autocomplete wouldn't display those that are only accessible via the __getattr__.

binste commented 2 months ago

So far, major versions in Altair coincided with major versions in Vega-Lite. I think it would also be ok to break with this. Sure, whenever there is a major version release in VL, we probably need one in Altair as well as it might include breaking changes, but we could also think about doing a version 6 at one point with VL still being at 5 and the breaking changes would then only be about API stuff on the Altair side.

I agree with not tying major versions to upstream. Looking at the past releases the highest has been v2.4.1, so another option for this could also be bumping the minor release differently like:

  • v4.2.2 - Patch
  • v5.0.0 - Upstream/schema change (vega/vega-lite)
  • v5.1.0 - Feature
  • ...
  • v5.3.0 - Feature
  • v5.10.0 - altair API breaking (but still within v5 schema)
  • v5.10.1 - Patch
  • v5.11.0 - Feature
  • v6.0.0 - Upstream/schema change (vega/vega-lite)

Besides maintaining the status quo of major versions aligning with vega-lite, this could open a path to handling transitional deprecations more swiftly - like in alt.param.

Interesting proposal! The more I think about this, the more I see the value in sticking with semantic versioning as it's convenient to use in version restrictions and it's what users know from many other packages. I think this also means that we should not tie VL and Altair major versions together to be more flexible with evolving the API. Most users are probably not aware what Vega-Lite is and don't have to be and so there is no value in it for them to have major versions in sync.

Let's see if anyone else wants to weigh in here. If you want to draft a short policy for the docs section, that's of course very much appreciated and we could also continue the conversation in that PR.

joelostblom commented 2 months ago

While I think there is some convenience with having the Altair and Vega-Lite versions match up, I agree that the advantages of sticking to semantic versioning and releasing breaking changes in Altair when they are ready are probably bigger benefits, so I'm also OK with breaking the VL version mapping.

i think it's worth exploring if we can move some lesser-used features of Altair to submodules and not import them up all the way to the top-level package namespace. It would help with docs, autocompletion, and just the general user experience.

Big +1 from me!

MarcoGorelli commented 2 months ago

it's worth exploring if we can move some lesser-used features of Altair to submodules and not import them up all the way to the top-level package namespace

Agree - for reference, Polars saw quite noticeable import time improvements from not loading less-used submodules upfront

Side note on import times, which I hadn't noticed until this issue spurred me to check: since #3452, python -c 'import altair' has gone from 0.606s to 0.344s (from running time python -c 'import altair' on the command line and taking the lowest value for "real" from 7 runs) 🎉

dangotbanned commented 2 months ago

Thanks for all the input! @binste @joelostblom @MarcoGorelli Great to hear everyone so far seems in agreement on the possibility of altair==v6.0.0 - to address breaking changes - without requiring vega-lite==v6.0.0.

I'm planning to move I moved this out of draft today.

While it wasn't opened to directly address this issue, it is quite related, and I think a good initial boundary test of what everyone considers breaking. By that I mean I'm happy to revert things for perfect compatibility, but any feedback during review will likely help shape the policy itself.

As mentioned in https://github.com/vega/altair/discussions/3478#discussioncomment-10059436 I'm aiming to get a policy PR up soon I have an early draft policy PR https://github.com/vega/altair/pull/3488 up, with the goal of having clear guidance distributed alongside the v5.4.0 release notes