python / cpython

The Python programming language
https://www.python.org

Discourage and deprecate `typing.AnyStr` #105578

Open AlexWaygood opened 1 year ago

AlexWaygood commented 1 year ago

Feature or enhancement

We should discourage and deprecate typing.AnyStr.

Pitch

typing.AnyStr is bad for many reasons:

  1. The name implies that it has something to do with the type Any. It has nothing to do with the type Any.
  2. The name implies that it means "any string". It does not mean "any string".
  3. AnyStr is a TypeVar, but the name does not follow the common naming convention for TypeVars (using a "T" suffix). Many users appear to think that it is equivalent to str | bytes, which is incorrect.
  4. AnyStr is the only type variable that is publicly exported from the typing module. Unusually, it is a constrained type variable. Constrained type variables are usually not what users want for modern APIs. Bound type variables, in general, have more intuitive semantics than constrained type variables.
  5. One of the motivations for PEP-695 (accepted by the Steering Council, and now implemented) was the fact that reusable type variables can be confusing in terms of their scope. In general, I believe the consensus of the typing community is that using PEP-695 syntax for creating type variables clarifies the scope of type variables and makes them more intuitive for users. As such, we should discourage using reusable TypeVars such as AnyStr.
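
To illustrate point 3, here is a minimal sketch of how a constrained type variable differs from a union. The names `S`, `concat`, and `concat_union` are hypothetical and exist only for this example; `S` is a locally defined stand-in with the same constraints as `typing.AnyStr`.

```python
from typing import TypeVar, Union

# Hypothetical local stand-in for typing.AnyStr (same constraints).
S = TypeVar("S", str, bytes)


def concat(a: S, b: S) -> S:
    # A type checker solves S to exactly one of str or bytes per call,
    # so both arguments and the return value share that one type.
    return a + b


def concat_union(a: Union[str, bytes], b: Union[str, bytes]) -> Union[str, bytes]:
    # Union[str, bytes] is the spelled-out form of str | bytes.
    # Each parameter may independently be str or bytes, so nothing ties
    # the two argument types together; a type checker would be expected
    # to complain about the addition below.
    return a + b


print(concat("py", "thon"))    # both arguments are str
print(concat(b"py", b"thon"))  # both arguments are bytes
```

With the type variable, a call such as `concat("py", b"thon")` should be rejected by a type checker before it ever runs. With the union signature, the same mixed call looks acceptable from the caller's side and only fails at runtime with a `TypeError`.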

For all of these reasons, AnyStr is very commonly misused, especially by typing beginners. We get many PRs at typeshed that misuse AnyStr, and it can often be hard to catch these misuses in CI (careful manual review is required).

Therefore, we should discourage and deprecate typing.AnyStr. Unfortunately, it is very widely used, so the deprecation period will have to be a long one.

I propose the following plan:

  1. Clarify the docs for typing.AnyStr. Explain more clearly the differences between AnyStr and a union; give examples of uses of AnyStr that would be invalid. This docs clarification can be backported to 3.12 and 3.11.
  2. In Python 3.13, state in the docs that using AnyStr is deprecated and that users are encouraged to use PEP-695 syntax wherever possible.
  3. In Python 3.16, remove AnyStr from typing.__all__, and start emitting a DeprecationWarning if a user does from typing import AnyStr or accesses typing.AnyStr.

    Removing it from __all__ will be a breaking change, but it's the only way to emit a DeprecationWarning for typing.AnyStr before removing it unless we're okay with emitting a DeprecationWarning any time a user does from typing import * (and I'm not).

  4. In Python 3.18, remove AnyStr from the typing module.
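
As a rough sketch of the kind of "invalid use" example that step 1 has in mind, consider a type variable that appears only once in a signature. The functions `greet_bad` and `greet` and the type variable `S` are hypothetical names made up for illustration, not text taken from the docs.

```python
from typing import TypeVar, Union

# Hypothetical local stand-in for typing.AnyStr (same constraints).
S = TypeVar("S", str, bytes)


def greet_bad(formal: bool) -> S:
    # Questionable: S appears only once in the signature, so a type
    # checker has no argument to solve it against.
    return "Good day." if formal else b"hi"


def greet(formal: bool) -> Union[str, bytes]:
    # Clearer: the function really returns "either str or bytes", which
    # is what a union (str | bytes) expresses.
    return "Good day." if formal else b"hi"
```

Both functions behave identically at runtime; the difference is only in what the annotations tell a type checker and a human reader.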

Thoughts?

hauntsaninja commented 1 year ago

I think maybe still too aggressive... if we did deprecation warnings in the first version of Python released after 3.11 end of life, users could respond to the warning by using the recommended alternative PEP 695 syntax.

edit: a slower timeline was edited into the original post, so this comment no longer applies

AlexWaygood commented 1 year ago

> I think maybe still too aggressive... if we did deprecation warnings in the first version of Python released after 3.11 end of life, users could respond to the warning by using the recommended alternative PEP 695 syntax.

Yes, that makes sense. Python 3.11 will be end-of-life in October 2027, and Python 3.16 will be released in October 2027. So, introduce the deprecation warnings in Python 3.16? Or do you think Python 3.17?

JelleZijlstra commented 1 year ago

We can perhaps afford to be more aggressive here because users who want to avoid the deprecation warning have a simple workaround that works on all versions: they can write AnyStr = TypeVar("AnyStr", str, bytes) themselves.

hauntsaninja commented 1 year ago

Probably 3.16, unless enough people actually start testing alphas and betas by then that we want to reduce friction ;-)

Yeah, I guess so. But I think I'd prefer users make one change instead of two, and if users just inline AnyStr they still have a misleadingly named object and reusable type variables. I'd also want the warning to clearly suggest the alternative and having two alternatives depending on what you support muddies the message a little.

AlexWaygood commented 1 year ago

I think I agree with @hauntsaninja. If people just replace the really-badly-named stdlib type variable with an identical really-badly-named type variable in their own code, that kinda defeats the point :)

So let's go with introducing deprecation warnings in Python 3.16, and removing it from Python in 3.18. (I've edited my original post to reflect that.)

sobolevn commented 1 year ago

Two extra points:

  1. Replace our own usages of AnyStr
  2. I also propose adding one (or more) @overload example to show how to replace AnyStr usage in def some(x: AnyStr) -> AnyStr: to keep the same type-checking behaviour
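
A minimal sketch of what such an `@overload` replacement might look like, assuming a function that returns the same string type it receives. The function body (`x.upper()`) is a hypothetical placeholder; only the name `some` comes from the comment above.

```python
from typing import overload


@overload
def some(x: str) -> str: ...
@overload
def some(x: bytes) -> bytes: ...
def some(x):
    # Single runtime implementation shared by both overloads.
    # The overloads above tell a type checker: str in, str out;
    # bytes in, bytes out.
    return x.upper()


print(some("abc"))   # called with str
print(some(b"abc"))  # called with bytes
```

For callers this gives roughly the same checking as `def some(x: AnyStr) -> AnyStr`, at the cost of writing one signature per type.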

michael-the1 commented 1 year ago

I'd like to pick this up if that's okay :) Currently hacking away on it at Europython.

AlexWaygood commented 1 year ago

> I'd like to pick this up if that's okay :) Currently hacking away on it at Europython.

Absolutely, go for it! Only the first two steps are actionable right now, and they should probably be done in separate PRs :)

michael-the1 commented 1 year ago

A bit early, but I already have an implementation for step 3. Similar to what is done for urllib.parse.Quoter.

def __getattr__(name):
    if name == 'AnyStr':
        import warnings
        warnings._deprecated("typing.AnyStr", message="", remove=(3, 18))
        return _AnyStr
    raise AttributeError(f'module {__name__!r} has no attribute {name!r}')


# A useful type variable with constraints.  This represents string types.
# Deprecated in favour of PEP-695 syntax
_AnyStr = TypeVar('AnyStr', bytes, str)

Should I push this as a PR too? I'm only a few years too early.

AlexWaygood commented 1 year ago

> Should I push this as a PR too? I'm only a few years too early.

There's a risk that merge conflicts will come along in the meantime, so it might be a pain for you to maintain it in the intervening years... but sure!

JelleZijlstra commented 1 year ago

> Should I push this as a PR too? I'm only a few years too early.

I would say no. Every open PR adds a bit of ongoing maintenance work for core developers who want to look at all open PRs. I'd rather have the PR only when it becomes relevant.

Also, that solution will make from typing import * raise a DeprecationWarning, so I don't think we can do that anyway.

AlexWaygood commented 1 year ago

> Also, that solution will make from typing import * raise a DeprecationWarning, so I don't think we can do that anyway.

(In fairness to @michael-the1 there, I covered that in point (3) of my original proposal in this issue. Removing "AnyStr" from __all__ in Python 3.16 would be a backwards-incompatible change, but we'd be doing it after it had been deprecated in the docs for 3 releases, and I think it's the least-backwards-incompatible way of doing things. The alternative is to have no deprecation warnings at all prior to removing AnyStr from typing altogether.)
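
The interaction between `__all__` and a module-level `__getattr__` can be reproduced with a small sketch. Everything here is hypothetical scaffolding: the throwaway modules `demo_listed` and `demo_unlisted`, the deprecated name `OldName`, and the helper functions are invented for the demonstration, and the public `warnings.warn` is used instead of the private `warnings._deprecated` helper.

```python
import sys
import types
import warnings


def make_module(name, listed_in_all):
    """Build a throwaway module that lazily serves a deprecated name."""
    mod = types.ModuleType(name)
    mod.kept = "still public"
    mod.__all__ = ["kept", "OldName"] if listed_in_all else ["kept"]

    def module_getattr(attr):
        # Module-level __getattr__ hook: only called when normal
        # attribute lookup on the module fails.
        if attr == "OldName":
            warnings.warn(
                f"{name}.OldName is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
            return "old value"
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    mod.__getattr__ = module_getattr
    sys.modules[name] = mod  # make the module importable by name
    return mod


def deprecation_count(statement):
    """Run an import statement and count the OldName deprecation warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        exec(statement, {})
    return sum("OldName is deprecated" in str(w.message) for w in caught)


make_module("demo_listed", listed_in_all=True)
make_module("demo_unlisted", listed_in_all=False)

star_listed = deprecation_count("from demo_listed import *")
star_unlisted = deprecation_count("from demo_unlisted import *")
explicit_unlisted = deprecation_count("from demo_unlisted import OldName")

print(star_listed, star_unlisted, explicit_unlisted)
```

In this sketch, the star import triggers the warning only while the deprecated name is still listed in `__all__`, because the star import looks up every listed name on the module. Once the name is dropped from `__all__`, the star import stays silent, and only an explicit import or attribute access reaches the hook.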

AlexWaygood commented 1 year ago

Okay, we've added more usage examples to the docs on py311+ and added a docs-only deprecation for Python 3.13 (thanks so much @michael-the1, I'd been procrastinating working on this!)

We're now at the "Now we wait for three years" part of the plan (don't think there's an applicable label we can add to the issue for that, sadly).

michael-the1 commented 1 year ago

@AlexWaygood Thank you for the guidance! It was a very nice first experience contributing ❤️