Closed erictraut closed 5 months ago
To get some more confidence on the compatibility issue, could we first change typeshed to align it with the proposed spec? That is, make a PR for the change you are alluding to, and ensure it passes typeshed's CI and can be merged without major disruption. If that exposes any problems in existing type checkers, we should consider whether to change the spec. If everything is fine, we can accept this proposal.
Yes, that's a good suggestion. I did a variation of that experiment in the pyright repo, where I modified pyright's vendored copy of typeshed. I can use that as the basis for an official typeshed PR.
@JelleZijlstra, here's the typeshed PR. I'm hitting what appears to be a bug in stubtest where it's not able to deal with one of the more complex enum classes. The mypy_primer results are clean, which is encouraging.
Yeah, looks like stubtest doesn't handle that correctly, logic I had for enums was simplistic. I filed https://github.com/python/mypy/issues/16806 , I recommend adding ssl.Purpose to the stubtest allowlist for now.
I guess I noticed a different error in the stubtest output, submitted python/mypy#16807 to fix it.
I would like to hold off on accepting this section into the spec until we're able to merge python/typeshed#11299. I think it's important for typeshed to be compliant with the spec and it looks like that's not quite possible yet.
This chapter has been stuck in limbo for quite some time. I think the holdup is mainly about one section. I wonder if I should just remove that section and try to get the remainder approved. We could then go back and try to add the remaining section at a later time.
If the main blocker is mypy's stubtest functionality, is that going to be addressed in the soon-to-be-released mypy 1.10 release? Has it already been fixed?
@JelleZijlstra, any thoughts or advice?
If we're referring to https://github.com/python/mypy/pull/16807, that should already be released in 1.9. If there's something else, let me know — while I don't do mypy releases in general, I'm happy to make point releases with cherry picks to unblock typeshed / council work
For me personally, the main blockers are (a) time, and (b) my personal dissatisfaction with the enum
module -- I find its semantics too convoluted and too reliant on metaclasses.
We're now almost ready to merge the typeshed change.
The backward compatibility issue gives me pause, though. It's good that we're able to update typeshed to conform with the proposed spec, but I'm sure there are a lot of other existing stubs written under the assumption that x: int
creates an enum member. To recapitulate, the problem is with code like this:
import enum
class Sig(enum.Enum):
ABRT: int
reveal_type(Sig.ABRT) # mypy: Literal[Sig.ABRT]; pyright: int
With this spec change, we're blessing pyright's behavior and telling mypy to make a change.
@erictraut has pyright always had this behavior, or was it a recent change? And have you received any bug reports complaining about this behavior? (A cursory search on the pyright issue tracker didn't turn up any.) If there haven't been too many complaints, that's an indication that this will be a relatively easy change for mypy to make.
A few of the behaviors proposed here are rather subtle (e.g. the _value_
trick). It would be good to add a section in the typing user guide describing those. I share @gvanrossum's feeling that the enum module is too complicated and has too many features; unfortunately that's not something we can address here.
reveal_type(Sig.ABRT) # mypy: Literal[Sig.ABRT]; pyright: int
At runtime it's a NameError
, and list(Sig)
doesn't include such attributes. So I'd say the mypy answer is definitely wrong, right? So why would anyone rely on mypy's answer?
Usually it'd be written this way in a stub. For example, typeshed's signal.pyi
currently has:
class Signals(IntEnum):
SIGABRT: int
However, both mypy and pyright currently agree that Signals.SIGABRT
is an enum member. Maybe pyright special-cases stubs right now?
Has pyright always had this behavior, or was it a recent change?
Pyright has had this behavior for about three years now. As you guessed, pyright acts this way only for ".py" files. For stub files, its mirrors mypy's behavior. This was necessary because of the way typeshed defines enums. If we stick with mypy's behavior, we have no way of properly typing enums that include both members and non-member attributes. I guess we could say "we don't care about those cases", but I'm uncomfortable taking that stance because I've encountered many cases in real-world code that would be affected by this limitation.
Pyright doesn't currently implement the _value_
trick. That was an idea I came up with when writing this chapter. I'm OK dropping that concept if we think it's too subtle or unnecessary.
I agree that the enum module is too complicated and has too many features. More of these features are added every release, and I'm surprised each time about them — always after the fact. It feels like there's some missing process here. Who is adding these? Shouldn't these changes be going through some open review/spec process?
I understand your concerns about backward compatibility. As with all of the typing spec clarifications, my goal is to work toward a good long-term solution even if that means a small amount of short-term pain. The proposal under review is my best attempt to come up with such a solution. I'm open to other suggestions if you can think of a way that we can land in a good place long term and avoid short-term compatibility issues.
I think I'm in support of breaking backward compatibility here. Things should fine in non-stubs, which limits how bad things can get.
I think member = cast(type, ...)
could be a better alternative to the _value_
trick? If so, it would deal with the heterogeneous members case @AlexWaygood was concerned about.
For stubs, typeshed should be a big chunk of usage. We should update stubgen to output the new enums. It's unlikely for extension module stubs to use enums (maybe other than some PyO3 use case). It would be useful for us to know what the most downloaded non-typeshed stubs packages on PyPI are, and we can work with them to see if they're affected. I scrolled through a few pages of code search of stubs containing enum and many of them use plain assignment.
If we stick with mypy's behavior, we have no way of properly typing enums that include both members and non-member attributes.
The way we've always done this in typeshed in the past has been to "pretend" that the non-member attributes are properties, e.g. https://github.com/python/typeshed/blob/7c8e82fe483a40ec4cb0a2505cfdb0f3e7cc81d9/stdlib/http/__init__.pyi#L14. This isn't fully accurate w.r.t. how enums work at runtime, but as far as I know we've never received any issues or informal complaints about the way we've always done this up to now. I still struggle to understand the motivation for these changes to how stubs are meant to work, given the backwards-compatibility implications, and the fact that it would mean heterogeneous enums would be inexpressible in stubs in some instances. ISTM we'd just be trading one kind of lack of expressiveness for another, which doesn't seem to me to be worth the churn.
I've updated stubgen in https://github.com/python/mypy/pull/17125 , like the typeshed PR, it seems reasonable to do either way.
@AlexWaygood what did you think of my member = cast(type, ...)
suggestion for the case where stubs can't inline the value? Maybe we don't have to trade lack of expressiveness.
(re motivation: it does appear to me that choosing "we don't care about the distinction between members and non-members" would be sad but probably viable, since from a quick skim I was only able to find two mypy issues. But the proposed change does match the behaviour at runtime more closely and I'd still like to be able to express those things)
The cast()
idea is okay; I could get behind that.
Alternative idea, slightly more blue-sky: Python 3.11 added the enum.member
and enum.nonmember
decorators. I wonder whether we could repurpose those so that they also work as special forms in annotations, e.g.
class Foo(Enum):
x: member[int] # explicitly declare an enum member
y: nonmember[str] # explicitly declare an enum non-member
We'd be special-casing a symbol that comes from outside the typing
module for use in annotations, but there's precedent for that in e.g. dataclasses.InitVar
. It seems to me like it would be pretty elegant, and would be fully expressive. We could backport the symbols in typing_extensions
. And it would be backwards-compatible with the current ways of doing things, as we wouldn't disallow the existing way of defining enums that most stubs currently use; we'd just be adding a new, more expressive way.
I signed off.
As mentioned above, I think the backwards compatibility break is quite feasible here. I merged my stubgen PR, so both typeshed and stubgen should be updated for it.
The _value_
trick for stubs is a little subtle and I'd personally be happy to drop it. I'd prefer to use member = cast(type, ...)
, since a) it lets you express more things, b) it should work "for free" (and I would advocate we do this if it comes up in typeshed)
I read the chapter. Left a few comments/questions, but I'm OK with its inclusion as-is.
I feel like the _value_
"trick" should also come for free: enum instances do have a _value_
attribute, type checkers already have to understand that the value assigned a member is assigned to that instance attribute, and _value_: int
is the normal way to specify the type of an instance attribute. I would rather read (or write) a stub with _value_: int
than one that uses RED = cast(int, ...)
on every member definition.
(I agree the _value_
trick is a trick that makes sense. What I meant by "for free" was that neither users nor type checkers need to know anything more about enum to get the semantics Alex wanted to be able to express. I believe at the original time of posting, neither myself nor mypy nor pyright understood _value_
enough)
I agree that the enum module is too complicated and has too many features.
I disagree, the enum module is excellent and comes with a minimal feature set corresponding to the most commonly requested use cases on Stack Overflow. It's sound and if anything the typing tools have never caught up with even the most basic functionalities of the module.
Reading the spec that's being defined here I get a feeling it would have been adequate back in 2017 but in 2024 it's just lagging behind of most real world uses enum has.
reveal_type(Sig.ABRT) # mypy: Literal[Sig.ABRT]; pyright: int
pyright is wrong. It should obviously be:
reveal_type(Sig.ABRT) # mypy: Literal[Sig.ABRT] reveal_type(Sig.ABRT.value) # mypy: int
Enum members are themselves types and instances of the defining class, so conflating the type of the value with the type of the member is by definition nonsense.
The example you're quoting is specifically about the behaviour in stubs. At runtime there will always be values associated with enum members.
I recommend reading the content of the spec and looking at the test cases in the conformance test suite. If after doing so you still have concerns you'd like to raise, I recommend doing so at discuss.python.org — this issue isn't really the right location to do so.
I'd like to request that the TC consider adoption of a new chapter in the typing spec that spells out type checking behaviors related to enum classes.
Links to PR & Discussion The PR can be found here. The latest draft incorporates feedback from PR reviews and the discussion.
The discussion can be found here.
TC Sign-off
Current Type Checker Behaviors The proposed chapter specifies several required behaviors for type checkers along with some optional behaviors.
The latest published version of pyright (1.1.347) has about eight areas where it fails to conform with the required behaviors. Version 1.1.348 of pyright will address all of these.
The latest published version of mypy (1.8.0) mostly conforms to the proposed spec, but it doesn't yet implement support for
enum.member
andenum.nonmember
, which were introduced in Python 3.11. It also doesn't enforce naming consistency when the functional form ofEnum
is used. And it doesn't enforce the type of the_value_
attribute in all cases if_value_
has an annotated type.I haven't done a detailed analysis of pyre or pytype, but I assume that they largely conform with the required parts of the proposed spec.
Controversial Issues There is one potentially controversial issue raised in the proposed spec. It has to do with the way enum members and non-member attributes are delineated in a type stub. Historically, there has been no unambiguous way to differentiate between enum members and non-member attributes within a type stub.
Typeshed stubs (and presumably stubs outside of typeshed) are inconsistent in how they define enums.
The proposed wording clears up this ambiguity and indicates that stubs should follow the same rules as non-stub files for differentiating between enum members and non-member attributes, but it allows the member value to be replaced with a
...
.Fixing all of typeshed's stubs to conform with this updated spec will be relatively easy. I was able to do it in about 10 minutes, and the resulting changes are backward compatible with existing type checker implementations. However, once type checkers are updated to conform to the proposed spec, they will interpret older stubs differently than they do today. This could result in some temporary pain for users.
I don't have a good solution to this backward compatibility issue other than to recommend that type checkers may want to hold off on that part of the change while type stubs are updated.