CarliJoy / intersection_examples

Python Typing Intersection examples
MIT License

Handle "impossible" intersections #5

Closed CarliJoy closed 9 months ago

CarliJoy commented 1 year ago

This issue is about how static type checkers should handle an "impossible" intersection. We consider two cases:

It must be transformed to Never

Arguments in favor:

It is up to the type checker to reduce it or not

Arguments in favor:

(Please edit this comment. I haven't gone through the whole issue to add links to everyone's arguments.)

mikeshardmind commented 1 year ago

I don't really think this needs its own issue. We can detect some cases where an intersection is impossible to satisfy, and that should result in Never. I'm not sure why this would require any amount of bikeshedding.

CarliJoy commented 1 year ago

@mikeshardmind I already wrote an algorithm for protocols in the current specification under https://github.com/CarliJoy/intersection_examples/blob/main/specification.rst#protocol-reduction

Could you adapt this algorithm to accommodate this?

There is already a section on evaluation to Never; we could add things there, too.

But in that case I would also like to specify that type checkers should deliver a good error message. Otherwise finding these issues will be painful.

mikeshardmind commented 1 year ago

I'm not in a position to do anything formal at the moment, but I already posted a partial list of "trivial" reductions in the Discord server; I'll post them here as well.

Note: removal of Any was included before some people found that controversial. Don't read it as a mandate that this approach is the resolution.

Trivial removal of Any

Any & P -> P
Any & P & Q -> P & Q

Subtype relationship example (copied directly from draft, already included) (Note: examples for Any above also follow this rule of more specific types taking precedence)

Intersection[str, Employee, Manager] is Intersection[str, Manager]

Concrete types which fully satisfy protocols (a sketch follows at the end of this list)

SupportsIndex & int -> int

Reductions involving None

None & ConcreteType -> Never
None & ProtocolNoneDoesntSatisfy -> Never
None & ProtocolNoneSatisfies -> None (i.e. bool)

Interaction with other Final types

(Note: None follows all of these rules, no special casing of None)

FT is a final type
NBFT is another type that isn't in FT's bases
BFT is a type that is in FT's bases
SFTP is a protocol (heh) that FT satisfies
NSFTP is a protocol that FT does not satisfy

FT & NBFT -> Never
FT & BFT -> FT
FT & NSFTP -> Never
FT & SFTP -> FT
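
As a concrete illustration of the protocol-satisfaction case above (SupportsIndex & int -> int), here is a minimal sketch, assuming nothing beyond the stdlib, of why int alone already provides everything SupportsIndex asks for:

from typing import SupportsIndex

def nth_char(s: str, i: SupportsIndex) -> str:
    # Anything providing __index__ is accepted here.
    return s[i]

# int satisfies SupportsIndex structurally, so a value typed as
# SupportsIndex & int carries no information beyond int itself.
print(nth_char("abc", 1))  # "b"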


In addition to this post, there's also: if you have two types with incompatible interfaces -> Never, as it would be impossible to satisfy both requirements requested by use of the intersection. I don't think artificially creating overloads works, for the reasons above, complexity, and so on. It also behaves more like a Union at that point, so if you want a Union, just use a Union.

CarliJoy commented 1 year ago

I already included them in the specification. Did you have a look at it?

CarliJoy commented 1 year ago

In addition to this post, there's also: if you have two types with incompatible interfaces -> Never, as it would be impossible to satisfy both requirements requested by use of the intersection. I don't think artificially creating overloads works, for the reasons above, complexity, and so on. It also behaves more like a Union at that point, so if you want a Union, just use a Union.

That's where we disagree. Creating overloads is actually one use case for intersections, e.g. think of Callable[[int], str] & Callable[[float], str]. People already said they would expect an overload there.

Of course the special cases about staticmethods and classmethods are non trivial and need some evaluation.
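
For reference, a minimal sketch of the overload reading described above (the function name render is illustrative, not from the draft): a single callable satisfying both Callable[[int], str] and Callable[[float], str] could be spelled with @overload.

from typing import overload

@overload
def render(x: int) -> str: ...
@overload
def render(x: float) -> str: ...
def render(x: float) -> str:
    # One implementation covering both overloaded signatures.
    return str(x)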

mikeshardmind commented 1 year ago

I would expect this to be an error. If I wanted that, the correct tool is a Union, not an intersection.

Callable[[int], str] | Callable[[float], str] has the behavior you are putting into the intersection here by creating overloads artificially.

mikeshardmind commented 1 year ago

The specification is also stricter on allowable type combinations than what is required by Python. It is possible to have multiple inheritance from multiple builtin types without it being an error. This is why my post was specific in mentioning type compatibility as well as whether a class is marked as final.

CarliJoy commented 1 year ago

The specification is also stricter on allowable type combinations than what is required by Python. It is possible to have multiple inheritance from multiple builtin types without it being an error. This is why my post was specific in mentioning type compatibility as well as whether a class is marked as final.

That is not correct; see https://github.com/CarliJoy/intersection_examples/blob/main/test_basetypes.py. I didn't find any combination that is valid.

mikeshardmind commented 1 year ago

I should have been much more specific. There is no rule in the interpreter or at runtime that multiple inheritance from builtin classes is impossible. The specification implies this based on it being true for the current types there, but one can imagine that it may not remain true. It should be sufficient, more succinct, and more permanently correct to say that A & B is not valid for concrete types if class(A, B) is not valid.

mikeshardmind commented 1 year ago

Anyhow, if you are resolving this by adding overloads, I'm no longer interested in intersections. I consider this antithetical to why I want an intersection in Python. Everything else there's been disagreement about I could see multiple paths out of, but this would result in me just never using the feature. It would be more correct to just hand out protocols that specify exactly my intent at that point.

If I specify A & B where A is mine and B is a library's, I expect A and B not to have incompatible interfaces out of the box. Hiding the incompatibility with artificially inserted overloads, which then have to be checked for an outcome when used, breaks expectations and some of the reasons for intersections. I want to be able to use this as if it is A and as if it is B. If I wanted to check which interface I had, I wouldn't need an intersection.

CarliJoy commented 1 year ago

Well, the correct definition would be: A & B is Never if class(A, B) or class(B, A) raises TypeError (see the runtime sketch after this list).

There are three problems with this definition:

  1. Performance: you need to check all permutations of all arguments.
  2. It requires running code, which a type checker should never do (the strongest argument).
  3. How should a type checker not implemented in Python know this? And there are multiple implementations of Python besides CPython, so maybe in one of them class(int, float) even works. Typing would then depend on the interpreter.
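
A minimal runtime sketch of the TypeError in question (illustrative only; the point above is precisely that a type checker should not have to run this):

try:
    class IntStr(int, str):  # combining two builtins with conflicting layouts
        pass
except TypeError as exc:
    print(exc)  # CPython reports a base-class layout conflict here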
mikeshardmind commented 1 year ago

The type error here is predictable and arises from a base class layout conflict. This can be statically determined and does not require running the code.

All permutations of arguments should be checked; it shouldn't be possible to create an impossible requirement.

Realistically, though, this goes back to what was brought up in discord by @DiscordLiz that even maintainers of type checkers thought intersections on anything other than protocols would be too complex.

DiscordLiz commented 1 year ago

Pragmatically speaking, it's worth just disallowing multiple inheritance of builtin types. The manner in which they are implemented would need to change for this to no longer be true, and other implementations could allow it.

You do need to verify that a combination of types is possible, which means checking that there exists at least one valid permutation of bases.

Really, intersections should just be for fully compatible protocols only. It's simple, and it solves a real problem. All of these edge cases come from things nobody should ever need to care about in practice, but they do have to be handled if this is to be defined robustly and allow more than protocols.

DiscordLiz commented 1 year ago

I read through the current specification, and I need to state more than just simple agreement that the overloads are a mistake. This creates a situation where intersections do something for protocols, and only for protocols, that could never be valid on real classes. Protocols are already only a structural definition; I would expect any incompatibility to be a mistake on the part of whoever requested that something works with both.

You can't hack this together for real classes, because it would become an LSP violation. This breaks symmetry between structural and nominal typing.

erictraut commented 1 year ago

A few thoughts...

Regarding overloads, these are not the same as intersections. The two should not be conflated. Overloads have complex signature matching rules and are order dependent. Intersections have simple rules based on set theory, and are order independent. I would be surprised if the word "overload" appeared anywhere within this PEP.

The subtyping rules for intersections should build upon existing subtyping rules within the type system. Type checkers already have rules in place for evaluating subtyping relationships for functions and overloads. The same is true for protocols, nominal classes, generics, etc. This PEP should not attempt to define these existing rules. It should simply explain how intersections work with respect to these existing subtyping rules. This will simplify the PEP and help ensure that it composes well with existing type features. If the PEP attempts to define all of the behaviors in terms of specific cases ("if an overload is involved..." or "if a protocol is intersected with a nominal class..." or "if a nominal class is intersected with Any"), the complexity of the spec will explode and inconsistencies and omissions will result. I therefore encourage the authors to stick to defining intersection rules that leverage existing generalized rules within the type system.

Regarding "impossible" intersections, my recommendation is to leave it up to each type checker to decide whether to detect such impossible conditions and where such conditions should be flagged as an error (e.g. at the point where the type is "spelled" as an annotation, where it's evaluated, or where it's used to evaluate type compatibility). I don't think the PEP should mandate such errors — or even express an opinion as to whether such errors should be reported. There are many reasons in Python why an intersection type may not be realizable without generating type checking or runtime errors. Some of these conditions are quite expensive to check for. Here's a partial list off the top of my head:

Type checkers today detect a subset of these conditions when evaluating a class declaration with multiple base classes, but many of these conditions are not flagged as errors. Here are some examples:

from collections.abc import Sequence
from typing import Protocol

class A(int, str): ...  # Runtime error, but not currently flagged as an error by mypy or pyright

class B(list[str], Sequence[int]): ...  # Flagged as an error by pyright in strict mode, no error in mypy

class P1(Protocol):
    a: str

class P2(Protocol):
    a: int

class C(P1, P2):  # Both mypy and pyright generate an error here
    ...

In practice, I think that we'll find that most of these "impossible intersection" cases occur very rarely (if ever) in real code. If some of these cases do come up more often, users will report these cases, and type checker authors can decide if it makes sense (based on tradeoffs like implementation complexity and runtime costs) to check for that specific condition. So my preference is for the PEP not to say anything about "impossible intersections" or simply state that "type checkers may (but are not obligated to) check for certain conditions where an intersection type cannot be realized at runtime".

DiscordLiz commented 1 year ago

I would be surprised if the word "overload" appeared anywhere within this PEP. [...] Some of these conditions are quite expensive to check for. Here's a partial list off the top of my head: [...] Classes that define the same attribute with conflicting types — those that result in a Never when intersected

The current draft has specified behavior for this case which multiple people believe to be unsound. The draft specifies that this results in overloads of each conflicting signature for Protocols, and Unions of types for TypedDict fields, rather than Never.

NeilGirdhar commented 1 year ago

Requesting an impossible intersection should be treated as an error.

I agree that requesting an impossible intersection should be treated as an error (with each type checker deciding what that means as Eric says), but there are certain intersections that are absolutely possible.

First, in your example, you have one method that's static and one that's an ordinary method. Python typing doesn't have good support for the descriptor protocol yet, so creating a subtype method that supports both isn't going to be testable. Let's not look at these.

In general, however, you can often create a subtype method that satisfies all supertype methods, provided that the subtype method returns a type that is the intersection of all the supertype methods' return types. That's what I illustrated in the other thread. Disparate parameter lists can always be accommodated through overloads.

So, very few things are "impossible" except when that intersection is empty, for example for different return types that cannot be subclassed.
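
A minimal sketch of that pattern (all names are illustrative): a method whose return type is a common subtype of the conflicting return types satisfies both supertype signatures, since return types are covariant.

from typing import Protocol

class X: ...
class Y: ...
class XY(X, Y): ...  # plays the role of X & Y

class HasX(Protocol):
    def make(self) -> X: ...

class HasY(Protocol):
    def make(self) -> Y: ...

class Both:
    def make(self) -> XY:  # narrower return type satisfies both protocols
        return XY()

def use_x(obj: HasX) -> X: return obj.make()
def use_y(obj: HasY) -> Y: return obj.make()

b = Both()
use_x(b)  # Both satisfies HasX structurally
use_y(b)  # ...and HasY as well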

mikeshardmind commented 1 year ago

To what end is it worth creating a rule that this must be valid? Pragmatism, and the fact that there are so many edge cases here that you are trying to handwave away, is all I need to say this is a horrific idea. Solve the edge cases in a way that satisfies a real need; currently, this is just a pit for type checker performance and a way for subtle incompatibilities to go unnoticed.

NeilGirdhar commented 1 year ago

To what end is it worth creating a rule that this must be valid?

Because intersections can be created synthetically in generic types. If you block all of these things, you're going to severely limit the type system.

What's the reason for blocking perfectly valid code?

mikeshardmind commented 1 year ago

To what end is it worth creating a rule that this must be valid?

Because intersections can be created synthetically in generic types. If you block all of these things, you're going to severely limit the type system.

What's the reason for blocking perfectly valid code?

Well, if you had read the whole message that quote is from, you'd have your answer. There are edge cases that you do not have an answer for, and for which the current proposal can be shown to be unsound. It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

NeilGirdhar commented 1 year ago

There are edge cases that you do not have an answer for, and for which the current proposal can be shown to be unsound. It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

mikeshardmind commented 1 year ago

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

Okay, then leave it up to type checkers. The current spec instead mandates a synthetically created overload and this is what I took issue with, and what you attempted to demonstrate was "fine actually".

NeilGirdhar commented 1 year ago

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

Okay, then leave it up to type checkers. The current spec instead mandates a synthetically created overload and this is what I took issue with, and what you attempted to demonstrate was "fine actually".

Sorry, that's not what I was saying. What I was saying was that trying to block "incompatible interfaces" would end up blocking intersections that were actually possible, and I was presenting the overloads as an example of a possible intersection.

mikeshardmind commented 1 year ago

Sorry, that's not what I was saying. What I was saying was that trying to block "incompatible interfaces" would end up blocking intersections that were actually possible, and I was presenting the overloads as an example of a possible intersection.

Hmm. I think the mix of overloads being prescribed in the draft, and examples of use that didn't fit the edge cases, may have led to me reading that with less understanding of your point of view. I think we fully agree that compatible things should not be blocked if at all reasonably possible, and only disagreed on the specific example given being reasonably possible.

I think I largely agree with @erictraut's comment here: https://github.com/CarliJoy/intersection_examples/issues/3#issuecomment-1648843307 that merely defining the rules of the intersection should be enough. I don't think that overloads are a satisfactory answer to an intersection of two methods with conflicting signatures, but defining it as just "fields are intersected as well" allows anything which can be determined to be satisfactory, and doesn't require the answer to whether something is satisfactory to be immediately apparent.

NeilGirdhar commented 1 year ago

I think we fully agree that compatible things should not be blocked if at all reasonably possible, and only disagreed on the specific example given being reasonably possible.

Fair enough. It sounded like you were suggesting an incremental approach:

It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

I don't want to go through all of these issues that will sit unsolved for years. Better to not induce them to be created in the first place.

and doesn't require the answer to whether something is satisfactory to be immediately apparent

Right.

mikeshardmind commented 1 year ago

Fair enough. It sounded like you were suggesting an incremental approach:

Well, to be fair, I was, but not in a way that I believed required this to become a progress blocker, or extra rules to specifically block something valid. I think Eric's comments made the case far more elegantly than I did, but it is my belief that if the rules we pick are fundamentally sound, then it doesn't matter whether we have all the answers yet; whereas if we have special cases for things, we actually do need all the answers relevant to those special cases, or the rules become ambiguous.

Because I was operating with the lens of the current draft specifying a specific behavior for how to solve an intersection, rather than only defining the intersection and whether something is a sound intersection based on general rules, I was attempting to show that the prescribed behavior for a specific case could not completely solve all instances of that case, was therefore partially ambiguous, and was better left out as a result. Even simpler, though, is what Eric said: just define it correctly from base rules and let implementations handle the implementation of the definition.

gandhis1 commented 1 year ago

While I understand the need for practicality to determine what rules are checked or not, I'm not thrilled at the idea of allowing for even more areas where the various type checkers can have inconsistent errors. This freedom is beneficial for type checker maintainers but a burden to type annotation maintainers. Libraries often have to choose a single type checker to support or attempt to support both and regularly run into conflicts (as I have witnessed with pandas-stubs).

I don't want to derail this thread with a debate on that more general issue, but simply want to offer a counterpoint as to why mandating a consistent set of errors that all maintainers can agree on actually may be desirable. If something will likely end up being non-performant or tedious to implement, perhaps now is the time to refine that.

CarliJoy commented 1 year ago

@gandhis1 thanks for laying out this potential conflict. I am sure we can find a solution that fits both target groups: a PEP that doesn't limit type checker maintainers in their implementation but is still well enough defined to ensure compatible results from different checkers.

Seeing the different opinions about how strict a type checker should be, e.g. when it comes to differing function parameter definitions, I also fear we would increase the problem of type checkers giving different results. Maybe one type checker implements a strict understanding of intersection and another a flexible one. A maintainer of type annotations for code that is valid Python but wasn't well designed (i.e. not 100% following the LSP) could run into serious trouble. Even worse would be if one type checker is strict about one aspect and another about a different one.

@erictraut I fully understand your concern about the final PEP becoming too complex and unsound. Still, don't you think that what should evaluate to Never could be formulated without falling into a logical/complexity trap?

@mikeshardmind I am not an expert, so feel free to come up with changes to the draft :-)

erictraut commented 1 year ago

Specifying that an intersection type should be reduced to Never under certain circumstances is less problematic than dictating that an intersection type should be disallowed (and generate an error condition) under certain circumstances. There are many places where an intersection type may be generated within the internal bowels of a type checker (just as a union type may be generated), and it's not always possible to know how to present the user with an intelligible error message or determine which node in a parse tree to associate that error with.

Specifying that an intersection should be reduced to Never is still somewhat problematic for the following reasons:

  1. It requires normalization (i.e. distributing intersections over unions) and reduction logic to be applied any time an intersection type is generated internally to the type checker. The normalization will necessarily change the way the type is "spelled" (i.e. how it appears within an error message), so users may be confused about why the type appears different from how they typed it.
  2. Normalization and reduction are both costly, and this is on a hot path in the type checker. Neither pyright nor mypy attempt to apply any reduction to unions except for duplicate removal and the removal of Never, both of which are cheap checks. Neither mypy nor pyright attempt to eliminate subtypes within unions, for example. Doing so would affect type checking performance — significantly in some cases. If we limit reduction to cases that involve Any (as TypeScript does), that can be done efficiently because it doesn't require normalization and can be done with an O(n) algorithm. But other rules (those that involve subtype checks) are very expensive and require normalization and pairwise operations that result in O(n^2) performance (where n is the number of types in the intersection).
  3. Reducing certain intersections to Never requires us to correctly and completely identify all of the rules about which conditions cause such reduction. Changing the rules even slightly in the future will result in breaking changes to the type system. I have low confidence that we can establish a comprehensive list of all conditions that can result in intersections that are not realizable at runtime. The Python runtime is so full of exceptions and special cases that it makes this difficult. If we indicate that no reduction is performed — or that reduction is performed only in a few very specific cases (e.g. with Any), then we avoid this problem.
  4. It creates an inconsistency with unions. No reduction rules are stipulated in the typing standard for unions. In practice, type checkers do not reduce unions except for the few specific cases noted above (duplicate and Never removal).
mikeshardmind commented 1 year ago

It requires normalization (i.e. distributing intersections over unions) and reduction logic to be applied any time an intersection type is generated internally to the type checker. The normalization will necessarily change the way the type is "spelled" (i.e. how it appears within an error message), so users may be confused about why the type appears different from how they typed it.

With regard to this case, we can (and maybe even should) say that the canonical form for intersections and unions should be Disjunctive Normal Form. (There are good reasons for choosing this form, both for thinking about it and for structuring code that would operate on these types.) This gives a good way to explain it to users, it would be consistent, and it wouldn't be something that changes with every little change to the type system.

I originally thought some cases should be reduced to Never to help users reason about why certain combinations don't need to be considered, but I think your points about performance, as well as all the various potential exceptions and changes over time, are a good reason not to do that, except perhaps for the most basic cases such as T & Never; and even in that case, we can leave the choice to simplify to Never up to type checkers.

This diverges across type checkers if we don't mandate it, unacceptably.

NeilGirdhar commented 1 year ago

With regard to this case, we can (and maybe even should) say that the canonical form for intersections and unions should be Disjunctive Normal Form.

I also suggested disjunctive normal form last year along with a variety of other optimizations.
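
For readers unfamiliar with the term, a toy sketch (illustrative only, not type-checker code) of what putting a type expression into disjunctive normal form means: intersections are distributed over unions so the result is a union of intersections.

from itertools import product

# A type in DNF is modelled here as a union (outer list) of
# intersections (inner frozensets of atomic type names).
DNF = list[frozenset[str]]

def intersect(left: DNF, right: DNF) -> DNF:
    # (A | B) & C  ->  (A & C) | (B & C)
    return [l | r for l, r in product(left, right)]

a_or_b: DNF = [frozenset({"A"}), frozenset({"B"})]
c: DNF = [frozenset({"C"})]
print(intersect(a_or_b, c))  # the two intersections {'A', 'C'} and {'B', 'C'}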

randolf-scholz commented 1 year ago

Given the example in https://github.com/CarliJoy/intersection_examples/issues/5#issue-1818919668, wouldn't it be more useful to just coerce (A&B).foo to Never, instead of A&B itself? I see several advantages:

  1. Sometimes we might want to use the intersection of 2 classes despite the incompatibility of some methods, if we only intend to use compatible methods.
    • This might especially be important when one of the classes implements __getattr__, since, strictly speaking, if A implements __getattr__ then it shadows any methods of B not implemented by A (see the sketch after this list).
      • As a concrete example, consider pandas.DataFrame, which during __init__ adds column names as attributes if they don't shadow any existing attributes.
    • Also, one might consider the possibility of users fixing incompatible signatures through @override.
  2. Type-checkers can still warn when incompatible classes get created, and raise errors as soon as an incompatible method gets used (since Never cannot be called).
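
A minimal sketch of the __getattr__ point above (class names are illustrative): because A declares __getattr__, type checkers treat any attribute access on A as producing Any, which is the shadowing concern raised in the first bullet.

from typing import Any, reveal_type

class A:
    def __getattr__(self, name: str) -> Any: ...

class B:
    def bar(self) -> str: ...

a = A()
reveal_type(a.bar)       # Any: __getattr__ makes every attribute name acceptable
reveal_type(a.whatever)  # Any as well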
NeilGirdhar commented 1 year ago

Given the example in #5 (comment), wouldn't it be more useful to just coerce (A&B).foo to Never, instead of A&B itself? I see several advantages:

That's exactly what I argued here. I now think it's better to leave it up to type checkers. But in general, I agree that that's probably what would end up happening in practice. Mandating that A&B must fail induces a quadratic time (minimum) check apart from other undesirable things.

mikeshardmind commented 1 year ago

Given the example in https://github.com/CarliJoy/intersection_examples/issues/5#issue-1818919668, wouldn't it be more useful to just coerce (A&B).foo to Never, instead of A&B itself? I see several advantages:

  1. Sometimes we might want to use the intersection of 2 classes despite the incompatibility of some methods, if we only intend to use compatible methods.

No. This is quite literally the use case for Union. Observe:

from typing import reveal_type

class A:
    foo: int
    bar: int

class B:
    foo: int

def show_union(x: A | B) -> None:
    reveal_type(x.foo)
    reveal_type(x.bar)
main.py:12: note: Revealed type is "builtins.int"
main.py:13: error: Item "B" of "A | B" has no attribute "bar"  [union-attr]
main.py:13: note: Revealed type is "Union[builtins.int, Any]"
Found 1 error in 1 file (checked 1 source file)

Intersection having this behavior violates the motivating cases for it, and the generally understood definition: with an intersection we are specifying multiple constraints that must each apply, whereas with a Union we are specifying multiple constraints of which one or more must apply.

I now think it's better to leave it up to type checkers.

I think the behavior of attribute access needs to be well defined, but that any early termination to Never would be entirely up to type checkers; if we don't define this, we'll get divergent behavior between type checkers.

randolf-scholz commented 1 year ago

@mikeshardmind "No. This is quite literally the use case for Union."

When one uses A & B the expectation is that (A & B).foo = A.foo & B.foo and not A.foo | B.foo. What I am saying is that I want to be allowed to use A & B even if, for some attribute bar, A.bar is incompatible with B.bar (in the sense that A.bar & B.bar = Never), because in certain situations I might only be interested in using the attribute/method (A & B).foo = A.foo & B.foo. Type checkers should (imo) in most cases not coerce A & B to Never (only when it is impossible at runtime to create a subclass, e.g. due to a baseclass/metaclass conflict, such as in class A(int, str), see https://github.com/CarliJoy/intersection_examples/issues/5#issuecomment-1648816964).

I disagree that for the example given in the original post an intersection is impossible. https://github.com/CarliJoy/intersection_examples/issues/5#issue-1818919668

class A:
    foo: int
    @staticmethod
    def more_complicated() -> int: ...

class B:
    foo: str
    def more_complicated(self, something: int) -> str: ...

The following seems to be formally just fine:

from typing import Any, Never

class C(A, B):
    foo: Never
    def more_complicated(self, *args: Any) -> Never: ...

I would want the behavior:

c: C = C()         # ✔
c.foo              # No return in getattr.
z: object = c.foo  # ✘ unreachable, cannot assign Never
y: int = 0         # ✘ unreachable

c: C = C()            # ✔
c.more_complicated()  # No return in C.more_complicated
y: int = 0            # ✘ unreachable

Possibly, a type checker should even flag the code as soon as c.foo or c.more_complicated() is ever used, since this would produce an instance of the bottom type, which is uninhabited.

mikeshardmind commented 1 year ago

@randolf-scholz That sounds like a case for a protocol describing the behavior you actually want instead of asking a type checker to make sense of an incompatible definition. There have been discussions in the discord about this case and making it more ergonomic with something like ProtocolFromType and Omit (Omit to explicitly discard things from the runtime type that you don't care about for the generated protocol), but you're missing the part where the interface a type has is a distinctly different thing from an instance of an object satisfying the requirement of being a type.

You can't have a real class that satisfies the type system in the example there; incompatible subclasses generate errors themselves.

NeilGirdhar commented 1 year ago

That sounds like a case for a protocol describing the behavior you actually want

The problem with that is that you don't always have control of the code producing the intersection. The intersection might be produced synthetically in a generic function or class. So you can't simply rewrite the intersection using a protocol, even if that's the ideal solution.

Second, just because the return value of the methods in the intersection of the classes has type Never, that doesn't mean that the intersection is impossible. Perfectly valid methods can return Never, for example, if they raise unconditionally. So, it's still possible for there to be a valid subclass.

DiscordLiz commented 1 year ago

@randolf-scholz https://github.com/python/typing/issues/213#issuecomment-1646681413 already explains why only viewing the types of the object's attributes, and not checking for compatibility of the types, isn't valid here.

I suggested ProtocolFromType in the Discord to help make this case easier. I also suggested we limit intersections to only protocols, because I believe this is the only real use case for intersections. This also gets rid of the stupidity of Any. If you already have a class that validly subclasses multiple classes, you can just type hint that. I was not the first person to suggest this; @erictraut did over a year ago.

mikeshardmind commented 1 year ago

Second, just because the return value of the methods in the intersection of the classes has type Never, that doesn't mean that the intersection is impossible. Perfectly valid methods can return Never, for example, if they raise unconditionally. So, it's still possible for there to be a valid subclass.

The attributes conflict. You can't subclass from both int and str, so it is impossible for a subclass to be a valid subclass of both A and B, because the subclass would violate replacement for at least one of them.

Returning Never from something typed to return something specific would also violate replacement.

NeilGirdhar commented 1 year ago

The attributes conflict

I agree with Randolf that type checkers shouldn't do anything special until the attribute is accessed. I don't think it's fine to raise any kind of type error just because the intersection has shown up.

Returning Never from something typed to return something specific would also violate replacement.

Isn't it the bottom type? Surely, an ordinary method with the same signature would not violate replacement. And if that method raises unconditionally, that wouldn't violate replacement. So I don't see why narrowing its return to Never would violate anything new. The return type is covariant, so narrowing should always be okay.
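
A minimal sketch of the narrowing being described (names are illustrative; whether such an override ought to be allowed is exactly what is debated below): return types are covariant and Never is the bottom type, so the override only narrows the declared return type.

from typing import Never

class Base:
    def compute(self) -> int:
        return 1

class Child(Base):
    def compute(self) -> Never:
        # Raises unconditionally, so it never actually produces a value.
        raise RuntimeError("unsupported")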

mikeshardmind commented 1 year ago

It violates basic subtyping-based replacement, i.e. that instances of subclasses should be usable in the same way that instances of the parent class are.

In the intersectionless example

from typing import Any, Never

class A:
    foo: int
    @staticmethod
    def more_complicated() -> int: ...

class B:
    foo: str
    def more_complicated(self, something: int) -> str: ...

class Invalid(A, B):  # It's a type error here before we even use it
    foo: Never
    def more_complicated(self, *args: Any) -> Never: ...

def example(x: A):  # but we pass in an instance of `Invalid`
    x.foo  # This should be int, but passing an instance of Invalid breaks everything. You can't *have* something here.
    x.more_complicated()
    # This should return an int and must error instead of only erroring if something unexpected happened.
    # This one might technically be allowable, but if your example to make something work
    # requires making it in fact not work, maybe we can find a better option for people.
    # All type checkers currently see this as an error to define `Invalid` this way.
main.py:17: error: Signature of "more_complicated" incompatible with supertype "A"  [override]
main.py:17: note:      Superclass:
main.py:17: note:          @staticmethod
main.py:17: note:          def more_complicated() -> int
main.py:17: note:      Subclass:
main.py:17: note:          def more_complicated(self, *args: Any) -> NoReturn
randolf-scholz commented 1 year ago

@mikeshardmind I'm not a type theorist, but from my pedestrian understanding the bottom type in a sound type system satisfies two essential properties:

  1. It is a subtype of every other type. In particular, Never ≤ (int & str)
  2. It is uninhabited, i.e. no instances of Never can be created.
    • in particular we can have no instance of int & str, because int & str = Never due to base/meta-class conflict.
    • however, we can have a function like def stop() -> Never: raise RuntimeError('no way')

Now, I hope we can agree that this means, for example, that we should be allowed to intersect:

from typing import Never

class A:
    def foo(self) -> int: ...
class B:
    def foo(self) -> str: ...
class C(A, B):
    def foo(self) -> Never: ...

c: A & B = C()

because otherwise, you'd need to already disallow any class that has a Never-annotated method. You object that it is different for attributes. OK, what about properties?

from typing import Never

class A:
    @property
    def foo(self) -> int: ...
class B:
    @property
    def foo(self) -> str: ...
class C(A, B):
    @property
    def foo(self) -> Never: ...

c: A & B = C()

This should be OK as well, right? But properties can act as attributes; we can even add setters and deleters. What you are essentially saying is that any class X that has an attribute annotated as Never is itself equal to Never, and I just don't see why that should be true. Case in point, one can write code like:

from typing import Never

class A:
    x: Never
    def foo(self) -> Never: raise RuntimeError

a: A = A()  # ✔ no error

and neither mypy nor pyright will complain about it.

NeilGirdhar commented 1 year ago
    x.more_complicated()
    # This should return an int and must error instead of only erroring if something unexpected happened.

Methods in Python make no promises about exceptions. The subclass method raising unconditionally, and being annotated as such, is acceptable.

that instances of subclasses should be usable in the same way that instances of the parent class are.

And even if a hypothetical child class isn't usable like its parents, it's still not necessarily a good idea to raise an error just because someone has

def f(x: A & B): ...

since such an annotation might show up synthetically, and you don't want to raise errors for synthetic code that is never going to be called.

CarliJoy commented 1 year ago

A comment just before I drop off for my holidays.

We should make sure that the problem that was actually the starting point of the original thread can be handled by our definition:

(quote copied below)

This question has already been discussed […] long time ago, but now I stumble on this in a practical question: How to annotate something that subclasses two ABC's. Currently, a workaround is to introduce a "mix" class:

from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int]):
    ...

def f(x: IterableContainer) -> None: ...

class Test(IterableContainer):
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())

but mypy complains about this

error: Argument 1 of "__contains__" incompatible with supertype "Container"

But then I have found this code snippet in #18

def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)

Which is exactly what I want, and it is also much cleaner than introducing an auxiliary "mix" class. Maybe then introducing Intersection is a good idea, @JukkaL is it easy to implement it in mypy?

By ilevkivskyi on May 6, 2016

The problem I see is that typing in Python was introduced late in the language's development, and it seems that not even the standard library is typed in a way that fully conforms to typing theory. So if we are too strict with our definitions, Intersection won't be useful to people even if it is "correct" in theory.

EDIT: Sorry, this won't directly be an issue with Intersection, as both Protocols are compatible. I just don't get why the stdlib defined __x as object here?

class Iterable(Protocol[_T_co]):
    @abstractmethod
    def __iter__(self) -> Iterator[_T_co]: ...

@runtime_checkable
class Container(Protocol[_T_co]):
    # This is generic more on vibes than anything else
    @abstractmethod
    def __contains__(self, __x: object) -> bool: ...
randolf-scholz commented 1 year ago

@mikeshardmind

In your example

main.py:17: error: Signature of "more_complicated" incompatible with supertype "A"  [override]
main.py:17: note:      Superclass:
main.py:17: note:          @staticmethod
main.py:17: note:          def more_complicated() -> int
main.py:17: note:      Subclass:
main.py:17: note:          def more_complicated(self, *args: Any) -> NoReturn

This has more to do with special casing of @staticmethod by type-checkers. The decorator prepends the missing self argument, promising not to use it. This is a statement about runtime behavior and not types.

CarliJoy commented 1 year ago

Btw: I was not able to extract the typing definition of Iterable or Container from typeshed. Maybe someone can help me here?

randolf-scholz commented 1 year ago

@mikeshardmind By the way, I would agree that the following variant should raise an error:

from typing import Never

class A:
    foo: int
class B:
    foo: str
class C(A, B):  # <-- no error should be raised here
    foo: Never
    def __init__(self, arg: object) -> None:
        self.foo = arg  # <--- an error should be raised on this line instead

What happens when object is replaced with Any might need discussion, but let's not do that here. Why should the error be raised at this line? Well, because it is only at this point we try to create an instance of Never, which is not allowed according to the second axiom that the bottom type is uninhabited.

randolf-scholz commented 1 year ago

Btw: I was not able to extract the typing definition of Iterable or Container from typeshed. Maybe someone can help me here?

They are in https://github.com/python/typeshed/blob/main/stdlib/typing.pyi

mikeshardmind commented 1 year ago

@randolf-scholz

Why should the error be raised at this line? Well, because it is only at this point we try to create an instance of Never, which is not allowed according to the second axiom that the bottom type is uninhabited.

Yeah, that's the main problem I have here. The secondary problem is that I think the definition of Never as currently adopted is actually subtly incorrect for how it is currently treated, but we can shelve that for later; I've started drafting a PEP to amend some definitions to be more accurate to current pragmatic use.

So, speaking of pragmatism...

What if we went about this like the protocol-only option, but with a little more flexibility for real-world use?

People have clearly stated they primarily care about the structural component (i.e. how it gets used). Everyone keeps trying to hand-wave away the other component of satisfying this (or we end up with definitions whose real-world implications get very complicated in some cases, and that do a better job squaring the logic but a worse job being useful, IMO). What if we just... embraced that instead?

That being:

By definition at that point, satisfying the structure is satisfying the protocol. So essentially: intersections describe how it is valid to use an object, not the type of the object, and it must be valid for an object to be used that way when it comes to assignability.

This also squares well with older comments about what is and isn't well-defined behavior of intersections. Paired with something like Omit, if you wanted to remove fields you know would conflict because you aren't using them, this suddenly looks like it satisfies what everyone has stated they want from it.

randolf-scholz commented 1 year ago

Personally I'd love it if intersections were not restricted to just Protocols, since that would allow formalizing requests like X & Proto: give me a subclass of X that also provides the following interfaces.

Currently the only way to do this is to create an abstract base class that is a subclass of X and ask users to manually subclass that class. This is undesirable for the very same reasons protocols were introduced in the first place (https://peps.python.org/pep-0544/#rationale); it goes against the idea of duck typing.

For example: I work a lot with pytorch, and effectively every component of a neural network has to be a torch.nn.Module. Certain components might have to provide additional functionality; for instance, I might want to specify an "encoder" component that is an nn.Module with both an encode and a decode method. Ideally, I'd want to write a protocol Encoder and then type hint such components as nn.Module & Encoder.
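
A minimal sketch of that Encoder protocol (the method signatures are illustrative assumptions, and nn.Module & Encoder itself is the hypothetical syntax under discussion; assumes PyTorch is installed):

from typing import Protocol
import torch

class Encoder(Protocol):
    def encode(self, x: torch.Tensor) -> torch.Tensor: ...
    def decode(self, z: torch.Tensor) -> torch.Tensor: ...

# With intersections, such a component could be annotated as
# nn.Module & Encoder instead of forcing users to subclass an ABC.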

So at the least I'd love to see intersections of arbitrary types with Protocols in order to specify mixin functionality. The need for intersections of arbitrary types with arbitrary types is probably less common.