python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

Introduce an Intersection #213

Open ilevkivskyi opened 8 years ago

ilevkivskyi commented 8 years ago

This question was already discussed in #18 a long time ago, but I have now stumbled on it in practice: how to annotate something that subclasses two ABCs. Currently, a workaround is to introduce a "mix" class:

from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int]):
    ...

def f(x: IterableContainer) -> None: ...

class Test(IterableContainer):
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())

but mypy complains about this

error: Argument 1 of "__contains__" incompatible with supertype "Container"

But then I have found this code snippet in #18

def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)

Which is exactly what I want, and it is also much cleaner than introducing an auxiliary "mix" class. Maybe then introducing Intersection is a good idea, @JukkaL is it easy to implement it in mypy?
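
Until something like Intersection exists, one way to approximate the assertIn signature above is a generic protocol that lists both methods. This is only a sketch under assumptions: the names IterableContainer, T_co and missing_report are made up for illustration, and it needs Python 3.8+ for typing.Protocol.

```python
from typing import Iterator, List, Protocol, TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)


class IterableContainer(Protocol[T_co]):
    """Hypothetical stand-in for Intersection[Iterable[T], Container[T]]."""

    def __iter__(self) -> Iterator[T_co]: ...

    # Container.__contains__ accepts any object, so this protocol does too.
    def __contains__(self, item: object) -> bool: ...


def missing_report(item: T, thing: IterableContainer[T]) -> List[T]:
    """Return the contents of `thing` if `item` is absent, else an empty list."""
    if item not in thing:
        return list(thing)
    return []


# A plain list is accepted structurally; no "mix" base class is needed.
print(missing_report(3, [1, 2]))
```

The downside is the same as with the "mix" class: every combination needs its own named protocol.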

fcoclavero commented 3 years ago

Just found myself looking for this. Any updates?

JelleZijlstra commented 3 years ago

I'm not aware of any active plans in this area. If someone wants to see this feature done, they should work on drafting a PEP and a reference implementation in one of the major type checkers.

thibaut-st commented 3 years ago

I would love that feature too, I really hope someone is working on it. But this is the only place where I see it discussed so I doubt that.

My use case is about dynamic classes created with type(). I don't want to create ~20 mixins.

Maybe it's not the place to ask, but does anyone know why it hasn't been in the Python typing module since the beginning? It looks as basic to me as the Union type.

joelberkeley commented 3 years ago

@thibaut-st have a read of the original issue https://github.com/python/typing/issues/18

mwerezak commented 3 years ago

I would be happy with Protocols as a solution also, but right now they don't work since all bases of a Protocol have to also be a Protocol.

antonagestam commented 3 years ago

@mwerezak I agree, intersections that are limited to e.g. only protocols, or one concrete type + n protocols would be useful in my opinion.

NeilGirdhar commented 3 years ago

Protocols don't solve the general problem, which is that mypy needs to keep track of intersection types internally. For example, assert isinstance needs to intersect the type--not set it. There are many examples where setting it causes problems.
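
To illustrate the isinstance point, here is a small runnable sketch (the class and function names are invented for this example). Inside the `if` branch the value is still a Readable and is additionally known to be a Closable, so a checker has to intersect the two types to accept both calls; replacing the type with Closable alone would lose `read`.

```python
class Readable:
    def read(self) -> str:
        return "data"


class Closable:
    def close(self) -> str:
        return "closed"


class File(Readable, Closable):
    pass


def drain(src: Readable) -> str:
    if isinstance(src, Closable):
        # Here `src` must be treated as "Readable and Closable":
        # both method calls below are valid at runtime.
        return src.read() + "/" + src.close()
    return src.read()


print(drain(File()))
print(drain(Readable()))
```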

srittau commented 3 years ago

Intersection types could be useful for fast ad-hoc protocols, especially IO protocols:

def foo(file: HasRead & HasSeek) -> None:
    pass
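
For comparison, this is roughly what has to be written today without `&`. It is a sketch only: HasRead and HasSeek are assumed to be small user-defined protocols (they are not in the typing module), and HasReadAndSeek and read_header are hypothetical names.

```python
import io
from typing import Protocol


class HasRead(Protocol):
    def read(self, size: int = -1, /) -> bytes: ...


class HasSeek(Protocol):
    def seek(self, offset: int, whence: int = 0, /) -> int: ...


# The named combination that `HasRead & HasSeek` would make unnecessary.
class HasReadAndSeek(HasRead, HasSeek, Protocol): ...


def read_header(file: HasReadAndSeek, size: int) -> bytes:
    file.seek(0)  # rewind first, which needs HasSeek
    return file.read(size)  # then read, which needs HasRead


print(read_header(io.BytesIO(b"hello world"), 5))
```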

thibaut-st commented 3 years ago

I see that this issue is still open. I'm not really aware of how the implementation of features in Python is selected (or where to check what's planned); does anyone know if this is in the pipeline?

I still run into situations where it would be useful now and then.

JelleZijlstra commented 3 years ago

As I wrote above, I don't think there are any active plans here. If you want to see it move forward, I'd suggest you write an implementation for your type checker of choice and a PEP.

I may add an implementation to https://github.com/quora/pyanalyze in the near future, though.

thibaut-st commented 3 years ago

I'd suggest you write an implementation for your type checker of choice and a PEP

I would love to, but I think it's beyond my development abilities.

sobolevn commented 3 years ago

We would also benefit from Intersection[] type in typed-django / django-stubs. We need to model Manager.from_queryset and Queryset.from_manager methods. Right now we have a hacky Intersection[] ad-hoc implementation just for this case: https://github.com/typeddjango/django-stubs/blob/8f97bf880d3022772581ea4cf8b5bf5297a27bad/mypy_django_plugin/transformers/managers.py#L8

KotlinIsland commented 3 years ago

You can create arbitrary Intersections of Protocols.

from typing import Protocol

class Foo(Protocol):
    def foo(self) -> None: ...
class Bar(Protocol):
    def bar(self) -> None: ...
class Impl:
    def foo(self) -> None: ...
    def bar(self) -> None: ...

# Protocol has to come last among the bases; listing it first fails with an MRO error.
class FooBar(Foo, Bar, Protocol): ...

def foo(it: FooBar):
    print(it)

foo(Impl()) # correctly typechecked

If only there was a way to specify that an instance of the protocol must have a set of bases:

class Foo: ...
class Bar: ...
class FooBar(Protocol, bases=(Foo, Bar)): # the set of bases that instances of this protocol must have in order to be considered a 'FooBar'
    ...

thibaut-st commented 3 years ago

But how much neater and easier would it be to simply do something like this?

class Foo:
    def foo(self):
        pass

class Bar:
    def bar(self):
        pass

class FooBar1(Foo, Bar):
    def far(self):
        pass

class FooBar2(Foo, Bar):
    def boo(self):
        pass

def foobar_func(foobar: Foo & Bar):  # or (foobar: Intersect[Foo, Bar])
    foobar.foo()
    foobar.bar()

But I'll wait patiently to see if someone better than me can come up with a convincing PEP. After all it's not really an issue, more like a nice to have.

joelberkeley commented 3 years ago

@KotlinIsland I don't follow. A protocol, by its very nature, doesn't demand nominal structure: it doesn't say which bases are required

@thibaut-st it can be an issue: if Foo and Bar aren't protocols and imported from a third-party library, I don't believe there's a decent workaround

KotlinIsland commented 3 years ago

@joelberkeley yeah I get that, and the issue here is intersections which would be better served with a dedicated type. Buuut if a protocol could define required bases you could create mixed nominal/structural types(although I don't know how useful that would be):

class Foo:
  def foo(self) -> None: ...
class FooAndMore(Protocol, bases=(Foo,)):  # hypothetical syntax
  def more(self) -> None: ...

class Impl(Foo):
  def more(self) -> None:
    pass

I'm sure at some point mixed structural/nominal types will be needed, this seems like a good way of doing it to me.

Or maybe just:

class Foo:
  def foo(self) -> None: ...

class More(Protocol):
  def more(self) -> None: ...

class Impl(Foo):
  def more(self) -> None:
    pass

FooAndMore = Foo & More
def eggs(f: Foo & More): ...

JoaRiski commented 3 years ago

Would have needed intersections today (and a couple times in the past), and now had to leave some parts untyped.

mwerezak commented 3 years ago

Yeah, that seems to be the primary impact of the lack of an intersection type. There are types that simply cannot be annotated without it.

JoaRiski commented 3 years ago

In my case, I have a decorator that modifies a class type that's passed into it by attaching extra metadata. As far as I know, there simply isn't a way to type that decorator properly at the moment. So now all classes using that decorator need to explicitly declare the extra metadata which the decorator will provide if they want mypy to be aware of those fields existing.
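
A minimal sketch of that situation, assuming a decorator that attaches a single attribute (with_metadata, HasMetadata and Widget are hypothetical names, not taken from any real project). The decorator can only be annotated as returning the class unchanged, so the extra attribute is invisible to the checker and has to be recovered with a cast.

```python
from typing import Dict, Protocol, TypeVar, cast

C = TypeVar("C", bound=type)


class HasMetadata(Protocol):
    metadata: Dict[str, str]


def with_metadata(cls: C) -> C:
    # Ideally the return type would express "C and also HasMetadata";
    # without intersections, only `C` can be written here.
    setattr(cls, "metadata", {"source": "decorator"})
    return cls


@with_metadata
class Widget:
    pass


# The attribute exists at runtime, but the checker only learns about it via cast().
widget_meta = cast(HasMetadata, Widget).metadata
print(widget_meta)
```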

ruancomelli commented 2 years ago

What should be the expected behavior in the case of conflicting types? For example:

class A(Protocol):
    def foo(self) -> int: ...

class B(Protocol):
    def foo(self) -> str: ...

def call_foo(x: A & B):
    return x.foo() # oh no, x.foo() must be both an `int` and a `str`!

mwchase commented 2 years ago

Presumably, it would be int & str, which doesn't work (at least at runtime), but other base classes should fare better when intersected.

thibaut-st commented 2 years ago

What should be the expected behavior in the case of conflicting types? For example:

class A(Protocol):
    def foo(self) -> int: ...

class B(Protocol):
    def foo(self) -> str: ...

def call_foo(x: A & B):
    return x.foo() # oh no, x.foo() must be both an `int` and a `str`!

I don't really see the issue, as it's the same case as multiple inheritance with the same method name. Personally, I would use the same behavior: the first type takes precedence (in this example, the type of x.foo() in call_foo(x: A & B) would be int).

antonagestam commented 2 years ago

@thibaut-st I think that would break the Liskov Substitution Principle. An object of type A & B should be expected to behave both as an A and as a B. The only way to fulfill that would be for foo to return int & str.

For the same reason mypy gives a type error if you try to subclass A and B. See https://mypy-play.net/?mypy=latest&python=3.10&gist=23c0fe765069a25d7f6c4905483ab351

reinhrst commented 2 years ago

UPDATE:

It's actually been too long since I thought about this... The reply below doesn't make sense at all. An Intersection as discussed here is an intersection like in PEP 483:

Intersection[t1, t2, ...]. Types that are subtype of each of t1, etc are subtypes of this. (Compare to Union, which has at least one instead of each in its definition.)

So not a set intersection at all.

In this case I agree with @antonagestam -- it should just be disallowed if the things intersecting are not compatible, like in (multiple) inheritance.

Original (incorrect, for historical accuracy only):

I would make it work like in set theory:

A is the set {foo(self) -> int} and B the set {foo(self) -> str} (definitely simplifying here). The intersection is the empty set (so equal to class C(Protocol): pass). In this case probably you could want mypy to give a warning, however if both A and B share another method or property, that should be kept.

class A(Protocol):
    x: int
    def foo(self) -> int: ...

class B(Protocol):
    x: int
    def foo(self) -> str: ...

class C(Protocol):
    x: int

So in this case A & B will be equal to C (for typing purposes).

I do feel this opens the door for many more corner cases.....

class Parent: ...

class Child(Parent): ...

class A(Protocol):
    def foo(self) -> Parent: ...
    def baz(self) -> Optional[int]: ...
    def bar(self, *args) -> int: ...

class B(Protocol):
    def foo(self) -> Child: ...
    def baz(self) -> int: ...
    def bar(self, x: int) -> int: ...

I do feel there are probably good answers for all of them if we think about them enough.

antonagestam commented 2 years ago

@reinhrst I agree, it probably makes a whole lot more sense to just forbid intersections of incompatible types, instead of expecting a return type of int & str. It would be interesting to compare this with other languages. What does TypeScript do for instance?

KotlinIsland commented 2 years ago

@reinhrst I agree, it probably makes a whole lot more sense to just forbid intersections of incompatible types, instead of expecting a return type of int & str. It would be interesting to compare this with other languages. What does TypeScript do for instance?

type StrNum = string & number

type IsItNever = StrNum extends never ? true : false

It becomes never. In this example, the IsItNever type is true only if StrNum is never (NoReturn in Python land), and it is true.

KotlinIsland commented 2 years ago

Although I think what's happening there is very different from Python. In TypeScript, string and number are incompatible (and therefore become never) from a type standpoint because they are effectively final. In Python there is theoretically nothing stopping a type being both of those types:

class StrInt(str, int):
    pass

This specific type is invalid, as the definitions in str and int are incompatible (Definition of "__gt__" in base class "str" is incompatible with definition in base class "int" etc.), and it will fail at runtime due to a layout incompatibility (TypeError: multiple bases have instance lay-out conflict).

Due to that fact I would still expect str & int to become NoReturn but not for the same reasons as in TypeScript.

antonagestam commented 2 years ago

In python there is theoretically nothing stopping a type being both of those types

I thought so too, but that actually fails at runtime:

>>> class StrInt(str, int): ...
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: multiple bases have instance lay-out conflict

intgr commented 2 years ago

Note that there actually is an overlap between Python's int and str objects: they both inherit from the object base class.

I think accessing __doc__ and __str__ of an int & str type should be allowed -- so maybe the type checker should only error when using attributes that have conflicting definitions, like __gt__?

However, if there are conflicting attributes, casting int & str back to say str would erase those conflicts. So maybe a downcast from Intersection back to a component type also needs to ensure there are no conflicting attributes?

KotlinIsland commented 2 years ago

@antonagestam And at type-checking time too; I spoke to that in that comment.

antonagestam commented 2 years ago

@KotlinIsland Sorry about that, have to blame it on the current day of the week ;)

@intgr But, the whole point of having intersection types is to be able to do this:

def takes_a(val: A): ...
def takes_b(val: B): ...

def takes_ab(val: A & B):
    takes_a(val)
    takes_b(val)

As long as A and B are compatible, that needs to be completely valid code in order for intersections to be useful. There shouldn't have to be any narrowing or casting in any of the involved functions in that example. So if the type checker knows that A and B aren't compatible, it doesn't really make sense to not error in the definition of takes_ab. And since takes_a and takes_b can be used individually with instances of A and B, it doesn't make sense for there to be a type error within those definitions.

thibaut-st commented 2 years ago

@thibaut-st I think that would break the Liskov Substitution Principle. An object of type A & B should be expected to behave both as an A and as a B. The only way to fulfill that would be for foo to return int & str.

For the same reason mypy gives a type error if you try to subclass A and B. See https://mypy-play.net/?mypy=latest&python=3.10&gist=23c0fe765069a25d7f6c4905483ab351

Yes, but you can use Python and typing without mypy, and in Python you absolutely can subclass the example. But it was just the first thing that popped into my mind; I guess there are better solutions (or if not better, alternative ones).

ruancomelli commented 2 years ago

My intuition tells me that intersection types should behave the same way as if we were to create a new class/protocol using @overloads. For instance, in the next example, C should be equivalent to A & B:

class A(Protocol):
    age: int
    verbose: int
    element: Proto1 # `Proto1` is an arbitrary protocol

    def foo(self, x: int) -> float: ...

class B(Protocol):
    age: int
    verbose: bool
    element: Proto2 # `Proto2` is an arbitrary protocol
    name: str

    def foo(self, x: str) -> None: ...

    def g(self, y: int) -> int: ...

class C(Protocol):
    age: int # `age` is found in both `A` and `B` with type `int`; nothing wrong here
    verbose: bool # the intersection between `bool` and `int` is `bool`
    element: Proto1 & Proto2 # for arbitrary member variables, we calculate their intersection type
    name: str # since `name` is found in `B`, it is required here

    # any subtype of `A & B` must support the following overloads:
    @overload
    def foo(self, x: int) -> float: ...

    @overload
    def foo(self, x: str) -> None: ...

    # `g` is found in `B`, so it must be present here as well:
    def g(self, y: int) -> int: ...

So far so good, I think we all agree here, right? Now the question is what to do whenever we find "incompatible" overloads, as in:

class A(Protocol):
    def foo(self) -> int: ...

    def bar(self, x: int) -> bool: ...

class B(Protocol):
    def foo(self) -> str: ...

    def bar(self, x: bool) -> int: ...

My opinion is that an error should be raised here, as happens when you try to @overload foo and bar with the signatures shown above.

Regarding the int & str example, I believe that this should be equivalent to trying to subclass both of them with class IntStr(int, str). Since IntStr(int, str) fails, int & str should fail too.
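
For the compatible case above (the `foo(int)` / `foo(str)` pair), here is a runnable sketch of what "must support both overloads" looks like for an implementation. TakesInt, TakesStr, Impl, use_int and use_str are names I made up; they mirror only the `foo` part of A and B.

```python
from typing import Optional, Protocol, Union, overload


class TakesInt(Protocol):
    def foo(self, x: int) -> float: ...


class TakesStr(Protocol):
    def foo(self, x: str) -> None: ...


class Impl:
    # One class satisfying both protocols by overloading `foo`.
    @overload
    def foo(self, x: int) -> float: ...
    @overload
    def foo(self, x: str) -> None: ...
    def foo(self, x: Union[int, str]) -> Optional[float]:
        if isinstance(x, int):
            return x / 2
        return None


def use_int(obj: TakesInt) -> float:
    return obj.foo(3)


def use_str(obj: TakesStr) -> None:
    return obj.foo("a")


print(use_int(Impl()), use_str(Impl()))
```

The same Impl instance is passed where either protocol is expected, which is the property an `A & B` annotation would state directly.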

NeilGirdhar commented 2 years ago

@ruancomelli If subclassing is the appropriate metaphor, then AB(A, B) will choose methods from A first, right? In that case, another option is for A & B to be an ordered operator that chooses methods from A in case of conflict.

That said, the error in case of conflict idea makes sense too.

antonagestam commented 2 years ago

In that case, another option is for A & B to be an ordered operator that chooses methods from A in case of conflict.

Well, that makes A & B incompatible with B, and my simple example above would fail:

def takes_a(val: A): ...
def takes_b(val: B):
    # Since `B.foo()` returns `str` we should expect `ret` to be `str` here, but because we allowed
    # incompatible types, e.g. AB(A, B), it will be int.
    ret = val.foo()

def takes_ab(val: A & B):
    takes_a(val)
    takes_b(val)
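
A runnable version of the same argument, as a sketch: A and B follow the conflicting `foo` example from earlier in the thread, and the `.upper()` call and the `outcome` variable are my own additions to make the failure observable.

```python
class A:
    def foo(self) -> int:
        return 1


class B:
    def foo(self) -> str:
        return "one"


class AB(A, B):  # mypy reports the conflicting `foo` definitions here
    pass


def takes_b(val: B) -> str:
    # Relies on B's promise that foo() returns a str.
    return val.foo().upper()


try:
    takes_b(AB())  # AB is a subclass of B, but foo() comes from A
    outcome = "ok"
except AttributeError:
    outcome = "broken"

print(outcome)
```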

NeilGirdhar commented 2 years ago

@antonagestam You're right. The error makes more sense then. (And therefore it's not as simple as relating it to inheritance.)

ruancomelli commented 2 years ago

Well, perhaps my example worked against me... It seems more than intuitive that A & B should be exactly the same as B & A, in which case an ordered operator is unsuitable. Intuitively we should have issubtype(X, A & B) iff issubtype(X, A) and issubtype(X, B). For concrete classes A and B, this would also be equivalent to issubclass(X, A) and issubclass(X, B). All of this seems to be incompatible with the way multiple inheritance works in Python.

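Perhaps we are mixing two concepts here, each one requiring its own operator?

* an `Intersection[A, B]` (or `A & B`) that would require `A & B` to be replaceable by both `A` and `B` everywhere. This operator would satisfy @antonagestam's example's requirements, but would raise a type error for e.g. `int & str` since they are incompatible with each other.

* an `InheritsFrom[A, B]` (please choose a better name) that would mean "anything that derives from `A` and `B`, in that order". So `AB = InheritsFrom[A, B]` would be equivalent to subclassing `class AB(A, B)`, considering method overriding and everything.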

For now, I would focus only on the Intersection operator; the InheritsFrom looks a bit more niche and error-prone.
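
For concrete classes, the "issubclass of each" rule can be checked at runtime. The helper below is a sketch with a made-up name (is_subclass_of_all); it is not an existing typing API, and it says nothing about protocols or other typing constructs.

```python
def is_subclass_of_all(cls: type, *bases: type) -> bool:
    """True if `cls` is a subclass of every class in `bases`."""
    return all(issubclass(cls, base) for base in bases)


class A: ...
class B: ...
class AB(A, B): ...
class BA(B, A): ...


# Base order in the class definition and argument order in the check are irrelevant.
print(is_subclass_of_all(AB, A, B), is_subclass_of_all(BA, A, B))
print(is_subclass_of_all(AB, B, A), is_subclass_of_all(A, A, B))
```

Because the check is a conjunction, it is symmetric in A and B, unlike the method resolution order of `class AB(A, B)`.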

KotlinIsland commented 2 years ago

class A:
    def foo(self, a: Foo) -> Baz: ...

class B:
    def foo(self, b: Bar) -> Qux: ...

ab: A & B
reveal_type(ab.foo)  # Callable[[Foo & Bar], Baz | Qux]

I would think that when two classes are intersected, their methods would merge and the input parameters would become intersections, and their outputs would become unions.

This is assuming that the type operator & is commutative and not ordered.

henribru commented 2 years ago

class A:
    def foo(self, a: Foo) -> Baz: ...

class B:
    def foo(self, b: Bar) -> Qux: ...

ab: A & B
reveal_type(ab.foo)  # Callable[[Foo & Bar], Baz | Qux]

I would think that when two classes are intersected, their methods would merge and the input parameters would become intersections, and their outputs would become unions.

This is assuming that the type operator & is commutative and not ordered.

This breaks LSP. An A & B should be useable as an A, but Callable[[Foo & Bar], Baz | Qux] isn't compatible with Callable[[Foo], Baz]

thibaut-st commented 2 years ago

Well, perhaps my example worked against me... It seems more than intuitive that A & B should be exactly the same as B & A, in which case an ordered operator is unsuitable. Intuitively we should have issubtype(X, A & B) iff issubtype(X, A) and issubtype(X, B). For concrete classes A and B, this would also be equivalent to issubclass(X, A) and issubclass(X, B). All of this seems to be incompatible with the way multiple inheritance works in Python.

Perhaps we are mixing two concepts here, each one requiring its own operator?

* an `Intersection[A, B]` (or `A & B`) that would require `A & B` to be replaceable by both `A` and `B` everywhere. This operator would satisfy @antonagestam's example's requirements, but would raise a type error for e.g. `int & str` since they are incompatible with each other.

* an `InheritsFrom[A, B]` (please choose a better name) that would mean "anything that derives from `A` and `B`, in that order". So `AB = InheritsFrom[A, B]` would be equivalent to subclassing `class AB(A, B)`, considering method overriding and everything.

For now, I would focus only on the Intersection operator; the InheritsFrom looks a bit more niche and error-prone.

I'm sure I'm missing something, but I can't figure out a way something would match A & B without being an inheriting class of (A, B). (and so, why intersection and inheritsFrom would differ)

KotlinIsland commented 2 years ago

this breaks LSP

Oh right, my bad. The methods would join as overloads.

class A:
    def foo(self, a: Foo) -> Baz: ...

class B:
    def foo(self, b: Bar) -> Qux: ...

ab: A & B
reveal_type(ab.foo)  # overloaded method: (Foo) -> Baz and (Bar) -> Qux

NeilGirdhar commented 2 years ago

@KotlinIsland I think you're mistaken. They cannot be overloads since that violates LSP. And I think LSP applies because A & B needs to be usable as an A or a B.

NeilGirdhar commented 2 years ago

Thinking about this a bit more, should this be allowed?

class A: pass
class B: pass
class AB(A, B): pass

class X:
    def f(self, x: A) -> A:
        ...
class Y:
    def f(self, x: B) -> B:
        ...

class XY(X, Y):
    ...

def f(x_and_y: X & Y, ab: A & B) -> A & B:
    return x_and_y.f(ab)

f(XY(), AB())  # okay.

In other words, intersecting two classes intersects all the methods, which means takes the intersection of all their parameters and return values. In this way, an A & B is usable as an A or a B.

vnmabus commented 2 years ago

Not really. Someone mentioned before that methods join as overloads, which is more accurate and less restrictive. Only if all the method parameters are equal, the return type should be the intersection, IMHO.

ruancomelli commented 2 years ago

I'm sure I'm missing something, but I can't figure out a way something would match A & B without being an inheriting class of (A, B)

@thibaut-st if A or B (or both) are protocols or typing constructs, you won't necessarily be able to subclass them (take A = Union[int, float], for instance). But yes, for concrete types A and B, I believe that A & B must inherit from both A and B, even if indirectly.

and so, why intersection and inheritsFrom would differ

Disclaimer: InheritsFrom is just a draft idea, I don't know if it makes sense to have this. But the difference is that, given

class A:
    def foo(self, x: int) -> None: ...

class B:
    def foo(self, x: str) -> None: ...

class C:
    def foo(self, x: int) -> str: ...

NeilGirdhar commented 2 years ago

@ruancomelli Is there a realistic use case of InheritsFrom? I think it's a good thought exercise, but most people who want the feature in this thread want Intersection.

NeilGirdhar commented 2 years ago

Not really. Someone mentioned before that methods join as overloads, which is more accurate and less restrictive. Only if all the method parameters are equal, the return type should be the intersection, IMHO.

Methods don't join as overloads for an intersection since that's an LSP violation.

vnmabus commented 2 years ago

Methods don't join as overloads for an intersection since that's an LSP violation.

Care to elaborate? Expanding the accepted signatures should not break LSP. Do you have a particular example in mind?

NeilGirdhar commented 2 years ago

@vnmabus What about this?

class A:
    def foo(self, a: Foo) -> Baz: ...

class B:
    def foo(self, a: Foo) -> Qux: ...

ab: A & B
reveal_type(ab.foo(foo))  # ??

If it were an A, it would promise a reveal of Baz, if it were a B, it would promise Qux. I think it should promise Baz & Qux, whereas you're suggesting it should be Baz | Qux. (Edit: we both agree, and I misunderstood.)
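
A small runnable sketch of the "same parameters, intersected return type" case. Foo, Baz and Qux are given empty bodies here, and BazQux and the overriding AB.foo are my own additions: an object can be used as both an A and a B only if its foo returns something that is both a Baz and a Qux.

```python
class Foo: ...
class Baz: ...
class Qux: ...


class BazQux(Baz, Qux): ...


class A:
    def foo(self, a: Foo) -> Baz:
        return Baz()


class B:
    def foo(self, a: Foo) -> Qux:
        return Qux()


class AB(A, B):
    # The only override compatible with both bases returns a Baz that is also a Qux.
    def foo(self, a: Foo) -> BazQux:
        return BazQux()


result = AB().foo(Foo())
print(isinstance(result, Baz), isinstance(result, Qux))
```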

vnmabus commented 2 years ago

I took that into account:

Only if all the method parameters are equal, the return type should be the intersection, IMHO.