Introduce an Intersection

ilevkivskyi commented 8 years ago

This question has already been discussed in #18 long time ago, but now I stumble on this in a practical question: How to annotate something that subclasses two ABC's. Currently, a workaround is to introduce a "mix" class:

from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int]):
    ...

def f(x: IterableContainer) -> None: ...

class Test(IterableContainer):
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())

but mypy complains about this

error: Argument 1 of "__contains__" incompatible with supertype "Container"

But then I have found this code snippet in #18

def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)

Which is exactly what I want, and it is also much cleaner than introducing an auxiliary "mix" class. Maybe then introducing Intersection is a good idea, @JukkaL is it easy to implement it in mypy?

NeilGirdhar commented 3 years ago

Only if all the method parameters are equal, the return type should be the intersection, IMHO.

Sorry, I missed that. That's an even more complicated rule then: the combination of all overloads such that you have an intersection of return types when combining overloads with equal parameters. I don't see anything wrong with it. Maybe someone else can help clarify the situation.

antonagestam commented 3 years ago

@vnmabus I'm having a hard time wrapping my head around how that would work for a more complex case, like this:

class A:
    def foo(self, a: Foo) -> int: ...

class B(A):
    # It's OK for B to take a broader argument than A, it's still a valid A.
    def foo(self, a: Foo | Bar) -> int: ...

class C:
    def foo(self, a: Bar) -> str: ...

def takes_ac(val: A & C):
    bar: Bar
    # Wouldn't this now be typed as str, even though a subtype of A also takes Bar?
    val.foo(bar)  

bc: B & C
takes_ac(bc)

Put in a more generic way: since subtypes can take broader arguments than their supertypes, how could the generated overloads possibly be made complete?

vnmabus commented 3 years ago

So, the return type of foo(bar) for A & C is of course str. The return type of foo(bar) for B & C is str & int. I do not see any contradiction.

antonagestam commented 3 years ago

@vnmabus Are you saying it would be a type error to call takes_ac with B & C? E.g. that B & C is not a subtype of A & C?

vnmabus commented 3 years ago

@vnmabus Are you saying it would be a type error to call takes_ac with B & C? E.g. that B & C is not a subtype of A & C?

No, why? The return value of foo(bar) for B & C is compatible with the return value for A & C (it is narrower). Am I missing something?

antonagestam commented 3 years ago

@vnmabus The problem is that by only inspecting the involved types A and C the return type of val.foo(bar) in the example would evaluate to str. In order to know that it could also be int, because of B.foo, the type checker would have to inspect all subclasses of A and C? Perhaps that's sensible, but it sounds very strange to me. It sounds like it would require most everything that's involved in an intersection to be @final in order to not break if it's subtyped downstream.

antonagestam commented 3 years ago

Like, you can't both say that "the return type of foo(bar) for A & C is of course str", and that B & C is a valid subtype of A & C.

antonagestam commented 3 years ago

The return value of foo(bar) for B & C is compatible with the return value for A & C (it is narrower).

No, it's broader, and that breaks LSP.

NeilGirdhar commented 3 years ago

No, it's broader, and that breaks LSP.

How is it broader? int & str is narrower than str.

vnmabus commented 3 years ago

@vnmabus The problem is that by only inspecting the involved types A and C the return type of val.foo(bar) in the example would evaluate to str. In order to know that it could also be int, because of B.foo, the type checker would have to inspect all subclasses of A and C?

The problem is that is not "also" in the sense of a union, it is "in addition": it is an intersection! Thus, the return type for B & C is an int AND a str (of course, for a Bar parameter, for a Foo is only int).

antonagestam commented 3 years ago

@NeilGirdhar @vnmabus Ah, I get it now, sorry for the noise and confusion.

KotlinIsland commented 3 years ago

@KotlinIsland I think you're mistaken. They cannot be overloads since that violates LSP. And I think LSP applies because A & B needs to be usable as an A or a B.

Where I come from overloads don't break LSP, consider this example:

class Base:
    def foo(self, i: int) -> int:
        ...

class Derived(Base):
    @overload
    def foo(self, i: int) -> int: ...
    @overload
    def foo(self, i: str) -> str: ...
    def foo(self, i: int | str) -> int | str: ...

Derived is a perfectly valid Base, yet it has overloads.

When applying this to intersections:

class A(Protocol):
    def foo(self, i: int) -> int: ...

class B(Protocol):
    def foo(self, i: str) -> str: ...

class AB:
    @overload
    def foo(self, i: int) -> int: ...
    @overload
    def foo(self, i: str) -> str: ...

    def foo(self, i: int | str) -> int | str:
        ...

a: A = AB()
b: B = AB()

This too is perfectly valid, and it is actually an intersection via protocols.

This is also exactly the same way that TypeScript handles intersections.

I can't see any way that this could break LSP.

NeilGirdhar commented 3 years ago

@KotlinIsland Combining two methods as overloads in an intersection would break LSP here. In other words, if the parameter types are the same, but the return types are different.

KotlinIsland commented 3 years ago

TypeScript has this exact problem:

type Base = {
    foo(): number
}

type Derived = Base & {
    foo(): string
}

function func(d: Derived) {
    const result = d.foo() // const result: number
}

And here's an experimental PR to resolve it: https://github.com/microsoft/TypeScript/pull/30520

NeilGirdhar commented 3 years ago

@KotlinIsland Excellent link! This greatly supports @vnmabus's argument.

scranen commented 3 years ago

Is this summary correct so far?

In an intersection type A & B:

Normally in Python, methods must be unified if their name is the same
When using overloads, you only have to unify if name and argument types are the same. Edit I think @antonagestam's example shows that in the case of overloads you also need to unify if every pair of argument types has a subtype/supertype relation (with equality as a special case)
Unified methods should accept as argument types the unions of the respective argument types of the unified methods (without overloads: unifying foo(bar: Bar) and foo(baz: Baz) should result in foo(brz: Bar | Baz); with overloads this is always the union of two equal types so the union disappears)
Unified methods should return the intersection type of the unified methods' return types (unifying foo() -> Bar and foo() -> Baz should result in foo() -> Bar & Baz
Union of argument types + intersection of return types guarantees preservation of LSP
The fact that some intersection types are (set-theoretically) empty is irrelevant; it just means you will not be able to construct an element of that type because the type checker will always complain about something. (In fact this is quite elegant: you can write code that does crazy stuff with nonexistent types like str & int, but because the type is empty, this is basically safe due to reductio ad absurdum)

I'm not sure yet whether I understand the idea of InheritsFrom. I feel like there may be mixing two concepts here: using A & B as a type descriptor vs using it as a type constructor? In my mind the distinction between protocols and concrete types does not really play a role unless you are trying to construct the type A & B; describing the type is always a straightforward theoretical exercise (and may lead to a description of something that does not exist). Am I missing something?

KotlinIsland commented 3 years ago

As a side note I think it's important to point out that union and intersection types are structural, even if their components are nominal:

a: Foo & Bar = FooBar()
b: Foo = a
c: Foo | Bar = b

In this example Foo & Bar structurally extends Foo, and Foo structurally extends Foo | Bar, even though Foo and Bar are nominal types(classes).

An application of this idea highlights a difference in Kotlin and Swift: in Kotlin 'nullable' types are implemented as a union:

val a: Int? = null
a as String? // valid because Int? and String? overlap on null

Where as in swift 'Optional' types are implmented as an enum:

let a: Int? = nil
a as String? // invalid, a is an instance of `Optional<Int>` not `Optional<String>`

Optionals implementation in Swift:

enum Optional<Wrapped> {
  case Some(Wrapped)
  case None
}

erictraut commented 3 years ago

The behavior of union is well defined and intuitive for nominal types, but intersection is not. I'm going to make the claim that intersection makes sense only for structural types (protocols). Are you proposing to limit intersections to structural types? Or are you proposing that an intersection operates on the "structural equivalent of a nominal type"?

Another consideration is how to handle type variables used within an intersection. Is this allowed? Type variable constraint solving for unions is already difficult (and full of complex heuristics) when type variables are included in the union. Working out the rules for intersections that include type variables would be even more problematic.

KotlinIsland commented 3 years ago

Nominal Intersection types have been successfully implemented in Scala and Ceylon

trait Resettable:
  def reset(): Unit

trait Growable[T]:
  def add(t: T): Unit

def f(x: Resettable & Growable[String]) =
  x.reset()
  x.add("first")

This has also been proposed to be denotable in Kotlin where nominal intersection types are already supported in the type system, but are just non-denotable in source code.

KotlinIsland commented 3 years ago

In fact, Python seems to be in the same boat as Kotlin where nominal intersection types are supported by the type system, but are just non-denotable, consider:

a: Foo
if isinstance(a, Bar):
    reveal_type(a)  # Revealed type is "<subclass of "Foo" and "Bar">"

carljm commented 3 years ago

union and intersection types are structural, even if their components are nominal

I don't think this is true, certainly not for unions. If we have:

class A:
    def a(self) -> int: ...

class B:
    def b(self) -> int: ...

class C:
    def a(self) -> int: ...
    def b(self) -> int: ...

x: A | B = C()

The assignment on the last line should not be accepted, and no Python type checker accepts it; if all unions were structural types it would have to be accepted.

KotlinIsland commented 3 years ago

That's not what I mean, the components are still nominal, but the union itself is structural. Check the Kotlin/Swift example to understand what I mean.

In python an int | None has overlap with str | None(because it's structural) but in Swift Int? has no overlap with String?(because it's nominal).

Jasha10 commented 3 years ago

In python an int | None has overlap with str | None(because it's structural) but in Swift Int? has no overlap with String?(because it's nominal).

@KotlinIsland I believe you are using the words "structural" and "nominal" in a different way than how they are used in PEP 544. Based on the language in that PEP, the the type hint int | None would be considered "nominal", whereas the Protocols defined in that PEP could be used to give a "structural" type hint.

carljm commented 3 years ago

I'm going to make the claim that intersection makes sense only for structural types (protocols).

@erictraut by "makes sense" do you mean "useful enough to consider adding to the type system" or "well-defined"? I think nominal intersections are well-defined: A & B (where A and B are nominal types) is satisfied by class C(A, B): ....

carljm commented 3 years ago

the components are still nominal, but the union itself is structural

I don't know what this means; it doesn't seem to be using the terms nominal and structural in a way that I'm familiar with, and the Swift/Kotlin examples don't clarify.

EDIT: your following comment does clarify what you mean, thanks!

KotlinIsland commented 3 years ago

I believe you are using the words "structural" and "nominal" in a different way than how they are used in PEP 544.

Not really, they are the same concept, just in different forms, consider the following pseudo code:

class A(Protocol):
  a: int
  b: str
class B:
  a: int
  b: str
  c: dict
union Union1:
  int
  str
  dict
union Union2:
  int
  str

In the same way that structural types are subtyped by their members and types, structural unions are subtyped not by name, but by their members. Such that B is assignable to A - Union2 is assignable to Union1. If unions were nominal then the following would be invalid:

Foo = int | str
Bar = int | str | dict
a: Foo = 1
b: Bar = a

I think the way that Swift(and Rust I think) handle optional types really highlights this concept. Int? is a type that represents a value that could be either an Int or nil and String? could be a String or nil but they are described in a nominal way such that even though both types can be assigned the exact same value(nil), they are still entirely incompatible with each other.

DetachHead commented 3 years ago

In fact, Python seems to be in the same boat as Kotlin where nominal intersection types are supported by the type system, but are just non-denotable, consider:
a: Foo
if isinstance(a, Bar):
    reveal_type(a)  # Revealed type is "<subclass of "Foo" and "Bar">"

if intersections are already supported it seems like it shouldn't be too difficult to add the ability to denote them

NeilGirdhar commented 3 years ago

@erictraut

The behavior of union is well defined and intuitive for nominal types, but intersection is not.

Intersection is well defined. T matches X & Y iff T matches X and T matches Y. That's probably the easiest way to implement it too.

The recent discussion in this thread is about the implications for a hypothetical T that matches X and Y when X and Y both have a homonymous method.

I'm going to make the claim that intersection makes sense only for structural types (protocols).

Why would that be?

erictraut commented 3 years ago

I see. Yes, it's possible to realize a class that satisfies an intersection of nominal types using multiple inheritance. That assumes, as you've said, that there are not attributes or methods with overlapping names and conflicting types. It also assumes that there's not an MRO conflict, metaclass inconsistencies, or other compatibility issues that prevent the classes from being used together as base classes.

KotlinIsland commented 3 years ago

@erictraut All of this is already considered in mypy/pyright for their implementations of intersections:

class Foo:
    def foo(self) -> int: ...

class Bar:
    def foo(self) -> str: ...

a: Foo
if isinstance(a, Bar):  # error: Subclass of "Foo" and "Bar" cannot exist: would have incompatible method signatures
    reveal_type(a)

erictraut commented 3 years ago

I can't speak for mypy or the other Python type checkers, but pyright's implementation of isinstance type narrowing implements this case in a way that won't generalize in the way you seem to think it will. Pyright's type narrowing code synthesizes a new class type that subclasses from the two classes, which is a reasonable approximation of an intersection. This approach is sufficient to handle common use cases for a type guard, but it will not suffice if intersection types can be described in type annotations. Adding generalized support for intersection types would require a significant amount of work in all type checkers.

KotlinIsland commented 3 years ago

I understand that, I'm just saying that a lot of this groundwork has already been done, it's not like it needs to be done from scratch.

erictraut commented 3 years ago

And I'm trying to explain that almost no groundwork has been done, at least in the case of pyright. The special-case handling for isinstance type narrowing is just a few dozen lines of code. It could be replaced by a generalized solution, but that generalized solution will be expensive to implement.

patrick91 commented 3 years ago

Would intersection work with TypedDicts too?

mwerezak commented 3 years ago

How is intersection not well defined for nominal types? It seems pretty clear to me that A & B would include all nominal types that inherit from both A and B.

In my case, I ran into the need for an intersection type while working with mixins.

On Thu, Oct 28, 2021, 5:16 AM Patrick Arminio @.***> wrote:

Would intersection work with TypedDicts too?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/python/typing/issues/213#issuecomment-953663808, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB3F7BYIJITMJ662AQBRWTUJEPJNANCNFSM4CDC4G4Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

JelleZijlstra commented 2 years ago

Introducing a full Intersection type would greatly complicate type checkers and wouldn't work cleanly with a nominal type system like Python's. I wonder if we could instead implement a more limited Intersection, say limited to the following use cases:

TypedDict1 & TypedDict2 is a new TypedDict with all the keys in the two input types. If a key is present in both inputs with conflicting types, it is an error. This should be simple to implement (since TypedDict typing is already structural).
Protocol1 & Protocol2 is a new Protocol supporting all operations in the two input protocols. Similarly, it is an error if there is a conflicting type. This is already possible by creating a class subclassing the two protocols, but dedicated syntax would be simpler.
NominalType1 & Protocol1 is a new type that is mostly nominal, but also has the structural attributes from Protocol1. This would allow typing class decorators that inject attributes into the class, so it would allow a small part of the semantics of decorators like @dataclass to be expressed in the type system.

srittau commented 2 years ago

I believe that especially the first two points alone would be a great usability improvement, especially ad-hoc combining of protocols. The only thing I wouldn't want is if such a proposal would close the design space for full intersection types using the & operator. I.e. such a proposal should be forward compatible.

JelleZijlstra commented 2 years ago

The only thing I wouldn't want is if such a proposal would close the design space for full intersection types using the & operator.

I'd imagine we'd introduce an Intersection special form in typing-extensions and a & operator in Python 3.1x, but specify that type checkers would need to support only the cases listed above. Then a future PEP could extend the semantics of Intersection.

superbobry commented 2 years ago

FWIW @kmillikin, @patrick91 and me are working on a PEP for intersection types. There is no draft we are ready to share just yet, but we hope to have something more or less readable by the EOY.

@JelleZijlstra could you elaborate on how intersection types would complicate type checkers and why intersections of nominal types are problematic?

JelleZijlstra commented 2 years ago

Great to hear, I didn't know there was work on a PEP already!

Intersections would be a new concept interacting with every aspect of the type system, so every place in a type checker that does case work on different kinds of types will have to be updated. That's doable, but seems like a lot of work.

Intersection types do make some logical sense with nominal typing because multiple inheritance is possible, but multiple inheritance is generally rare, so intersection types feel like a better fit for a structural type system (as in TypeScript). I see that some other people have brought up use cases with mixins, though, so clearly there's some demand for nominal intersections too. That's something we can discuss when the PEP arrives.

joelberkeley commented 2 years ago

Introducing a full Intersection type ... wouldn't work cleanly with a nominal type system like Python's.

IIUC, intersection types in Scala 3 came out naturally as a result of Odersky's Dotty, which includes nominal types.

NeilGirdhar commented 2 years ago

Intersection types do make some logical sense with nominal typing because multiple inheritance is possible,

It's not just about multiple inheritance. Intersections come up whenever you need to use an object in two ways:

def g(x: A) -> None: pass

def h(x: B) -> None: pass

def f(x: A & B) -> None:
    g(x)
    h(h)

Yes, that includes the use of mixins, but even just two interfaces is enough.

It appears that MyPy already has some support for keeping track of intersections even though you can't specify them:

class X: pass
class Y: pass
x: X
assert isinstance(x, Y)
reveal_type(x)  # Revealed type is "a.<subclass of "X" and "Y">"!!

This is pretty impressive because the last time I checked, MyPy would reveal the type as Y 😄

patrick91 commented 2 years ago

I guess another usage is when using Protocols and you want to make sure a type conforms to protocols (withouth having to create a third one)

mwerezak commented 2 years ago

Great to hear, I didn't know there was work on a PEP already!

Intersections would be a new concept interacting with every aspect of the type system, so every place in a type checker that does case work on different kinds of types will have to be updated. That's doable, but seems like a lot of work.

Seems like an argument that intersection types should be included in the design of a type checking system from the start rather than anything about intersection types itself.

JukkaL commented 2 years ago

Seems like an argument that intersection types should be included in the design of a type checking system from the start rather than anything about intersection types itself.

I don't follow this argument. Can you elaborate? It think that the more difficult a feature is to implement/maintain, the higher the bar should be for including it. At the start of a project it seems reasonable to give priority to features with the most favorable cost/benefit ratios.

flying-sheep commented 2 years ago

I wonder if we could instead implement a more limited Intersection, say limited to the following use cases

You’re missing the common use case described in https://github.com/python/typing/issues/213#issuecomment-953454701, Nominal1 & Nominal2:

a: Foo
if isinstance(a, Bar):
    reveal_type(a)  # Revealed type is "<subclass of "Foo" and "Bar">"

mccolljr commented 2 years ago

I commented in #801 about the Not construct, but I believe this might also be a case where an intersection would be useful. Rather than add a Not[...] bound to a TypeVar, I believe changing the Callable[..., T] portion of the type signature I mention to Intersection[Callable[..., T], TheAppropriateMetaclass] would be a solution.

vidhanio commented 2 years ago

This is a feature that is surprisingly missing from the Python type system, and should definitely be worth taking a look at again.

superbobry commented 2 years ago

We are still working on a PEP for intersections with @kmillikin and @patrick91 and we hope to be able to share a draft it soon.

zmievsa commented 1 year ago

@superbobry If you guys need any help with this PEP (research/writing/proofreading/implementation), I'd love to provide it. Just email/ping me if you need anything.

The semantics for this feature that @JelleZijlstra described seem very doable for type checkers and cover all of my personal pains.

python / typing

Introduce an Intersection #213