CarliJoy / intersection_examples

Python Typing Intersection examples
MIT License
23 stars 2 forks source link

An argument against intersections. Or, having our cake, and eating it too. #38

Closed mikeshardmind closed 2 months ago

mikeshardmind commented 5 months ago

Note: This is An argument for the intentional non-implementation of pure intersections, and instead use all of what we worked out to implement an alternative option that satisfies all currently supportable use-cases, while leaving the door open to full intersections. This is not arguing to throw away everything we've worked on, but to use what we worked out to help our collective use cases, productively. I am actively inviting any constructive detraction to this, as if there is something these cannot do that are "must haves" to go this route would require determining if intersections are must-have to support those uses.

(edited) Note 2: Where you see type[*Ts] and object[*Ts]it may make more sense to haveOrderedIntersection` There's more details at the end on this now

With Thanks to @DiscordLiz , @erictraut , @gvanrossum, @mark-todd, and @NeilGirdhar in alphabetical order (and no other order of particularity) for presenting each extremely strong, and opinionated arguments as well as some feedback that led to some self-reflection on why I felt so strongly about particular outcomes of this, I believe that there is a strong argument that without other constructs (such as negation, pick, and omit), there is not a good argument for the inclusion of the theoretical intersection, and that even with those, it might only make sense for that to work with structural subtyping.

However, that leaves an awkward question, how can we best handle the use cases that are certainly valid without them such that I can say we don’t need intersections? Have we really just wasted months of work? No.

The pitch: The order matters to us and to the behavior we want, so lets go with an ordered construct that's simpler and still does what we need.

type[*Ts] and object[*Ts]

First of all, my apologies to @mark-todd as when this was originally floated as an idea in a slightly different form by him in Discord, I stated the non-equivalence to intersections and that that should be separate, and we went back to working on intersections, but there’s a real question here I think we need to answer: Do these not solve every case we care about sufficiently? As originally stated, I would say that they didn't, but I believe I made a mistake in viewing these as something we should consider separately instead of as something which might do enough while being simpler. Exploring the idea further led to a slight tweak in which they appeared to me to solve all presented use cases that intersections could in the context of the current type system.

If it does, we can use the existing explorations to justify this construct instead and leave notes on rationale for anyone who wants to explore the difference between this and “true intersections” at a later date.

Background

The primary difference between all interpretations of Intersection which anyone involved has stated support for (For this I'm mostly looking at 4, 5, and the not fully fleshed out, hypothetical 6 from #31, #37) comes into play in things that are defined on multiple operands. Put another way, the unordered nature of Intersection as a parallel construct to Unions has some very unpleasant consequences. This became especially contentious when looking at what happens in non-disjoint cases and (implicitly non-disjoin) cases involving Any.

I iterated on this several times trying to find a solution that satisfied all the complaints from multiple parties, but every time there was still some corner case that had to favor one use over another, and given that we wanted intersections to also take up an operator in the type system & the consequences for needing two different constructs to express one version or the other wasn’t sitting right with me. With some time spent reflecting on it, I realized my primary issue with favoring any solution over another was that by promoting one to use & and another being relegated to requiring an import, we were necessarily putting some needs above others when there was no consensus on which behavior would be more common, or how this would reconcile with Any.

Enter, an alternative: type[*Ts]

This is ordered. People want option 4 for type[Any & T]? Write type[Any, T]. Want option 5? Write type[T, Any]. This reflects existing behavior, including the ordered nature with subclassing of Any.

Semantics

type[*Ts] does not create a new type at runtime. It creates a virtual type that has the bases *Ts. The order is not semantically considered for nominal subtyping, but should be considered when handling diamond patterns. As a return type or annotation, the type returned must have the bases Ts. The bases do not need to be in the same order, but the types must resolve as if they were. `object[Ts]` is the equivalent formulation for an instance of something of that type, reflecting the duality between type and object at both runtime and in the type system.

This allows the following class definitions (and many non-trivial others)

class Foo(A, B):
    pass

#and 

class Bar(B, A):
    pass 

# but also

class Baz(A, B, C):
    pass

to each work for type[A, B] or instances of these classes for object[A, B].

This also allows all of the current assumptions of nominal subtyping to apply to this construct.

In the case of protocols, each must be satisfied.

Because of this, it is enough at assignment to check that each type in Ts for type[*Ts] is satisfied, as any real object being passed would be checked when defined for LSP. LSP does not need to be checked for at time of assignment. At time of use, lookup should be done in order. By deferring to the fact that LSP violations would be detected elsewhere, the ordered search should be fine.

Examples:

class X:
    def foo(self) -> int:
        ...

class Y:
    def bar(self) -> int:
        ...

class Z:
    def foo(self, arg: bool = True) -> int:
       ...

def goodcase1(x: object[X, Y]) -> tuple[int, int]:
    return x.foo(), x.bar() 

def goodcase2(x: object[Any, X]) -> Any:
    return x.foo("bar")  # Any has precedence

def goodcase3(x: object[X, Any]) -> int:
    ret = x.foo()
    reveal_type(ret)  # int, not Any, X has precedence
    x.bar()  # not on x, but allowed, see Any
    return x.foo() 

def badcase1(x: object[X, Z]) -> int:
    return x.foo(True)  # definition from X takes precedence for differing type of method foo
    # And any attempt to create this type would cause an LSP violation,
    # you can't subclass from both and have X's definition have precedence, so the error surfaces in multiple places.

def goodcase4(x: object[Z, X]) -> int:  
     return x.foo(True)  # definition from Z takes precedence for differing type of method foo
    # It's possible to create this type as well.

Does this really cover all of our use cases?

It covers all of mine, including potential future cases with type negation as a separate construct. I've given it a lot of thought, and it seems to cover every case I've seen floated for intersections, both in this repo and elsewhere when discussing intersections for python, but the trick with type[A, B] allowing the mro order to be B, A is mandatory for that to be the case

There are potential cases that I would be blind to here, but we’ve limited the scope of this to non-disjoint overlap of types with differing types involved in resolution, and this allows handling of even that scope to a very large degree.

There are potentially other cases that can come up with higher kinded types. I have not explored in depth what this would mean with the inclusion of higher kinded types in python, because there is no strong prevailing push for these to use as a framework to explore. The temptation to view HKTs as in one of the languages that has them exists, but there’s no guarantee that any of these would reflect what is added to python’s type system. I believe that this composes properly with any reasonable future interpretation of higher kinded types, but will refrain from saying that HKTs could not re-introduce the need for a fully featured intersection.

What about type safety vs pragmatism and warnings?

What about ergonomics and obviousness of behavior?

In the most commonly anticpated case, of disjoint types, the ordering won’t matter. The definition above only cares about the order when mro would change a resolved type. The complex cases where the ordering will matter should be obvious, as it will be the same as the required mro order when defining the type.

What about backwards compatibility with type[T] (singular type)?

Behavior is unchanged, the definition here is compatible with and does not change the behavior of the current use of type[T]

What about not using type and object here?

It's probably better to use a new typing.OrderedIntersection instead of object for this, and allow it to compose with type with the existing semantics.

That would make one of the examples using decorators look like this:

# I believe this is in here, but a protocol indicating supporting rich comparison if not
from useful_types import RichComparable  

def total_ordering(typ: type[T]) -> type[OrderedIntersection[RichComparable, T]]:  # see functools.total_ordering
    ...

# Any type wrapped in this is now known to be both a nominal subtype of the original type
# while also being a structural subtype of the protocol RichComparison.

@total_ordering
class Point:
    def __eq__(self, other) -> bool: ...
    def __lt__(self, other) -> bool:  ...

Having our cake and eating it too

It may be less satisfying for some here to have done all this work and to walk away not implementing intersections, but if we use this work to implement something that handles as many cases as this would, and which leaves the edge cases in the user's hands to pick the behavior they want, I think we walk away accomplishing everything we set out to do by adding intersections, by not adding intersections. If this works for everyone here, the work wasn't a waste, but informative of what the issues would be with an unordered intersection, and an exploration of what the scope of difference would be with different solutions. This remains available as a reference to those revisiting and (if we go with this) we should absolutely document for history "why not a true intersection".

mikeshardmind commented 2 months ago

@mark-todd

As it stands, I believe that intersections are likely to cause more harm than good if accepted without addressing the underlying issues around how we handle gradual types within the type system. I won't go as far to say that you shouldn't feel free to use what was discussed intentionally in the open to try and steer a way to salvage the effort, but I think we're at a point where attempts to do so are potentially misguided and a product of the sunk cost fallacy, so I would urge caution in actually doing so.

I think other productive things could be worked on that wouldn't be intersections, but which came about throughout these discussions and would help those who want them, such as type[*Ts] with fewer associated problems attached (see the semantics of types.new_class for what the behavior of that should be, and why type checkers can just assume it to be identical to a class declaration)


I'm burnt out on this. I was hopeful we could find some sort of compromise. I've worked for that compromise and balancing of various concerns through everything from people telling me I was too concerned with theory, to not enough on theory, to that I was unwilling to compromise*, only to conclude that right now there seems to be too much working against this feature without other things that are at the heart of why compromise seemed needed in the first place being addressed first.

* The only thing I've been unwilling to compromise on, is that I think we needed to be able to show that this would be a net positive for users, and would not make other typing features worse for users as a side effect. I think that's a pretty tame constraint when discussing being confident enough in a feature to propose it for standard inclusion.


I won't be detailing all of the fatal flaws you'd need to overcome right now (see above about burnout, I don't want to retread everything to go pick out every single minor detail that must be addressed), but a brief handful of major ones until I've had time to just let things be for a while are below.

If after reading that, you still think that's worth pursuing right now rather than in a future where some of the issues have been resolved ahead of intersections, I wish you the best of luck with it.

mark-todd commented 2 months ago

@mikeshardmind

Thanks for the summary, that's very useful. I actually had a slightly new idea last night based on a lot of what we've discussed and the PEP that's already been formulated. I have a feeling it would address a lot of these issues, but it's good to have some of them gathered in one place so I can potentially address previous discussions in the PEP.

It'll take a bit of time for me to flesh out this PEP, but I'm hopeful that there's a way forward - I wish you all the best.

I'll close this thread as it's getting quite long and I think I'll probably make a fresh start with the new one.