Intersection PEP - Githubissues

CarliJoy / intersection_examples

Python Typing Intersection examples

MIT License

33 stars 2 forks source link

Intersection PEP #49

Open mark-todd opened 7 months ago

mark-todd commented 7 months ago

PR relating to https://github.com/CarliJoy/intersection_examples/issues/48. Still some work to be done on this, but feel free anyone to have a review if you'd like @CarliJoy @erictraut @NeilGirdhar @gvanrossum @DiscordLiz @vergenzt . It builds a lot on the work of @mikeshardmind, but I've tried to thin it down a bit and add some more examples. It also approaches the problem from a slightly different angle conceptually, but the result is somewhat similar.

Broadly the main update I'm planning is to finish the intersect simulator and link it as the reference implementation (Proof of Concept I suppose), then we can move to opening a discussion related to the PEP on python discuss.

I'll also add something discussing the equivalence of Callable to certain Protocols and maybe add an example.

Finally, this issue is still outstanding: https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981, but my hope is that this may not be critical to finalising the PEP. The behaviour as far as Intersection is concerned is now well defined under the PEP, and I'm thinking that if this proves to be a major issue perhaps it will arise during later discussion.

mark-todd commented 7 months ago

I've now updated to resolve the issues above - any comments please take a look. If there are no further comments, next week I'll read it over again, then open an issue on the discuss forum for a draft PEP.

The main thing I think this might be missing is potentially some of the rejected ideas (as there were many!), but I've tried to summarise any major ones.

As far as I know, the implementation and specification are complete in that it seems to resolve all outstanding issues I'm aware of - if there's anything I've missed here please let me know.

Quite excited to think we may actually have a fully fledged specification - getting there!

DiscordLiz commented 7 months ago

This approach has the same flaw with __getattr__ that the prior did, it's just hidden differently. You'll want to explore the test case that led to the prior approach falling apart: https://github.com/python/typing/pull/1672#discussion_r1550639648

@ mikeshardmind had a simple formulation that still works with Any, but based on his most recent messages here, it seems like he's not going to advocate for the simpler option remaining. As I understand it, he'd rather intersections not be added until the underlying issues are resolved.

A :> Intersection[T1, T2, ... Tn] if and only if A :> T1 and A :> T2 .... and A :>Tn A <: Intersection[T1, T2, ... Tn] if and only if A <: T1 and A <: T2 .... and A <:Tn

That's the entire definition of what can and can't be inferred from an intersection expressed in 2 short lines describing subtyping and supertyping relationships. He didn't like the implications of trying to combine this with a bad definition of Any and tried to work around that. It being independent of the definition of Any and having no special casing might mean we can ignore Any having a bad definition and just get something that works for most people. If it doesn't work for someone, they can pick up improving the definition of Any from the case where it doesn't work.

mark-todd commented 7 months ago

@DiscordLiz I added this section: https://github.com/CarliJoy/intersection_examples/blob/intersection-pep-new/docs/specification.rst#any-subclassing-rules

to try and resolve this. Broadly my thought it that by relaxing the inheritance restriction on Any by making it a structural type, it doesn't matter if Intersection diverges from the way subclassing with Any currently works. There is a potential ambiguity in the way that methods with __getattr__ and things that inherit from Any work, but by considering C and D below as different types we go around this issue:

class A:
    foo: str = "foo"

class C(Any, A):
    pass

D: Intersection[Any, A]

There nothing to say that objects of type C and D above have to behave in exactly the same way. D has all attributes as Any and must inherit from A, while C inherits from A and has all attributes as Any except foo which is str. In this case C is a subtype of D, but D is not a subtype of C.

DiscordLiz commented 7 months ago

There is a potential ambiguity in the way that methods with getattr and things that inherit from Any work, but by considering C and D below as different types we go around this issue:

We were always considering an intersection as different to a specific class declaration. Simply saying we do this doesn't solve the problems that were raised around this.

I don't know how to explain this any better, you're still conflating parts of the type system that should not be conflated after multiple explanations throughout the discussions, and it seems to be having an impact on what you are writing ending up missing the problem. Maybe someone else can explain better.

What type checkers need are the subtyping rules. I have seen no indication that the extra complexity you have here solves any problems. It does creates new ones, treating Any as a structural type rather than Any means you need to define it's behavior as a structural type without breaking any more fundamental rule and having a decidable interface and rules for how type checkers address that in an intersection. You have not provided something that accomplishes that.

The focus on splitting inheritance and structural types has you missing the more fundamental rule that both must abide by. Both of these imply rules for subtyping relationships. It would be better to focus on a consistent subtyping relationship that does not try to treat either of these differently. You can refer to my last post for the minimal relationships to define an intersection. I think any proposal that does not use these needs to show definitively why it behaves better. The prior attempt did not actually behave better than those, and was abandoned.

You also retained a section on ordering, but intersections don't have to be ordered at all, that was a synthetic requirement of the previous attempt that we showed definitively doesn't solve the problem, it's unclear why you've retained this as none of the new language resolves those issues either.

mark-todd commented 7 months ago

I don't know how to explain this any better, you're still conflating parts of the type system that should not be conflated after multiple explanations throughout the discussions, and it seems to be having an impact on what you are writing ending up missing the problem. Maybe someone else can explain better.

@DiscordLiz Apologies, I am trying to understand this - what parts of the type system am I conflating?

What type checkers need are the subtyping rules. I have seen no indication that the extra complexity you have here solves any problems.

Broadly, I'm trying to define a set of rules that would be easy to implement for type checkers. I think the sets of rules you define here:

A :> Intersection[T1, T2, ... Tn] if and only if A :> T1 and A :> T2 .... and A :>Tn A <: Intersection[T1, T2, ... Tn] if and only if A <: T1 and A <: T2 .... and A <:Tn

...are not in conflict with the specification, but I think they don't cover all situations. We've discussed before non-LSP violating method clashes that can occur, and these rules don't really show how to resolve that.

The focus on splitting inheritance and structural types has you missing the more fundamental rule that both must abide by. Both of these imply rules for subtyping relationships.

This is true - I'm really trying to break down what it means to be a subtype into rules that will be easier to implement, and resolve unspecified cases at the same time. Also, I think the statement about Any being a structural type may already be effectively present. Suppose we had:

class A:
    def foo(self):
        pass

def test(x: T) -> Intersection[A, T]:
    class New(type(x), A):
        pass
    return New()

class B:
    def bar(self):
         pass

b: Any = B()

x = test(b) # Intersection[Any, A]

So here we have created an Any intersection, as let's say B was an unknown type.

Based on the ruleset: A subtype of x must be a subtype of A and a subtype of Any

Subtype of A

A subtype of A, must inherit from A and have attribute foo.

Subtype of Any

A subtype of Any, must inherit from Any and have all attributes as Any

Now here we have an issue - B does not inherit from Any, and neither does x. My point is that subtypes of Any do not have the inheritance restriction, so it seems it's already been defined as a structural type.

If there is a PEP somewhere I've missed that lays out what it means to be a subtype I'd be interested to read it.

You also retained a section on ordering, but intersections don't have to be ordered at all, that was a synthetic requirement of the previous attempt that we showed definitively doesn't solve the problem, it's unclear why you've retained this as none of the new language resolves those issues either.

I've summarised a few of the issues the ordered nature resolves: https://github.com/CarliJoy/intersection_examples/blob/intersection-pep-new/docs/specification.rst#ordering

we showed definitively doesn't solve the problem

Could you point me to this? I know there was discussion about the issue with __getattr__ and Any, but there are other advantages to it being ordered.

DiscordLiz commented 7 months ago

Could you point me to this? I know there was discussion about the issue with getattr and Any, but there are other advantages to it being ordered.

Type information is used for more than just static type checking. It’s also used for runtime type checking and for edit-time features like completion suggestions and signature help. For every Python developer who uses pyright for static type checking, there are 40x as many developers who do not use static type checking but rely on static type evaluation for language server features. These developers are more likely to be using untyped or partially-typed code, so they will more often hit the case where a class has an unknown base class. If we were to change the spec to treat all methods and attributes of such a class as Any, it would significantly harm their user experience — enough so that I don’t think I could justify making such a change in pyright regardless of what the typing spec says.

From a message explaining why a spec that just turns everything into Any in such types would be harmful to users: https://discuss.python.org/t/take-2-rules-for-subclassing-any/47981/5

It's also not even correct to do so.

class A:
    foo: int

B = Any

C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()

c.foo  # int

The ordering was only ever a bad compromise, and the reason Any was shuffled to the end in the attempt at that was to specifically avoid this issue. foo here can't be Any, as any type that isn't int would have been a type error.

DiscordLiz commented 7 months ago

The focus on splitting inheritance and structural types has you missing the more fundamental rule that both must abide by. Both of these imply rules for subtyping relationships.

This is true - I'm really trying to break down what it means to be a subtype into rules that will be easier to implement, and resolve unspecified cases at the same time. Also, I think the statement about Any being a structural type may already be effectively present.

Any isn't a structural type. It's a concept that says "some compatible type is present here". I cannot stress enough how important the different between a higher order concept to express an uknown is different from a structural type that expresses a structure.

Casting to and from Any is a free operation that doesn't require an explicit cast or checking, the type system assumes compatibility as that's the purpose.

You don't have to break this down into "Easier to implement", type checkers already need to have the facilities to handle subtyping relationships, and doing so is causing important things to be gotten wrong, not all simplifications preserve the truth.

mark-todd commented 7 months ago

@DiscordLiz Thanks for the above - I'm working on something to express the same concepts in the current PEP in terms of subtyping, just a quick query on this bit:

class A:
    foo: int

B = Any

C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()

c.foo  # int

The ordering was only ever a bad compromise, and the reason Any was shuffled to the end in the attempt at that was to specifically avoid this issue. foo here can't be Any, as any type that isn't int would have been a type error.

Does this have to be the case? I don't see why c.foo couldn't be Any here, e.g.:

C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C()

c.foo  # Any

D: type[Intersection[A, B]] = ...
d: Intersection[A, B] = D()

d.foo  # int

What's wrong with the above?

DiscordLiz commented 7 months ago

Does this have to be the case? I don't see why c.foo couldn't be Any here, e.g.:

Any allows more than int, this is an unsafe widening of the type and should not be done by the type system. If C is a subtype of A, then C.foo can't not be compatible with int. Any is special. It's compatible but the way it is compatible is a problem to do this kind of transformation with in a way that looses known type information, becase now from the type system perspective, you're allowed to assign anything to it, even things which break the rules for things that are a subtype of A

mark-todd commented 7 months ago

I want to avoid retreading too much ground here, but with a simple modification:

class A:
    def __init__(self, x: int) -> None:
        pass

B = Any

C: type[Intersection[B, A]] = ...
c: Intersection[B, A] = C(1)

If C is a subtype of A, then C.foo can't not be compatible with int.

This statement applies in the case I originally gave, but in the case above that doesn't necessarily apply (As __init__ can take any form). By making the Intersection ordered we could just make the one with the defined __init__ take priority by switching A and B, so the issue goes away.

Any allows more than int, this is an unsafe widening of the type and should not be done by the type system

The ordering effectively shifts the responsibility to the user to make sure they don't create effects like this unless that's what they want to do. If the result should always have an attribute foo of type int, then that means this was a mistake, and the order should be the other way around. It shifts responsibility but it also shifts control to the user.

DiscordLiz commented 7 months ago

The ordering effectively shifts the responsibility to the user to make sure they don't create effects like this unless that's what they want to do.

We established we can't ban Any from intersections because the user might not have control over where they come in (typevars) The user does not have that capability to avoid this problem. The prior ordered form avoided this by moving Any last, always, but that was flawed. You're doing something far worse with this, leaving a loaded gun pointed all all users of gradual typing.

DiscordLiz commented 7 months ago

And even if the user could, why should the tools that are meant to get typing right intentionally get it wrong? That means users need to be more of an expert on type checking than type checker maintainers.

mark-todd commented 7 months ago

We established we can't ban Any from intersections because the user might not have control over where they come in (typevars)

So for typevars consider the following:

There are two ways the below function could be implemented:

class X:
    foo: int
T = TypeVar("T")
def foo(x: T) -> Intersection[X, T]:
    ...

In the above if the function is used with Any, then the result will always have the attribute foo.

class X:
    foo: int
T = TypeVar("T")
def foo(x: T) -> Intersection[T, X]:
    ...

In the above if the function is used with Any, the result may not have the attribute foo, if T overwrites this.

What I'm saying is that offering these options means the function designer chooses which way they want to go with it. They may not control the type of T, but they choose the order in the intersection.

And even if the user could, why should the tools that are meant to get typing right intentionally get it wrong? That means users need to be more of an expert on type checking than type checker maintainers.

I'm not so convinced that there is a correct order for these objects. And the PEPs how to teach section puts a large emphasis on the ordering being integral to the way this is taught, so really to use it at all a user would be familiar with this concept.

mark-todd commented 7 months ago

You're doing something far worse with this, leaving a loaded gun pointed all all users of gradual typing.

Is there a particular use case you're imagining here that would negatively impact these users? I've thought about quite a few different ways this could be used, but so far haven't found anything drastic - I'd be interested to know if there's something I haven't considered.

DiscordLiz commented 7 months ago

Is there a particular use case you're imagining here that would negatively impact these users? I've thought about quite a few different ways this could be used, but so far haven't found anything drastic - I'd be interested to know if there's something I haven't considered.

Returning an intersection to a user, as in mixins and class decorators, the only cases we have for intersection that can't already be solved with just protocols.

mikeshardmind commented 7 months ago

I think this should be required reading for anyone working on this, as it is another language with gradual typing adding a full type system. https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradual-typing/, running into the same problems we have with Any (with Elixir's dynamic), and finding solutions that preserve useful types for users.

I also think I did a bit of a disservice by stepping away when I did without better detailing the problems of the ordered intersection, but I needed to take a few days for myself.

It is possible to reduce typechecker complexity to remain linear rather than polynomial here without introducing an ordering or giving up on the subtyping-based outcome, it's just going to be more complicated to work on. That benefit was not unique to having an ordering, it was just more obviously available as an easy optimization that could be made if the ordering was sound. The ordering isn't sound.

If you want to add intersections without user denotable type bounds, cutting to the chase:

T & Any is irreducible.
x where x is T & Any must not violate the rules of T even with the abilities of Any, it is a logical and
Callable signatures should not be reduced either. (int) -> int & (str) -> str should be left as-is, then you only need rules for evaluating if a function signature is compatible with each (again, a logical and)

With user-denotable type bounds, there's more powerful ways to express this that allow avoiding some sharp edges around Any.

mark-todd commented 7 months ago

I think this should be required reading for anyone working on this, as it is another language with gradual typing adding a full type system. https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradual-typing/, running into the same problems we have with Any (with Elixir's dynamic), and finding solutions that preserve useful types for users.

@mikeshardmind Thanks, I'll have a read

It is possible to reduce typechecker complexity to remain linear rather than polynomial here without introducing an ordering or giving up on the subtyping-based outcome, it's just going to be more complicated to work on. That benefit was not unique to having an ordering, it was just more obviously available as an easy optimization that could be made if the ordering was sound.

I can see that there might be a way to resolve an unordered intersection in linear time, but the simplicity of implementing an ordered intersection is still there.

As you noted in your closing points though the __init__ clashes are still an issue, and currently only solved by ordered intersection. I also have a feeling that there will be lots of code that features LSP violations, as this is quite a new feature, so it might be handy to be able to support these too.

Callable signatures should not be reduced either. (int) -> int & (str) -> str should be left as-is, then you only need rules for evaluating if a function signature is compatible with each (again, a logical and)

We're also back to this whole thing of evaluating when Callables are compatible, which may be not trivial. Secondly, I'm not sure how this intersection of callables would be implemented, we end up back at how to generate overloads when merging classes.

The ordering isn't sound.

As far as I can tell there's a whole heap of issues that get solved with ordering, and the problems it creates seem pretty minimal. At least, I haven't seen any compelling examples against it.

mikeshardmind commented 7 months ago

We're also back to evaluating when Callables are compatible, which may be not trivial. Secondly, I'm not sure how this intersection of callables would be implemented, we end up back at how to generate overloads when merging classes.

It's simple, and in trying to map this to overloads we (myself included) made it more complicated than it needs to be.

Don't generate overloads, they can't be equivalent in the absence of type negation.
For each callable in the intersection, individually check compatibility (Already something type checkers need to be able to do)
If any fail, it's not compatible.
Optionally short circuit on first failure (type checker choice, this is performance vs error message not correctness)

As far as I can tell there's a whole heap of issues that get solved with ordering, and the problems it creates seem pretty minimal. At least, I haven't seen any compelling examples against it.

I don't mean to be rude here, but the entire reason why there was such a long discussion on Any & T, as well as why the attempt to come up with a clever way to hide the problem with reordering Any before, were the problems with this. There's no way to discuss this without leaving out prior nuanced arguments or feeling like we're rehashing ground that I imagine people are tired of rehashing as I certainly am.

There are multiple reasons, and I'm not going to rehash all and the rationale of each or provide specific cases

In the face of known type information, erasing it as Any erases things that people expect the type system to know about and check for them, so this harms people using this for static analysis. (detectable False negatives on typed code)
The majority of users of typing are not writing types, but language server users. Having type information vanish in an intersection harms language servers providing accurate completion hints, so this harms those who do not have strong knowledge about typing and are indirect consumers of it.
Widening to Any without a user explicitly doing so is theoretically unsound, so this harms all future attempts to improve the soundness and expressiveness of the type system.

The only group not negatively impacted by this would be the people who do not use typing at all, and I'm not even sure that's true, as they may end up with worse libraries they use, but don't use the typings of, especially if library authors have to do gymnastics to make this work how they expect it to.

The only benefit to this appears to be making it easier to implement. While making something easier to implement is usually a laudable goal, care must be taken to ensure that it does not consume the usefulness of what is being implemented.

I can understand wanting to have something to show for this and get it done, but getting this wrong could be disastrous with how many things this has the potential to interact with. Other than HKTs, this is probably the single most complex thing that is being seriously proposed for python's type system. It's not an easy task to make sure everything we have to consider is being given proper weight.

With all of that said, I don't want to say any of this and leave you without a way forward either, so....

I think there are only 3 paths forward that remain viable right now, but if you find another we can certainly discuss it.

Go with a strict subtyping-based rule. (@DiscordLiz provided the necessary rules above, there are a couple of corollaries from it that may be useful to type checkers, some of which can be found in #22 )
Improve on the surrounding parts of the type system, then come back to this with an easier-to-work-with foundation.
Have intersections only work for disjoint types for now (This implies either constraining Any or banning Any) and revisit extending what is allowed with the benefit of seeing what people still have use cases for beyond this.

My opinion is that while any of those 3 would be fine,

would have the strongest immediate benefit to users without compromising on important issues
would be the best longterm outcome, but could push this into something that never gets worked on because it feels like there is an insurmountable barrier to contribution.
would be easy to be correct, but would be very limiting on typevariable use if users can't denote certain requirements

mark-todd commented 7 months ago

@mikeshardmind Thanks for breaking this down, appreciate it. So I'll park the discussion about Any for now - although I don't understand why expanding to Any where more information is available is so critical, I can accept that this may be a requirement.

The path forward I'm not sure I understand why it was rejected is the one you were proposing in the previous PEP edition (where Any is the lowest priority). I appreciate the __getattr__ case might present an issue, but I did have another thought which is based on the idea of a "fallback".

Based on certain conditions, if Any is present in a intersection Any becomes the "fallback" which is used when all other objects in the intersection do not have the required attribute. There might be another situation where the __getattr__ return type is the "fallback" (which takes priority would be determined by the desired outcome).

As you noted in your closing points though the __init__ clashes are still an issue, and currently only solved by ordered intersection.

Also, I'm still not sure how this could be resolved in an unordered intersection, which is why I'm keen to resolve issues with the ordered one. Banning an intersection with clashing __init__ would be one way to go, but I have a feeling this would lead to many more issues. This reason alone I think is quite compelling, all others aside.

mikeshardmind commented 7 months ago

The path forward I'm not sure I understand why it was rejected is the one you were proposing in the previous PEP edition (where Any is the lowest priority). I appreciate the getattr case might present an issue, but I did have another thought which is based on the idea of a "fallback".

I determined that getting the special casing right on this would be harder and more fragile than it was worth. (specifically, it would be harder than getting the full theory-drive intersection correct while keeping it useful to a wide audience, and would have issues of fragility)

Every level of special casing needed here makes it more and more complex, harder to reason about, and more sensitive to other changes/additions to the type system. The moment it was shown that simply re-ordering Any to the end was insufficient, any additional layered special casing was more complex than just doing it "right", the ordering was meant as a means to approximate the same behavior without needing as much to be specified to simplify things and it stopped being simpler as we discovered more issues, which I took as a sign (for myself) to take a break and come back to it later.

As you noted in your closing points though the __init__ clashes are still an issue, and currently only solved by ordered intersection.

Also, I'm still not sure how this could be resolved in an unordered intersection, which is why I'm keen to resolve issues with the ordered one. Banning an intersection with clashing __init__ would be one way to go, but I have a feeling this would lead to many more issues. This reason alone I think is quite compelling, all others aside.

clashes of constructors (__init__, (metaclass) __call__, and (metaclass) __new__) are already special-cased (much to my disappointment), so a special case around this, in particular, is unnecessary, and would already be covered by the existing ones for subtyping rules. While I'm disappointed in the special casing, it's very too late to undo that, and recent improvements to the specification on this do manage to avoid most issues and I think it's nearing a pragmatic enough point as to be a non-issue.

mark-todd commented 7 months ago

I determined that getting the special casing right on this would be harder and more fragile than it was worth. (specifically, it would be harder than getting the full theory-drive intersection correct while keeping it useful to a wide audience, and would have issues of fragility)

@mikeshardmind I see, that makes sense, but I'm not sure I agree.

Every level of special casing needed here makes it more and more complex, harder to reason about, and more sensitive to other changes/additions to the type system. The moment it was shown that simply re-ordering Any to the end was insufficient, any additional layered special casing was more complex than just doing it "right", the ordering was meant as a means to approximate the same behavior without needing as much to be specified to simplify things and it stopped being simpler as we discovered more issues, which I took as a sign (for myself) to take a break and come back to it later.

The way I see it, there are really only two special cases we've identified so far for an ordered intersection: Any, and __getattr__. If there are many more of these that arise, then I think I agree, but this doesn't seem like a dealbreaker for it as things stand (__init__ doesn't need to be a special case in ordered intersection).

If we look at the special cases for unordered intersection on the other hand, there are many: Any, signatures for __init__, __call__, __new__, Callables (although there is now a defined logic for resolving this above), other non-LSP violating methods of same name (I imagine this would resolve similarly). Some of these special cases may have existing logic in python to handle them, but that still makes it a much more complex specification. What's more, I believe __getattr__ would still end up being a special case in an unordered intersection, it'd just be get handled differently.

Special cases will arise in any specification like this, but the ordered version seems to have fewer and is easier to explain the logic for, as for any conflict there is a default way to resolve it. In the unordered version, everytime two classes "compete" for the same attribute or method, a resolution method has to be determined, so by it's nature it must be more complex to specify.

mark-todd commented 7 months ago

I also think it's worth noting that waiting for a lot of pre-requisites to be resolved internally is quite unrealistic. Even if right now all of the pre-requisites appear on a TODO list, any one of these could be delayed, and it's unlikely all of them will come to fruition. This is a tool that I personally think could be very useful, and we're incredibly close to a specification that would be implementable. It seems a shame to me to park the discussion for a future point that will in all likelihood never happen.

mikeshardmind commented 7 months ago

If we look at the special cases for unordered intersection on the other hand, there are many: Any, signatures for init, call, new, Callables (although there is now a defined logic for resolving this above), other non-LSP violating methods of same name (I imagine this would resolve similarly). Some of these special cases may have existing logic in python to handle them, but that still makes it a much more complex specification. What's more, I believe getattr would still end up being a special case in an unordered intersection, it'd just be get handled differently.

None of these require special-casing the specification with option 1 or 2 of the 3 options I listed for going forward

I also think it's worth noting that waiting for a lot of pre-requisites to be resolved internally is quite unrealistic. Even if right now all of the pre-requisites appear on a TODO list, any one of these could be delayed, and it's unlikely all of them will come to fruition. This is a tool that I personally think could be very useful, and we're incredibly close to a specification that would be implementable. It seems a shame to me to park the discussion for a future point that will in all likelihood never happen.

Only option 2 in the list of 3 options above requires temporarily shelving the discussion. Shelving it for that is to prevent other unfixable issues down the road that are similar to what we are facing now with decisions about the type system intentionally ignoring LSP because that was deemed easier.

If intersections got accepted with ordering rather than with either purely subtyping (possible now) or with denotable lower and upper bounds and constraints (would require groundwork), I think it would create a situation where HKTs would be impossible to support properly, and that's a gigantic detriment. HKTs are significantly more important to people than intersections are, they are also just more complex and not as easy to tackle in the current landscape.

If you think this is something that needs to be finished now, go with the subtyping approach then.

mark-todd commented 7 months ago

Ok, let's suppose we go with option 1 with the subtyping rules. If something is a subtype of T & Any it must be a subtype of T and a subtype of Any. We're still back to the old 4 vs 5 debate from a few months ago, on if an attribute of the irreducible quantity can be said to follow type T if it's present on T. Which means we're kind of back to square 1.

If intersections got accepted with ordering rather than with either purely subtyping (possible now) or with denotable lower and upper bounds and constraints (would require groundwork), I think it would create a situation where HKTs would be impossible to support properly, and that's a gigantic detriment. HKTs are significantly more important to people than intersections are, they are also just more complex and not as easy to tackle in the current landscape.

This is interesting. HKTs would be cool - is there any possibility of this happening in reality though? If there is and an ordered intersection presents a blocker (although I don't see why) then that is something to consider. I guess my follow up is why does it present a blocker?

My main reasoning is that if the only real blocker to an ordered intersection is that __getattr__ becomes a special case, that's not much of an argument when the alternative is trying to resolve attributes of T & Any which may be unresolvable.

I also still don't really understand why __getattr__ isn't a special case for unordered too - with ordered we were saying that the ordering priority may be reversed in some cases with this attribute, but with unordered we still need to define which item in the intersection takes priority. E.g.

from typing import Any, cast

class A:
    def __getattr__(self, item: str) -> int:
        return 1

B = Any

class C:
    foo: str

x= cast(A & B & C, type("D",(B,A,C),{})())
x.foo # str I suppose
x.bar # What type goes here?

mikeshardmind commented 7 months ago

It was my intent to after taking some more time for myself, to dig into this vs explicit denotable lower and upper type bounds, I still intend on that at some later point, but we painted ourselves into a corner we didn't need to with some of the original discussions, and looking elsewhere for answers and perspective on why certain attempts at simplification were misguided and where they fell short of making things simpler has helped

If something is a subtype of T & Any it must be a subtype of T and a subtype of Any. We're still back to the old 4 vs 5 debate from a few months ago, on if an attribute of the irreducible quantity can be said to follow type T if it's present on T. Which means we're kind of back to square 1.

This, just like callable is simpler than we made because we were looking at it through the wrong lens at the time. The thing I linked about how Elixir is solving this is very illuminating as to how we framed the discussion made finding a solution harder than it should have been. We were working under the assumption that we must provide a single resulting type and that that type itself could not be an intersection, but there's no actual rule that that be the case.

It's irreducible in most cases, and that includes the attributes. But that makes things Easier not harder if context is preserved properly.

A.foo & B.foo can't get resolved to a single non-intersection type unless A.foo and B.foo are trivally overlapping (ie A.foo is exactly B.foo) or only one of A.foo and B.foo exist.

When using x.foo where x is a subtype of A & B, x.foo's use must be consistent with both A.foo and B.foo When assigning to x.foo, the assignment must be consistent with assignment to A.foo and B.foo

This preserves both what those in favor of 4 wanted (a wider upper bound due to the presence of Any) and preserving the lower bound provided by any known specific type (those in favor of option 5, as well as the use case of indirect users via language servers). It's also significantly simpler to implement as it doesn't require a polynomial algorithm in the case of resolving certain types, and instead only a linear one at each usage.

I also still don't really understand why __getattr__ isn't a special case for unordered too - with ordered we were saying that the ordering priority may be reversed in some cases with this attribute, but with unordered we still need to define which item in the intersection takes priority. E.g.

Because the rules for __getattr__ will continue to work as they already do. It is already a special case, but its specialness is currently only acknowledging how attribute access works in python to match the data model, and we'd avoid adding another special case for it that behaves slightly differently.

We keep the existing rules for when __getattr__ is used, and any updates to those in the specification at any later point just work, without having to consider intersections again to ever improve that.

This is interesting. HKTs would be cool - is there any possibility of this happening in reality though?

Continues to be brought up with varying interest, it's significantly more likely with the new type alias syntax and type variable defaults, as the applications are more accessible and natural to more developers, and it's been stated as a priority for a few people heavily involved in python's typing. I'd say that yes, there's a real possibility of this happening.

If there is and an ordered intersection presents a blocker (although I don't see why) then that is something to consider. I guess my follow up is why does it present a blocker?

I really don't want to try and work through an example of this right now, there's still a lot of unknowns. I forsee an issue with each an intersection of higher kinded types, and with intersections within higher kinded types if the behavior is inconsistent with other subtyping.

I can't definitely say that this would be unresolvable for HKTs, but I can definitively say that every special case that exists will increase complexity of HKTs and that any single one of those cases could potentially be the straw that was too much. There's too many unknowns with it right now, and it hasn't been explored enough to know. I'm taking the pragmatic stance that if we don't need new special cases, we should do it without them to reduce the number of things that don't "simply work" together.