CarliJoy / intersection_examples

Python Typing Intersection examples
MIT License
33 stars 2 forks

Handling of `Any` within an Intersection #1

Closed CarliJoy closed 11 months ago

CarliJoy commented 1 year ago

There is a great deal of confusion about handling the Any type within an Intersection.

In Python, Any is both a top type (a supertype of all types), and a bottom type (a subtype of all types). Python has a gradual typing system, meaning that it will never be required that everything is typed. Everything that is not typed is considered Any.

We examine five ways to handle intersections with Any:

Remove Any from Intersections

Arguments in favour

Arguments against

An Intersection containing Any becomes Any

Arguments in favour

Arguments against

Disallow Any in Intersection

Arguments in favour

Arguments against

Treat T & Any as irreducible in general

Arguments in favour

Arguments against

Any is only considered in an intersection in deference to non-gradual types.

Arguments in favour

Arguments against


⚠️ Rules for contribution to this Issue

The general idea is that I will update the description, allowing the discussion to be included in the PEP and preventing the discussion from going in circles.

I will react with 🚀 once I have included something in the description.

NeilGirdhar commented 1 year ago

If we can show that Any is functioning as a top type for python's type system in the context of an intersection, it is still logically removable from that intersection by being a functional identity element on the operation of intersection over categories of types, for the domain of valid python types.

As you said earlier, Any (the annotation or static type) is not a "type" in the sense that you mean it since Any implies an interface, and that interface is different than the "universal set".

As stated above, Any & T -> T comes from the same simplification as T | Never -> T (removal of the identity element), not from the one behind Any | T -> Any.

I would say that object & T = T comes from the same simplification. object is a top type and the universal base class.

I'd like to discuss this a little further on what irreducibility would actually result in, as there are clearly some sharp edges to it

I think that could be productive. I tried to flesh this out in my "Consequences" section of my post.

There may be other reasons to still have it irreducible, but I'd like to ensure that the people making that point have a moment to consider the difference here.

Looking forward to seeing where this develops.

mikeshardmind commented 1 year ago

I would say that object & T = T comes from the same simplification. object is a top type and the universal base class.

Funnily enough, we proved it doesn't work for object with a simple proof by contradiction. object cannot be considered a top type because types themselves are not a member of it and participate in the type system. (this is wrong, see below)

We were also able to show that (object | type[object]) & T = T under current definitions, though. (Edit: object works too)

DiscordLiz commented 1 year ago

For the record, this was discussed briefly in a Discord call while I was working on an adjacent paper to come up with more formal definitions that are accurate to current use. We were able to show that it holds for a top type as well as for the universal set, by reaching for category theory for answers rather than set theory applied to types.

If we can show that Any is functioning as a top type for python's type system in the context of an intersection, it is still logically removable from that intersection by being a functional identity element on the operation of intersection over categories of types, for the domain of valid python types.

While it was a highly productive conversation, that's still a big if. In all the use cases I could find in support of and motivating Intersection, if Any leaked in, it could only leak in while functioning as a top type. That is still a few steps away from being confident that either that's all that is possible or, alternatively, that even if it isn't, type checkers can cheaply detect this.

You also have your work cut out for you in providing both the formal logic and a corresponding explanation of why it works that is more accessible.

NeilGirdhar commented 1 year ago

object cannot be considered a top type because types themselves are not a member of it

Are you sure? issubclass(type, object) is true.

object | type[object]

This is the second time this expression has shown up in this thread. I think this just reduces to object since issubclass(type[object], object) is true.
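
As a quick runtime sanity check of the point above (a minimal sketch; the sample classes are picked arbitrarily), every class object is itself an instance of object, which is the intuition for why object | type[object] adds nothing over object:

```python
# Classes are ordinary objects at runtime, so they are already covered by `object`.
sample_classes = (int, str, type, object)

type_is_subclass_of_object = issubclass(type, object)
classes_are_objects = all(isinstance(cls, object) for cls in sample_classes)

print(type_is_subclass_of_object, classes_are_objects)
```

This only checks runtime behaviour; how a static checker models type[object] is a separate question.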

mikeshardmind commented 1 year ago

I could have sworn I double-checked that first... Never mind, it should hold for object as well.

I need to double-check my work later on a few other things tangential to this, but I'm positive it would have no impact on the Any case, as that was done by hand (and using more abstract rules than some of the specific things I checked later). I seem to have misrepresented Python's type in Coq, so the 4 or so interesting things I checked after that need to be rechecked after I fix the definitions.

erictraut commented 1 year ago

I just thought of another argument in favor of not reducing intersections with Any and not ascribing any special meaning to Any in this context.

It turns out that Any is not unique. There are at least two other typing constructs in the Python type system that, like Any, do not follow the standard rules of subtyping. This includes Callable and tuple when used with an ellipsis: Callable[..., X] and tuple[X, ...]. These constructs, like Any, were added to support gradual typing, and they do not honor the rules of set theory.

If we create a special case for Any, then we need to consider similar special cases for these other constructs — along with any others that I may have forgotten or are added in the future. (Can anyone think of other type constructs in the Python type system that do not follow the standard rules of subtyping?)
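
A minimal sketch of the gradual behaviour of Callable[..., X] mentioned above. The helper name `apply` and the two sample functions are made up for illustration; the point is only that a checker accepts both callables for the same parameter even though their parameter lists are unrelated:

```python
from typing import Callable


def apply(fn: Callable[..., int], *args: object) -> int:
    # The ellipsis leaves the parameter list unchecked, much like Any does for a value.
    return fn(*args)


def no_args() -> int:
    return 1


def two_args(a: int, b: int) -> int:
    return a + b


first = apply(no_args)
second = apply(two_args, 2, 3)
```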

mikeshardmind commented 1 year ago

If we create a special case for Any, then we need to consider similar special cases for these other constructs — along with any others that I may have forgotten or are added in the future. (Can anyone think of other type constructs in the Python type system that do not follow the standard rules of subtyping?)

Following the various discussions of how Any interacts, I'm no longer in favor of Any being reducible (in either direction)

Specifically, when it comes to attribute access on members of an intersection, the problem with Any being reducible rears its head in a way that means I don't think Any & T can reduce to either Any or T.

I believe the rule I have provided here correctly generalizes to the other special types you have shown there. Can we stop treating all of these cases as special by validating use of the resulting intersections under this rule?

If we can, are there other special types that do not follow this more general rule?

CarliJoy commented 1 year ago

Can we conclude -> Any is not reduced? As this created a huge discussion, can someone summarize in short and comprehensible words the decision and how we got here? There is always a section in PEPs for Rejected Ideas -> I think this belongs in it.

mikeshardmind commented 1 year ago

Can we conclude -> Any is not reduced? As this created a huge discussion, can someone summarize in short and comprehensible words the decision and how we got here? There is always a section in PEPs for Rejected Ideas -> I think this belongs in it.

Maybe. I'll make sure a summary happens in any outcome, I heavily appreciate PEPs that go through why things were done a specific way.

I'm still looking into formal correctness right now, and there's still an extremely strong argument to forbid Any from intersections altogether, which comes from group theory, but I need to be prepared to argue it pretty carefully with everything lined up because this would be saying that gradual typing can't participate in Intersections.


Edit: The same group theory says we shouldn't allow Any as defined in python in: Unions of types, Intersections of Types, any kind of acyclic topology or graph based on subtype relationships, and (ultimately, if you look enough) at all, period. All existing theory says that Any as defined cannot exist in a well-defined type system. Since we clearly do allow Any in some of these contexts, we need a formalization that allows it, special cases it, or which has rules that ignore the contradictions of Any.

For completeness' sake, a few approaches utilizing concepts from set theory as applied to types, homotopy type theory, and category theory all also had some logical inconsistency relating to Any, some of them showed up prior to Unions and Intersections.

Since the inconsistencies are so everywhere, I will continue what I have been doing, advocate for the most correct thing we can do given the circumstances, and continue forbidding the use of Any in code I own. We can still have a pragmatic definition for Intersection.

randolf-scholz commented 1 year ago

So in summary, there appear to be 3 different interpretations / mental models of Any:

  1. "Any=Top" — Any is the universal top type.
  2. "Any = Union[T1, T2, T3, ...]" — Any is the union of all types
  3. "Any = TypeVar('Any')" — Any is an ad-hoc variable of unknown type, assumed to be used compatibly.

① Is most definitively incompatible with the definition in the documentation.

> ### The [Any](https://docs.python.org/3/library/typing.html#typing.Any) type
> A special kind of type is [Any](https://docs.python.org/3/library/typing.html#typing.Any). A static type checker will treat every type as being compatible with [Any](https://docs.python.org/3/library/typing.html#typing.Any) **and [Any](https://docs.python.org/3/library/typing.html#typing.Any) as being compatible with every type.**

The second sentence basically implies that `Any` cannot be the top type `Top`=`object`, since `x: int = object()` would be a type error.

② Is compatible with the definition and would logically imply that Any | X = Any and Any & X = X.

**Proof:** [w.l.o.g.](https://en.wikipedia.org/wiki/Without_loss_of_generality) assume `T1`=`X`, then:

1. `X | Any = X | Union[T1, T2, …] = Union[X, X, T2, …] = Union[X, T2, …] = Any`
2. `X & Any = X & Union[T1, T2, …] = Union[X & X, X & T2, …] = X | Union[T2 & X, …] = X`

③ Is compatible with the definition* and would logically imply that Any | X and Any & X are irreducible forms. This has some crucial advantages over ②:

- https://github.com/python/typing/issues/213#issuecomment-1646928419: imports from 3rd party libraries that alias to `Any` don't nuke your type hints.
- https://github.com/CarliJoy/intersection_examples/issues/1#issuecomment-1649192200
- https://github.com/CarliJoy/intersection_examples/issues/1#issuecomment-1650884464

* This appears to be how Any is often actually used in practice ("this entity has some type, but I am too lazy / do not want to explicitly express it at the moment"), with the only caveat that Any is universal, whereas using actual ad-hoc variables could create impossible intersections and collapse to Never.
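
Interpretation ② can be sanity-checked on a toy model in which a type is just a finite set of values and Any is the union of every set in play. This is only an illustration under that assumption (the three value sets are made up), not a claim about how type checkers represent Any:

```python
# Toy model: a "type" is a finite set of values, "Any" is the union of all of them.
int_like = frozenset({0, 1, 2})
str_like = frozenset({"a", "b"})
none_like = frozenset({None})

any_like = int_like | str_like | none_like

union_result = int_like | any_like         # models X | Any
intersection_result = int_like & any_like  # models X & Any
```

Under this model the union collapses to the universe and the intersection collapses to X, which is exactly the pair of reductions that ② predicts.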

NeilGirdhar commented 1 year ago

② Is compatible with the definition and would logically imply that Any | X = Any and Any & X = X.

2 is not compatible with the documentation.

Most of the discussion in this thread was essentially created by the misinterpretation of Any as your 1 or 2, and I went over both cases in the "Background: Definition of Any" section of my comment.

"Any = TypeVar('Any')" — Any is an ad-hoc variable of unknown type, assumed to be used compatibly.

You can say this about any type:

int = TypeVar('int', bound=int)
A | B = TypeVar('X', bound=A | B)
Any = TypeVar('Any', bound=Any) = TypeVar('Any')

Although the type variable interpretation might be a good justification for conversion of a type to Any never creating type errors.

mikeshardmind commented 1 year ago

@randolf-scholz @NeilGirdhar

I brought up some additional information in the less formal discussion about some of this in discord, but hadn't typed it up for github yet.

It's provably impossible for Any to exist as documented in a coherent type system with transitive subtyping behavior.*

Here it is in 1 way (more of these are shown in the discord conversation)

Any :> T1 :> T2 ... :>TN :> Any ~ TN :> T1 results in an immediate contradiction if transitivity is a requirement for the subtyping relationship :>

We can also show that this is the current behavior of type checkers, matching the definition, and we cannot simply say the definition should not allow this without being highly disruptive to gradual typing.

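from typing import Any

# int is compatible with Any, so returning an int here is accepted.
def f(x: int) -> Any:
    return x

# Any is compatible with int, so returning x here is accepted as well.
def g(x: Any) -> int:
    return x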
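
The non-transitivity can be sketched with a toy assignability check. This is an illustrative model only, assuming plain classes plus typing.Any; the function name `is_assignable` is made up and does not correspond to any type checker's API:

```python
from typing import Any


def is_assignable(source: object, target: object) -> bool:
    """Toy check: Any is compatible in both directions, classes use issubclass."""
    if source is Any or target is Any:
        return True
    if isinstance(source, type) and isinstance(target, type):
        return issubclass(source, target)
    return False


str_to_any = is_assignable(str, Any)  # str -> Any is accepted
any_to_int = is_assignable(Any, int)  # Any -> int is accepted
str_to_int = is_assignable(str, int)  # but str -> int is not
```

If the relation were transitive, the first two results would force the third to hold as well.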

I'll probably be writing a draft PEP to amend the definition of Any to have it clearly be stated as the exception, clearly note that it is not a type in the type system, but an escape hatch from it for the purposes of gradual typing, document its behavior extensively, and also recommend that all future typing PEPs have a section on how they interact with gradual typing, even if it is just a note that there are no special considerations for gradual typing relating to the PEP.

The only logically sound answers here are to exclude Any from intersections or accept that Any cannot be logically consistent, but that we can have how it behaves in an intersection documented.

Trying to claim that Any has fully logically consistent behavior in any form here will only lead to further disagreements based on the lens people are looking at Any from.


* Edit: To clarify, PEP 483 already says that Any doesn't follow this transitive behavior, it's not new. I'm just recommending that we keep in mind that Any is the exception here.

mikeshardmind commented 1 year ago

③ Is compatible with the definition* and would logically imply that Any | X and Any & X are irreducible forms.

Also brought up in the discord briefly. While this appears to be the "most consistent" approach, it's functionally problematic to do this without also defining "what is the expected structural interface available after something satisfies this type", and "what is the assignability of this type?"

The structural interface appears to be either equivalent to Any by definition of Any and compatibility, or, if we are attempting to not discard the known constraint provided by T, there are possible definitions for (Any & T).foo which widen the compatible use of T.foo, with the exception of attributes that are non-subclassable types, or any other types which cannot be expressed compatibly to T.foo.

The assignability though still appears to require satisfying T.

If Any is allowed in intersections, then we really need clear handling for each bit of this, and it is going to be somewhat arbitrary due to the inconsistencies of Any, so we're basically picking the behavior that is most helpful to users.

And if we're acknowledging that this is somewhat picking for behavior arbitrarily...,

ippeiukai commented 1 year ago

I'll probably be writing a draft PEP to amend the definition of Any to have it clearly be stated as the exception, clearly note that it is not a type in the type system, but an escape hatch from it for the purposes of gradual typing, document its behavior extensively, and also recommend that all future typing PEPs have a section on how they interact with gradual typing, even if it is just a note that there are no special considerations for gradual typing relating to the PEP.

Following this issue in the context of intersection, I think this is a great idea. Any is so special that it deserves its own PEP. This thread is all about Any and became less and less about Intersection. Why not leave the precise treatment of Any out of the Intersection’s spec as much as possible? Perhaps only stating a few things that it really should or should not allow.

The PEP for Any’s definition can discuss how Any should interact with Union, Intersection, and other constructs freely. Type checkers can choose to be looser or stricter in terms of reducing Intersection with Any until the other PEP comes in, at which point treatment of Union with Any can be also looked at.

ippeiukai commented 1 year ago

By the way, in terms of gradual typing, I feel Intersection[T, Any] = T is awful.

some_fn(some_obj)

# ===

def wrap_obj(val: T) -> T & SomeAddedTrait:
  …

a = wrap_obj(some_obj)
some_fn(a)

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

randolf-scholz commented 1 year ago

I don't think disallowing intersections with Any is a good idea. They can easily occur naturally, most commonly when performing multiple isinstance checks like isinstance(x, A) and isinstance(x, B). If either A or B type-alias to Any, there you have it.

In particular, note that since python 3.11 one can use Any as a base class.
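
A minimal sketch of how such an intersection arises from narrowing, using hypothetical unrelated classes A and B. If either name were an alias of Any in a partially typed library, the narrowed type inside the branch would be an intersection with Any:

```python
class A:
    pass


class B:
    pass


class Both(A, B):
    pass


def describe(x: object) -> str:
    if isinstance(x, A) and isinstance(x, B):
        # Inside this branch a checker with intersections would narrow x to A & B.
        return "A & B"
    return "other"


both_label = describe(Both())
only_a_label = describe(A())
```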

mikeshardmind commented 1 year ago

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

This is exactly the reason I think Any shouldn't even be allowed in an intersection. There is no outcome with it in it that wouldn't be surprising to someone, no matter how well it is documented.

Neither of these is good, and the only outcome that avoids both is banning Any from the intersection.

If you instead error and inform them of the ambiguity, there are a couple solutions available to the person the error is presented to, at least one of which should work.

There's also the matter here that banning Any from the intersection should not disrupt gradual typing.

In the first case, this should be a pretty clear case of "they probably don't know they are importing something aliased to Any, because an intersection with Any is lossy on type information no matter how we try to resolve it"

In the second case, by the necessity of how generics and type variables are scoped, the type var has to belong to the same codebase as the intersection, and the typed code user can set the bound of the typevar to object or to a protocol based on what they need from that side of the intersection (rather than to Any)

DiscordLiz commented 1 year ago

Here, let’s assume some_fn and some_obj are provided by two different libraries both currently untyped. What if some_fn got typed before some_obj gets typed? I expect I can do anything to a that I can do to some_obj. I don’t expect a to be reduced to mere SomeAddedTrait.

Do you really expect you can do anything to that, or do you then use the documentation or the source code to determine what is valid on what the library is providing? Typing is a safety feature. It can and does catch bugs, and code bases which have added typing have discovered inherently unsafe things they were doing in the process. Gradual typing should not be implemented in a way that compromises this.

Protocols already give you a way to interact with untyped code without having to accept Any directly. Something that is Any will pass the protocol check, but you're still specifying your expectations for your code.
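
A minimal sketch of that approach. The names `SupportsClose`, `shutdown` and `UntypedResource` are hypothetical; `UntypedResource` stands in for a class exported by an untyped library:

```python
from typing import Any, Protocol


class SupportsClose(Protocol):
    """What the typed code actually needs from the library object."""

    def close(self) -> None: ...


def shutdown(resource: SupportsClose) -> None:
    resource.close()


class UntypedResource:
    # Stand-in for a class from an untyped third-party library.
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


res: Any = UntypedResource()
shutdown(res)  # Any is compatible with the protocol, so this is accepted.
```

The body of `shutdown` stays fully typed against the protocol, while the caller can still pass something a checker only knows as Any.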

And are you really suggesting that in a third library (yours) you have two other libraries that you expect people to subclass from to use your library instead of exposing what your library actually needs them to provide via protocols? Or at the very least providing an abstract base class to your users to inherit from rather than expecting your users to own the internal implementation details of your library?

The arguments for not forbidding Any are vapid from my perspective, and speak to larger design issues that should be fixed, not catered to. The lack of ergonomics created by such twisted arguments makes me wonder if people think typing has to be painful because they have boxed themselves into designs that are painful with or without typing.

randolf-scholz commented 1 year ago

@mikeshardmind Honestly, I feel this whole discussion is a bit out of scope. The PEP should simply state what the type-theoretically correct type to use in this situation is. From this point of view, "banning" Any is no option, some type has to be returned! The closest option to "banning", whatever you actually mean by this word, would then be to return Never.

Leave it to type checkers to implement strictness flags to allow or disallow intersections with Any on their own.

DiscordLiz commented 1 year ago

And not only does banning Any not hurt untyped or gradually typed libraries, it does the opposite and helps them. Banning Any from intersections is good for gradual type users. If users already need to write a protocol to get the benefits of intersections, you have yet another reason: if it's anywhere near as common as people think it is, the people impacted (those writing intersections) become aware that a library maintainer may need help with adopting typing and can contribute upstream to help improve the whole ecosystem.

mikeshardmind commented 1 year ago

@randolf-scholz

Honestly, I feel this whole discussion is a bit out of scope. The PEP should simply state what the type-theoretically correct type to use in this situation is. From this point of view, "banning" Any is no option, some type has to be returned! The closest option to "banning", whatever you actually mean by this word, would then be to return Never.

Leave it to type checkers to implement strictness flags to allow or disallow intersections with Any on their own.

Do note that banning it would not be banning the use of untyped variables or the use of Any. It would only ban the use of Any as part of the intersection operands, something typed as Any would be compatible with the intersection anyway per the current definition of Any.

randolf-scholz commented 1 year ago

@mikeshardmind @DiscordLiz What do you 2 even mean by banning? Do you suggest that int & Any raises an error at runtime, like in your example of using a generic as a bound of TypeVar?

mikeshardmind commented 1 year ago

That's quite literally what banning it from the intersection would mean. It can't be there. That doesn't mean things that are Any can't be compatible, just that Any is problematic as part of specifying an intersection.

randolf-scholz commented 1 year ago

@mikeshardmind This would introduce several inconsistencies on its own, consider:

from typing import Any

class Foo(Any): ...

x = "test"
isinstance(x, Foo) and isinstance(x, int)  # does not raise error at runtime
isinstance(x, Foo & int)  # raises error at runtime??

mikeshardmind commented 1 year ago

If that works the way you are showing and it isn't because Foo is no longer Any, then...

I think it was a mistake to have Any be runtime checkable with isinstance (edit: it isn't, see below). Everything which can be instantiated to be checked in this way is always Any anyhow, so I'm not exactly worried about that particular inconsistency, as nobody should ever need to check that. I'm more worried about inconsistencies that actually cause issues, where something is detected as fine by a type checker but causes errors at runtime, as this removes the primary reason to even have type checking.

If Any & Foo doesn't reduce or reduces to Foo, then suddenly you threw away type info of the requirement Foo, and should have just written Any if this was your intent.

mikeshardmind commented 1 year ago

And for the record, it doesn't work with Any itself on any currently released python version

from typing import Any
isinstance(1, Any)  # TypeError: typing.Any cannot be used with isinstance()

If your example works, it should only be because Foo is no longer Any. It is Foo

DiscordLiz commented 1 year ago

@randolf-scholz

@ mikeshardmind @ DiscordLiz What do you 2 even mean by banning? Do you suggest that int & Any raises an error at runtime, like in your example of using a generic as a bound of TypeVar?

What @ mikeshardmind clarified is fine for a definition of banning it.

You don't need it. The person with typed code can specify what they expect and still have things that are typed as Any work via Any's compatibility definition.

from typing import Any, SupportsAbs, SupportsIndex

# untyped code
x: Any = SomeLibraryType()

# typed code (the & operator is the proposed intersection syntax)
def foo(param: SupportsAbs & SupportsIndex):
    # Here we can assume abs(param) is fine, we don't lose typing the body
    ...

foo(x)  # It's Any, it's compatible, we don't lose being able to use this.

randolf-scholz commented 1 year ago

@mikeshardmind

If Any & Foo doesn't reduce or reduces to Foo, then suddenly you threw away type info of the requirement Foo, and should have just written Any if this was your intent.

How has one thrown away any information if Any & Foo does not reduce?

@DiscordLiz Banning Any would greatly hurt gradual typing, because if some_library exports A, but A aliases to Any, I can no longer write any hinted code that has Intersections with A. Until recently, this was for example the case with the very popular pandas library. (and pandas-stubs is still far from complete). It is needed.

I still don't quite see the issue you have with keeping Any & X as an irreducible form. Maybe you can provide a definitive example why the irreducible approach does not work, and @CarliJoy can pin it to the top.

Ideally, the PEP should have an extended alternative approaches section with examples that explains well why certain decisions were made.

mikeshardmind commented 1 year ago

Banning Any would greatly hurt gradual typing, because if some_library exports A, but A aliases to Any, I can no longer write any hinted code that has Intersections with A. Until recently, this was for example the case with the very popular pandas library. It is needed.

you don't need an intersection with Any though. you don't gain any type information from this compared to Any without the intersection, and you can write a protocol based on your use of the library type if you instead want a basic "I'm using this how I expect it to be valid" check.

I still don't quite see the issue you have with keeping Any & X as an irreducible form.

All things are compatible with Any, including things that don't and can't exist or which would be incompatible with other operands. As an irreducible form, we reach the conclusion that for a concrete type T:

(Any & T).foo = Any.foo & T.foo = Any & T.foo, so now any use of foo, even those that on the surface appear to conflict with T.foo, must be considered valid by Any's compatibility guarantee. If instead, the person using an untyped library merely defines a protocol based on their use, it just works without effectively erasing T's type information.

tomasr8 commented 1 year ago

If we raise an error for Any & int, what do we do for e.g. (Any | int) & str? That distributes to (Any & str) | (int & str), so is it technically not allowed either?

mikeshardmind commented 1 year ago

Yes. It shouldn't be allowed there either if the approach chosen is to ban Any. I don't see a problem with this. In what case will someone be writing partially typed code and an intersection? The one where they didn't know about the preexisting tool of protocols for partially typing things, and typing them based on their known use.

randolf-scholz commented 1 year ago

you don't need an intersection with Any though. you don't gain any type information from this compared to Any without the intersection, and you can write a protocol based on your use of the library type if you instead want a basic "I'm using this how I expect it to be valid" check.

The point is that I can write code containing intersections now, and don't have to wait until typing-stubs for some_library become available. Once they do, I get immediate feedback for what works and what doesn't.

(Any & T).foo = Any.foo & T.foo = Any & T.foo, so now any use of foo, even those that on the surface appear to conflict with T.foo, must be considered valid by Any's compatibility guarantee.

What? Why? Any & T.foo is still irreducible and not equal to Any. Only things compatible with T.foo would be accepted.

If we define Any & X to be irreducible, it obviously has to apply to attributes as well. It seems like you are suddenly trying to apply the other proposed definition Any & X = Any to the attributes...

mikeshardmind commented 1 year ago

What? Why? Any & T.foo is still irreducible and not equal to Any. Only things compatible with T.foo would be accepted.

If that's the actual answer people reach on it, that's fine, but here's a quick counter example that complicates this issue because of how pervasive Any becomes the moment it is included

class X:
    def foo(self, x: int) -> int:
        ...

class Z:
    def foo(self, x: int | str) -> int:
        ...

Well, what if T.foo is the definition provided by X? There are "compatible" definitions (such as the one in Z) that Any could be compatible with that would be unexpected use.

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any
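
The concern can be illustrated with ordinary subclassing. In this sketch (an assumption-laden illustration, with Z made a subclass of X and the helper `call_with_str` invented for the example), Z is a perfectly valid X whose foo also accepts str, so a value known only as "X plus something unknown" might really be a Z:

```python
from typing import Union


class X:
    def foo(self, x: int) -> int:
        return x


class Z(X):
    # Widening a parameter type is a valid override: parameters are contravariant.
    def foo(self, x: Union[int, str]) -> int:
        return len(x) if isinstance(x, str) else x


def call_with_str(obj: X) -> int:
    # With `obj: X` a checker rejects this call; under the reading discussed
    # above, `obj: Any & X` would have to accept it.
    return obj.foo("abc")  # type: ignore[arg-type]


result = call_with_str(Z())
```

The call succeeds at runtime only because the object happens to be a Z; passing a plain X would return the string unchanged and violate the declared return type.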

tomasr8 commented 1 year ago

Yes. It shouldn't be allowed there either if the approach chosen is to ban Any. I don't see a problem with this. In what case will someone be writing partially typed code and an intersection? The one where they didn't know about the preexisting tool of protocols for partially typing things, and typing them based on their known use.

I don't disagree with that but this means that every time you use intersections or unions, python will have to recursively check the expression to make sure there is no accidental use of intersections with Any which seems potentially costly.

DiscordLiz commented 1 year ago

I'd rather be told when I write SomeLibraryType & MyType that there's something unexpected lurking there than python typing erase my expectations because a library exported something untyped. If I'm far enough involved in typing to be considering Intersections, It's unlikely that I want to have Any used in my code.

DiscordLiz commented 1 year ago

I don't disagree with that but this means that every time you use intersections or unions, python will have to recursively check the expression to make sure there is no accidental use of intersections with Any which seems potentially costly.

Only in the case of intersections, and only for the operands. Distributivity means we don't have to check the unions until they appear in an intersection, and then it's a simple linear search through a concatenation of the operands.
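
A rough sketch of that linear search, under the assumption that type expressions are represented as small trees. The classes `UnionExpr` and `IntersectionExpr` and the function names are hypothetical and exist only for this illustration:

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class UnionExpr:
    operands: Tuple[object, ...]


@dataclass(frozen=True)
class IntersectionExpr:
    operands: Tuple[object, ...]


def flatten(expr: object) -> list:
    """Concatenate the leaf operands of nested unions and intersections."""
    if isinstance(expr, (UnionExpr, IntersectionExpr)):
        leaves = []
        for operand in expr.operands:
            leaves.extend(flatten(operand))
        return leaves
    return [expr]


def check_intersection(expr: IntersectionExpr) -> None:
    """Reject Any anywhere among the (flattened) operands of an intersection."""
    if any(leaf is Any for leaf in flatten(expr)):
        raise TypeError("Any is not allowed as an operand of an intersection")


check_intersection(IntersectionExpr((int, str)))  # int & str is accepted

banned = IntersectionExpr((UnionExpr((Any, int)), str))  # (Any | int) & str
try:
    check_intersection(banned)
    rejected = False
except TypeError:
    rejected = True
```

A bare union such as Any | int is never passed to `check_intersection`, so it is only examined once it becomes an operand of an intersection.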

NeilGirdhar commented 1 year ago

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any

I think that is what you want in practice. Having Any there is supposed to broaden the interface. You should examine in which circumstance Any could actually be there.

randolf-scholz commented 1 year ago

@mikeshardmind I retract the sentence Only things compatible with T.foo would be accepted. that was obviously wrong. The question rather becomes how type-checkers shall treat irreducible forms Any & X.

Again, I think one should let type-checkers decide how strict they want to be, instead of outright banning it.

  1. I think we can agree that (Any & X).bar should always be fine if X does not supply bar
  2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.
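
The two rules can be sketched as a toy lookup over class annotations. This is an illustration only: `resolve_member` and its `strict` flag are hypothetical names, and real type checkers resolve members very differently:

```python
from typing import Any, get_type_hints


def resolve_member(known: type, name: str, strict: bool = True) -> object:
    """Toy model of the type reported for (Any & known).name."""
    hints = get_type_hints(known)
    if name not in hints:
        # Rule 1: the known operand says nothing about this name, so Any decides.
        return Any
    # Rule 2: a strictness flag picks between the known annotation and Any.
    return hints[name] if strict else Any


class X:
    foo: int


missing = resolve_member(X, "bar")
strict_foo = resolve_member(X, "foo", strict=True)
lenient_foo = resolve_member(X, "foo", strict=False)
```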

But what's the need to set this in stone as part of the PEP? PEP484 didn't formally specify how to handle Union[Any, X] either.

mikeshardmind commented 1 year ago

But what's the need to set this in stone as part of the PEP? PEP484 didn't formally specify how to handle Union[Any, X] either.

And type checkers vary on how they handle this case, creating incompatibility between different type checkers. This should be rigidly defined if it is allowed, or banned if it is not. A talk from the typing summit went over this inconsistency between type checkers: https://youtu.be/BNTkWQfqP_c?t=8465

If a library is written assuming one type checker, should other type checkers be incompatible? What happens if multiple dependencies are using incompatible rules for something which was both allowed and poorly defined?

NeilGirdhar commented 1 year ago

2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.

I don't think it can even be T.foo. All that's guaranteed is that the return type is covariant. The parameter specification is contravariant, so there's still a lot of flexibility with that.
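
To illustrate the variance point with ordinary subclassing (the class names below are hypothetical), an override may accept more than `T.foo` does and return something narrower, so a value that is both a `T` and something unknown need not expose exactly `T.foo`:

```python
from typing import Union


class T:
    def foo(self, x: int) -> object:
        return x


class Widened(T):
    # Parameters are contravariant: the override may accept a broader type.
    # The return type is covariant: the override may return a narrower type.
    def foo(self, x: Union[int, str]) -> int:
        return len(x) if isinstance(x, str) else x
```

Here `Widened` is a valid subtype of `T`, yet `Widened().foo("abc")` is a legal call that `T.foo` alone would not permit.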

mikeshardmind commented 1 year ago

  1. I think we can agree that (Any & X).bar should always be fine if X does not supply bar
  2. If X provides foo, then type-checkers could provide a strictness flag that (Any & T).foo aliases either to T.foo or Any.

  1. Yep. I came up with a set of rules here that led to me viewing Any & T irreducibly as structurally mostly equivalent to Any and thus re-raising alarm bells about surprising behavior for me. I would conceptually agree with that, and the rules provided match it.

  2. There would need to be a way for library authors to do this too, otherwise, you get the same compatibility issues / differing expectations. Suggesting such a flag to me points to the deeper problem here directly. If you need configuration on that level, the feature itself is inconsistent in what people will expect from it by including Any. I don't want someone using my library complaining something is broken because they changed a configuration setting and suddenly my expectations that I expressed don't hold for their use because they were changed out from under what I wrote.

mikeshardmind commented 1 year ago

Any & X makes X.foo(some_str) compatible use. This is what I mean by it having surprising effects for people expecting their code to be typed, as it is (With some limitations) very similar to just reducing to Any

I think that is what you want in practice. Having Any there is supposed to broaden the interface. You should examine in which circumstance Any could actually be there.

It isn't. If I wanted this, I have that right now by just typing Any. To what end would I ever write the intersection if that was what I wanted? I do not want to throw away interface details, and I largely assume that intersections only make sense between already compatible interfaces. The fact that Any clobbers every interface is just not useful in this context.

NeilGirdhar commented 1 year ago

It isn't. If I wanted this, I have that right now by just typing Any

In practice, Any & T can be created synthetically.

randolf-scholz commented 1 year ago

@mikeshardmind

If I wanted this, I have that right now by just typing Any. To what end would I ever write the intersection if that was what I wanted?

It happens if you import A from some_library and A happens to alias to Any. I already gave pandas as an example of a library which did this. They aliased Series and DataFrame to Any because they didn't have stubs yet, and these were very complex classes that would raise too many false-positives otherwise.
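
A sketch of how such an alias reaches downstream annotations; the names `Series` and `describe` here are illustrative, not the actual pandas source:

```python
from typing import Any, get_type_hints

# Stand-in for a library that has no stubs yet and exports a placeholder.
Series = Any


def describe(data: Series) -> str:
    """Downstream code annotates with the alias without ever writing Any."""
    return f"{type(data).__name__} with {len(data)} items"


# At the annotation level the parameter is simply Any.
hints = get_type_hints(describe)
```

Anyone who then writes an intersection with `Series` is intersecting with `Any` without having typed it themselves.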

It still makes sense to type-hint code with these classes for several reasons:

  1. typing-stubs might become available in the future.
  2. documentation engines like sphinx read type hints and add them to documentation.

mikeshardmind commented 1 year ago

In practice, Any & T can be created synthetically.

Only from the side of the intersection author, not the untyped code as already mentioned above.

If I write a type var, I control the type var. If I include a type var in the intersection I wrote, it has to be my type var because of the scoping rules of type variables.

So this is still entirely the person who is writing the intersection creating this situation via a type var, or via direct use of an exported Any. (And this was already addressed above)

In both cases, we can provide a better option. The error messages don't need to be cryptic; they can tell the person writing the intersection "See this for how to be compatible with gradual typed code without Any here".

DiscordLiz commented 1 year ago

It happens if you import A from some_library and A happens to alias to Any. I already gave pandas as an example of a library which did this. They aliased Series and DataFrame to Any because they didn't have stubs yet, and these were very complex classes that would raise too many false-positives otherwise.

I'd rather be told I imported an Any that could overwrite my expectations. I can handle it from there with a protocol. I don't need the information I'm providing being discarded for the sake of people who already aren't benefiting from typing.

mikeshardmind commented 1 year ago

It still makes sense to type-hint code with these classes for several reasons:

  1. typing-stubs might become available in the future.
  2. documentation engines like sphinx read type hints and add them to documentation.

The protocol-based approach doesn't invalidate 1; it works with it. The protocol-based approach also works fine with documentation engines such as sphinx (2). You can name your protocols things that make sense for your documentation and then either document what they correspond to, or set up aliasing in your sphinx config. You can have a mapping of classes which redirect elsewhere rather easily.
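
A minimal sketch of the protocol-based approach, assuming only two methods of the untyped class are actually needed; `SeriesLike`, `summarize`, and `FakeSeries` are hypothetical names:

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class SeriesLike(Protocol):
    """Only the part of the untyped class that this code relies on."""

    def mean(self) -> float: ...

    def __len__(self) -> int: ...


def summarize(data: SeriesLike) -> str:
    return f"{len(data)} values, mean {data.mean():.1f}"


class FakeSeries:
    """Stand-in for an object coming from untyped library code."""

    def __init__(self, values: List[float]) -> None:
        self._values = values

    def mean(self) -> float:
        return sum(self._values) / len(self._values)

    def __len__(self) -> int:
        return len(self._values)
```

Because the match is structural, `FakeSeries` satisfies `SeriesLike` without inheriting from it, so the same annotation should keep working if the library later ships real stubs.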

Bringing this up is a total red herring as to whether Any has well-defined behavior in an intersection that is both unsurprising and useful.

randolf-scholz commented 1 year ago

@DiscordLiz Well, then why can't we just have a strictness flag warn-import-any, and then you can write your replacement Protocol if you want to, but if I don't, then I won't be bothered. The Protocol might have to be extremely complicated; for example, the pandas.Series stub currently is over 1500 lines and requires plenty of auxiliary typevars and whatnot.

DiscordLiz commented 1 year ago

@DiscordLiz Well, then why can't we just have a strictness flag warn-import-any, and then you can write your replacement Protocol if you want to, but if I don't, then I won't be bothered. The Protocol might have to be extremely complicated; for example, the pandas.Series stub currently is over 1500 lines and requires plenty of auxiliary typevars and whatnot.

Even with such a flag, you need to define the behavior of Any & SomeType for the people using it. You haven't fixed the inconsistency, just given people a way to never have to see it, and arguably created a bigger rift between typed and untyped code, because now I can't import untyped code even if it is compatible with my typings. If you give me that flag as the only way to fix this, it won't result in me using intersection more; it will result in me using untyped code less because it causes annoyances.

And with that need to define the behavior still there, do you think you have such a definition that doesn't still have other issues with inconsistency in people's expectations?

NeilGirdhar commented 1 year ago

Only from the side of the intersection author, not the untyped code as already mentioned above.

I don't know what you mean by this. The intersection can be in library code that accepts types from the user, or the library can provide types that the user intersects. Two libraries can even interact with each other, and synthesize an intersection that the user uses.

So, no, you can't just say forget about intersections with Any. They can happen for all kinds of reasons outside of user control.

And returning an error is not reasonable here, in part because silencing the error could be extremely problematic. The error has to be associated with a line number for the type: ignore comment to be placed on.