Closed gvanrossum closed 8 years ago
This is a bit sad. People would want to do some interactive experimentation to see what is issubclass(int, Union[int, str])
, but probably there are some good reasons behind this decision.
Maybe it is a good thing, at least for 3.5. This will effectively force type objects to be used only in annotations, which gives us more room to tweak their behaviour before 3.6.
Maybe one could introduce a helper function acceptable(obj, hint)
in typing.py that tells whether obj
is OK for type hint
. Like acceptable(42, Union[int, str]) == True
.
Here's what Mark wrote about the subject:
My remaining concern is the isinstance and issubclass support.
There is a real different difference between whether a value is a member of a type and whether it is an instance of a class.
Support for isinstance and issubclass applied to types just breeds confusion.
As you said, for most users, the distinction between types and classes is subtle. But I think it is important to make people aware that there is a distinction, even if they fully understand it.
For example,
List[int] and List[str] and mutually incompatible types, yet
isinstance([], List[int]) and isinstance([], List[str))
both return true.
Just delete all the __instancecheck__ and __subclasscheck__ methods and I'll be happy.
Maybe one could introduce a helper function acceptable(obj, hint) in typing.py that tells whether obj is OK for type hint. Like acceptable(42, Union[int, str]) == True.
I think that we should leave the task of validating types to external tools (and compilers), not user code. So I'd be -1 on this.
The equivalent of issubclass()
should probably be called is_consistent_with()
(or some variation of that) to match the terminology used by Jeremy Siek in his Gradual Typing blog post (http://wphomes.soic.indiana.edu/jsiek/what-is-gradual-typing/). I'm not sure what to call the isinstance()
equivalent, perhaps is_acceptable_for()
. But I plan to ask others to contribute such functions.
The intended use would be not to validate types in code but just do some quick interactive tests in REPL. But OK, maybe you are right people anyway will start using it for validation.
@gvanrossum, Indeed, to have two such functions would be very cool. Even their presence would underline the distinction between the classes and types.
If we're drawing this type/class distinction, PEP 483 should be updated:
We use the terms type and class interchangeably, and we assume type(x) is x.class.
Should I report that as a separate bug?
Here is fine. I've made some updates to PEP 483; it really does use the two words interchangeably, and I don't want to have to update the text too much, but I elaborated that particular sentence (and removed the thing about class, which is irrelevant).
I'm confused -- as Python 3.5 is now in RC, does this mean the typing types were really left with only __subclasscheck__
working and not __isinstance__
? I had a working runtime type checker (https://gist.github.com/agronholm/00311c169527442ddd5b) which relied on isinstance() but it broke at some point when __instancecheck__
was disabled. Should I not rely on even __subclasscheck__
then?
You should not rely on subclass checking either. I ran out of time, but PEP 484 is accepted provisionally, which means we can still change the API in later 3.5.x releases or in 3.6.
Ok, so with both __instancecheck__
and __subclasscheck__
out of commission, how am I to implement a runtime type checker? I would need to be able to somehow detect that a type, say Dict[str, int] is derived from plain Dict, but I can't find any way to do that since issubclass(Dict[str, int], Dict) returns False. Likewise, I'd like to make use of the __extra__
argument to automatically check against the appropriate ABC when one is available, but I'm hesistant to just do getattr(obj, '__extra__
', None) since __extra__
could be anything outside the typing module.
The idea is that you would have to write your own functions similar to isinstance() and issubclass() to introspect the objects created at run-time by type annotations. You are right however that the introspection interface has not yet been specified -- I think it would have to be a separate PEP. I would not rely on the __extra__
attribute unless you're willing to see it disappear at any point in the future -- possibly as soon 3.5.1. (However, the reasoning about __extra__
possibly being something else is somewhat faulty -- non-stdlib code should never use dunder names for anything other than their documented meaning, as the stdlib can at any point define a new dunder name without any concern for existing occurrences of that dunder name in user code. So non-stdlib code that uses __extra__
is already broken.)
Alright, so I guess this means runtime type checking has to wait until that introspection interface has been specified. FWIW, here's what I had before I ran into this problem: https://gist.github.com/agronholm/b06bdba54c73d9b2f1f1
I do encourage you to continue along this path, assuming the introspection API won't be terribly different from what is currently implemented. (In fact I think extra is probably the only thing that might significantly change, as it isn't very principled.) Instead of issubclass() and isinstance() you'll have to code your own, but looking at parameters etc. sounds about right. Have you thought about what to do about requiring that type variables conform across parameters and return values yet? E.g. for def find(x: T, xs: Iterable[T]) -> int: ...
; you should allow find(42, [1, 2, 3])
and even find(0, [])
but you should reject find(42, ['a', 'b', 'c'])
and find(0, [1, 2, 'x'])
.
I had thought about type variable checking but I hadn't gotten that far yet. The problem I was stuck with is that if I encounter a type hint like List[int]
, how am I supposed to programmatically determine that it should be treated as a List? The original class (plain List
) does not show up on the MRO so I can't use isinstance() or issubclass() to determine the necessary checking semantics. The only solution I could find was using issubclass() on the undocumented __origin__
attribute, but I am again hesitant to use something that is undocumented.
The best I can recommend is using __origin__
-- if we were to change this attribute there would still have to be some other way to access the same information, and it would be easy to grep your code for occurrences of __origin__
. (I'd be less worried about changes to __origin__
than to __extra__
.) You may also look at the internal functions _gorg()
and _geqv()
(these names will not be part of any public API, obviously, but their implementations are very simple and conceptually useful).
I'll look at them -- thanks!
OK, delivered: https://pypi.python.org/pypi/typeguard
See also http://bugs.python.org/issue26075 (Union[str, Iterable[int]] returns Iterable[int], which is wrong!)
I have just uploaded the new version of my runtime type checker, typecheck-decorator
1.3, to PyPI, now with 'typing' support:
https://pypi.python.org/pypi/typecheck-decorator/
Compared to typeguard 1.1.1, it has broader support (except for Callable, which is yet missing) and a more precise documentation of its semantics and limitations.
I struggled enormously with the semantics of TypeVar
s, because there are several strange issues in the current typing.py. For instance, tg.ItemsView
has three generic parameters (tg.T_co, tg.KT, tg.VT_co)
, the first of which represents the tg.Tuple[tg.KT, tg.VT_co]
returned by each call to the iterator -- but how, in general (that is, for user-defined types), is a poor type checker to know this?
Should I report this?
(I found other bugs/oddities, but those have to do with __subclasscheck__
...)
Hm, I think this is a bug in the definition of ItemsView in typing.py. I filed https://github.com/python/typing/issues/177 to track this.
So, I've been looking at the papers on gradual typing, and it looks like the is-consistent-with relation is simply the gradual typing version of type equality. That is, X
is consistent with Y
iff X
is Y
or either is Any
.
The gradual typing counterpart to issubclass
would be something like is-consistent-subtype-of. The idea is formalized in Siek & Taha's Gradual Typing for Objects, pp. 6–8 (Section 4).
I'm not opposed to flexible use of terminology, but as it stands, the PEP 483 definition of is-consistent-with is highly confusing, since it actually describes is-consistent-subtype-of.
Anyway, drawing this distinction suggests a way to preserve issubclass
: just take out Any
. Then, the issubclass
operation for types would simply mean is-subtype-of, which is hardly controversial usage.
The gradual typing companions to issubclass
would look roughly like this:
def is_consistent_with(t1, t2):
return (t1 is t2 or
t1 is Any or
t2 is Any)
def is_consistent_subtype(t1, t2):
return is_consistent_with(t1, t2) or issubclass(t1, t2)
Now, none of this addresses the criticism that
List[int]
andList[str]
and mutually incompatible types, yetisinstance([], List[int])
andisinstance([], List[str))
both return true.
This feels more like a bug in the implementation of __subclasscheck__
than a fundamental problem with the concept of subtypes. Why should list
be considered a subclass of a parameterized List
when the same is not true in general?
>>> class MyGeneric(Iterable[T]): pass
...
>>> m = MyGeneric()
>>> isinstance(m, MyGeneric)
True
>>> isinstance(m, MyGeneric[int])
False
Code that is interested in type-checking [1, 2, 3]
against List[int]
should pretty clearly be comparing type([1, 2, 3])
against the declared type's ultimate origin, not against a type-constrained version.
EDIT: This argument essentially suggests taking the patch proposed in #202 a step further and ignoring __extra__
for constrained generics. IOW, issubclass(list, typing.Sequence[Any])
would delegate to collections.abc.Sequence
but issubclass(list, typing.Sequence[int])
would not.
the gradual typing counterpart to issubclass would be something like is-consistent-subtype-of.
As you might have guessed I'm not really up to reading academic literature (neither are most Python devs). I got the definition of is-consistent-with from Siek's blog post (http://wphomes.soic.indiana.edu/jsiek/what-is-gradual-typing/) -- I guess the problem with the blog post is that it doesn't actually address subtyping. I'm happy to update PEP 483, nobody seems to read it anyway.
isinstance(m, MyGeneric)
You're reporting a bug in isinstance(), which is not supposed to be supported at all! The bug is that it doesn't consistently raise an exception.
but issubclass(list, typing.Sequence[int]) would not.
What should it return? Isn't this just another argument that issubclass() should also be disabled?
Things would go quickest if you could suggest a patch for PEP 483. The repo is here: https://hg.python.org/peps/ (Don't use any of the github clones you may find, they are likely out of date.)
What should it return? Isn't this just another argument that issubclass() should also be disabled?
It's possible I'm badly misunderstanding bintoro, but it sounds like they're talking about tracing the variable (rather than the value) back to where it was declared/returned/otherwise-imbued-with-a-type and looking at its compile-time type to determine the result of issubclass(). But that sounds totally infeasible to me.
@bintoro Please correct me since I've probably misconstrued your words.
You're reporting a bug in isinstance(), which is not supposed to be supported at all! The bug is that it doesn't consistently raise an exception.
I was under the impression that instance checks would be preserved for generics. If a custom generic is to be used like any other class, then presumably instance checks should work too. Otherwise applications would suddenly break when typing information is added to containers that are subject to an instance check somewhere.
IMO it would be best to delete __instancecheck__
and __subclasscheck__
from GenericMeta
and let the standard implementation decide. Then everything would work normally: if it's in the MRO, it's a superclass.
I experimented with removing GenericMeta.__subclasscheck__
and inserting extras as actual base classes. I quite like the result. Take a container like this:
class MyMapping(collections.MutableMapping): ...
Add type information:
class MyMapping(typing.MutableMapping[K, V]): ...
Now, the following continues to work:
>>> m = MyMapping()
>>> isinstance(m, collections.MutableMapping)
True
This seems ideal. In the current scheme a typed MyMapping
would have to inherit from both MutableMapping
variants in order to be universally recognized as such.
An open question is whether it's desirable for issubclass(list, typing.Sequence)
to continue to return true. Perhaps I haven't thought this through, but I don't readily see why this should be so. The typing
collections are only intended as generic base classes, not as all-around replacements for their collections.abc
counterparts.
That said, an easy way to support such subclass checks is to link the ABC registries. Then, whatever is registered on, say, collections.MutableSequence
would be found through typing.MutableSequence
as well.
EDIT: Code is here.
but
issubclass(list, typing.Sequence[int])
would not.What should it return? Isn't this just another argument that issubclass() should also be disabled?
The way I see it, it should return false. Sequence[int]
is just a subclass of Sequence
, one that has nothing to do with list
.
@NYKevin That's not what I'm suggesting at all. I'm suggesting to remove complexity, not add to it. Whether C[Y]
is a subclass of C[X]
should only depend on whether the former is derived from the latter, not on the relationship between Y
and X
. That sort of analysis is useful, of course, but should probably live outside __subclasscheck__
because __subclasscheck__
is called at runtime and runtime types are not parameterized.
Things would go quickest if you could suggest a patch for PEP 483.
Will do, as soon as I figure out what to call the relation. I suppose that when subtyping is introduced, the subtype-augmented is-consistent-with actually replaces the original definition, so we can just call it that.
All that is needed, really, is to mention the subtyping issue, in order to explain why the blog post describes a very different set of rules than PEP 483. (That is the confusion that led me to the research papers.)
In general, I like the direction where #207 goes but I think it is better to discuss the "specification", before one goes for an implementation. Here is my view of the situation:
1) In the context of this PEP we want to distinguish classes and types. Class of an object is simply obj.__class__
, while the type of an object is a concept that is introduced by a typecheker, linter, IDE, etc. typing.py
provides a standard API for such tools. Type t1
is called a subtype of t2
if t1
is acceptable where type t2
is expected. Since Python is very dynamic, we need a special type Any
. It is acceptable everywhere, and a variable declared Any
accepts all other types.
2) All special type constructs:
TypeVar
NewType
Generic
Union
and Optional
Intersection
(I hope it will be added at some point)
Type
Any
raise TypeError
when appear in isinstance
of issubclass
. Only first two could be instantiated.
3) issubclass
and isinstance
are only allowed for generic types if they are used without parameters assert issubclass(list, Sequence)
, assert isinstance(lambda x: x, Callable)
. Since by agreement List
without parameters means List[Any]
and Any
is a fallback to dynamic typing, this is consistent.
4) typing.py
provides a helper function is_compatible(one_type, other_type)
, that returns True
if one_type
should be acceptable where other_type
is expected. Currently this could be based on nominal subtyping, using declared variance of generic types, and semantics of special type constructs. Examples:
assert is_compatible(Any, Employee)
assert is_compatible(Manager, Employee)
assert is_compatible(Employee, Union[Employee, Stranger])
assert is_compatible(Tuple[Manager, Salary], Tuple[Employee, Salary])
assert is_compatible(Callable[[Employee], Salary], Callable[[Manager], Salary])
@gvanrossum , @markshannon do you agree with such formulations?
issubclass
andisinstance
are only allowed for generic types if they are used without parameters
If you're describing #207, then that's not really accurate. issubclass
and isinstance
simply have their normal semantics when used with generics.
Consider:
class A(Generic[T]): ...
class B(A[int]): ...
>>> isinstance(B(), A[int])
True
This is possible because A[int]
is B
's superclass in a very real way.
If this is considered undesirable, then I would correct it by dropping the parameterized intermediate type from the MRO rather than messing with issubclass
behavior. That is, instead of
>>> B.__mro__
(B[int], A[int], A[~T], typing.Generic[~T], typing.Generic, <class 'object'>)
we'd have
>>> B.__mro__
(B[int], A[~T], typing.Generic[~T], typing.Generic, <class 'object'>)
There are two points why one might want to limit issubclass()
only to generics without parameters.
List[int]
) in runtime context, only classes (like B
or A
in your example) should be used in runtime context.issubclass
, etc. (as I understand, this will also work in your current implementation).I think that your idea of removing A[int]
from MRO is not OK. In such case issubclass
will return False
, but I would like it to raise TypeError
. But I agree with you that we should reduce messing with __subclasschek__
to absolute minimum. Maybe we don't need this restriction.
In any case, do not hurry to make changes to your PR. We first need to get approval of the "specifications" from the guys I mentioned in my previous post.
By the way, I noticed a problem with the representation of B
, it is not generic, so that B[int]
has no sense. If such behavior is not caused by your changes, then I recommend you to open a separate issue on this.
My thinking goes like this: if A[int]
is considered enough of a class to appear in the MRO, then issubclass(B, A[int])
should be true; anything else would be just too weird. I mean, is there any precedent for issubclass(C, random.choice(C.__mro__))
occasionally raising a TypeError
?
Another way to look at it would be to consider the A[int]
specialization irrelevant for runtime behavior and insert only A
in the MRO. The full parameterization information can still be preserved internally for type-checking tools to investigate.
It's either a bona fide class or a magic type — you can't have it both ways.
By the way, I noticed a problem with the representation of
B
, it is not generic, so thatB[int]
has no sense.
My bad, I was using the wrong version. This is how generics have been represented since 3.5.1: the parameters are propagated indefinitely.
First, and very important note: Clearly you are working with outdated version of typing.py
. Many crucial bugs have been fixed since 3.5.1 and the whole generics system has been thoroughly reworked by @gvanrossum . If I have
class A(Generic[T]):
...
class B(A[int]):
...
then with the last version at this github repo I get:
>>> str(B)
'__main__.B'
>>> A[int]().__class__ is A
True
>>> B.__mro__
(__main__.B, __main__.A<~T>[int], __main__.A<~T>, __main__.Generic<~T>, __main__.Generic, <class 'object'>)
Now, concerning the question about issubclass
, as you can see there is an additional reason (apart from two mentioned in my previous post) for limiting it only to generics without parameters: Type erasure:
>>> A[int]().__class__ is A
True
It is explained in the PEP lines 504-508 (again, the latest version of the PEP is always here, in this github repo. In case you didn't know, this PEP is still in the provisional status, i.e., it still could be edited).
Let me explain why I want TypeError
instead of False
. Normally, isinstance(1, Union)
would just return False
, but we want to clearly say "you are doing something completely wrong", therefore we raise TypeError
. The same situation is here, by raising TypeError
we tell the user: If you are asking questions like isinstance([], List[int])
you do not understand what is going on.
Clearly you are working with outdated version of
typing.py
.
Looks like I was using 3.5.1 by mistake when I wrote the preceding comments. I'm perfectly aware of the recent changes, as #207 is based on them.
[...] as you can see there is an additional reason [...] for limiting it only to generics without parameters: Type erasure
Exactly. Since A[int]
reduces to A
at runtime, why should it appear as a distinct entity in the MRO?
The same situation is here, by raising
TypeError
we tell the user: If you are asking questions likeisinstance([], List[int])
you do not understand what is going on.
Generics are supposed to be able to be used as normal classes. It seems rather precarious to introduce something in the MRO that cannot be subclass-checked against. Union
is wholly different because it is not a concrete class at all and will never appear as a runtime type.
Just to clarify, I am not against excluding A[int]
form MRO, I am in favour of this idea. But, in addition to this (not in alternative to this) I would raise TypeError
instead of returrning False
.
@ilevkivskyi It seems we've been having a slight miscommunication here. I am totally fine with raising TypeError
in nonsensical cases as long as all(map(partial(issubclass, G), G.__mro__))
is true for every generic G
.
OK, it looks like in fact we have an agreement here. Now we need to wait for @gvanrossum and @markshannon , and their comments. At least one of them is quite busy now, because two releases of mypy are planned before May 26.
I still want __subclasscheck__
dropped.
To reiterate my position:
Determining whether a class is a subclass of a type is meaningless as far I'm concerned.
It is meaningful to check whether a type is a subtype of a type, but that is the job of a static checker and doesn't belong in the typing module.
Determining whether a class is a subclass of a type is meaningless as far I'm concerned.
@markshannon , we both agree with this. The question is whether simply remove __subclasscheck__
method (in this case issubclass(B, A[int])
will return True
following the normal mechanism), or have a __subclasscheck__
that raises TypeError
in all meaningless cases.
What does it even mean to "drop __subclasscheck__
"? Not implement it or have it error out? The former is exactly how it's done in #207, while the latter would render generics unusable.
Just to be clear, nobody's suggesting to support __subclasscheck__
for Union
or Any
. The discussion is about generics.
Why would making it error out render generics unusable?
@gvanrossum , maybe "unusable" is to radical formulation from @bintoro . Consider backward compatibility, people often use issubclass
. I could easily imagine something like assert issubclass(UserClass, FrameworkBase)
all over the code. If someone now decides to add a generic information to FrameworkBase
, like class FrameworkBase(Iterator[Task]): ...
, then all these subclass checks will fail, so that all the code will be broken, until the logic is completely changed to avoid issubclass
. This is against the idea of gradual typing: There should be a possibility to introduce it gradually :-)
We have Any
for such purpose, so that I think that a reasonable compromise would be to allow issubclass
for generics without parameters (remeber MyGeneric
without parameters is MyGeneric[Any]
) and raise TypeError
in all other cases. In such way all existing subclass checks will work, and if people will try to change them to MyGeneric[int]
they will get a note that they should not do this, because issubclass
is not intended for work with types.
Yes, that middle ground seems reasonable.
Yeah, that's my concern.
Essentially, adding typing information to a container class would require the author to be sure that the class will never take part in an instance check anywhere. This seems like a pretty serious (and pointless!) restriction.
I don't see the theoretical grounds, either. The rationale for disallowing __subclasscheck__
for the magic types Union
& co. is that they are not meaningful as runtime classes. As long as Generic
is intended as a means of adding typing information into real, instantiable classes (not only stubs), it's clear that the same reasoning does not apply. Custom generics are fundamentally just regular classes with typing metadata tacked on.
There's also the practical issue of supporting isinstance(x, typing.Iterable)
as a substitute for isinstance(x, collections.abc.Iterable)
etc.
As discussed earlier today, raising TypeError
from isinstance(x, MyGeneric[int])
makes sense, but in that case I feel that such types shouldn't be left in the MRO, either. If something's too magical to support instance checks, they should be eliminated from the MRO when the concrete class is constructed.
They're in the MRO so they can be introspected by other tools than isinstance/issubclass.
This is a good point, typecheckers could use MRO for their purpose. Also probably it is easier to implement, no need to manually construct MRO, just modify __subclasscheck__
and __instancecheck__
to raise TypeError
in the cases discussed.
If @gvanrossum has no other objections, I think it would be great if @bintoro will make a clean PR implementing the changes discussed above. This would close many issues, but also it is quite extensive surgery, so be careful :-)
They're in the MRO so they can be introspected by other tools than isinstance/issubclass.
Yes, but that information can just as well be preserved internally like __origin__
and friends.
I used this example earlier:
class A(Generic[T]):
...
class B(A[int]):
...
and suggested that A[int]
should collapse to just A
in B.__mro__
. The argument is twofold:
From a practical POV, it's going to be very unexpected to have something in the MRO that blows up when it appears in a subclass check.
From a purity POV, this whole debate is rooted in the distinction between types and classes, and this is exactly it. A[int]
is a type, not a class, so what's it doing in the MRO of a concrete class? The runtime base is just A
.
(None of this is an important issue to me, but it does seem like a logical conclusion of what's happening with __subclasscheck__
.)
I think I'll wait until 3.5.2 is released to get back to this.
@ilevkivskyi It would be pretty easy to incorporate this in #207, I think.
I'll add one more note here. Assuming parameterized generics now become types (non-classes), then something that needs discussion is what to do about instantiating them.
There are examples of this in the PEP:
IntNode = Node[int]
StrNode = Node[str]
p = IntNode()
q = StrNode()
If Node[int]
isn't a class, and instance checking against it is deemed nonsensical, then instantiating it should presumably be an error as well. Subclassing becomes a required step:
class IntNode(Node[int]):
pass
p = IntNode()
Those examples should work.
However, even though this would be equivalent at run time:
p = Node[int]()
that's disallowed by the PEP and indeed mypy reports it as an error (but allows the examples from the PEP).
That's all intentional, and the implementation somehow has to deal with it.
Mark Shannon wants me to drop
__subclasscheck__
from all type objects (Any, Union etc.). This is somewhat major surgery since it is used by Union simplification and for various type checks. See also #133 and #135.