Open miedzinski opened 7 years ago
This was on my radar early in mypy development. I didn't work on this back then because this is a tricky feature. For example, if module x
has A.register(B)
, it can affect a seemingly unrelated module y
if it tries to use B
as a subclass of A
. This is 'non-modular' behavior, and mypy would somehow have to figure out that y
depends on x
even if it doesn't import it, even indirectly. Currently dependencies between modules are completely described by the import graph.
There are few potential ways to support this:
A.register(B)
invalidate all modules that have been already type checked and that may use A
and B
. This can be tricky to implement and could slow down things, as we may have to process modules multiple times. Conservatively, A.register(B)
would invalidate every module that uses A
or B
or a subclass of B
.A.register(B)
visible/active only in modules that (recursively) import the module that has this declaration. Mypy would have to keep track of these register calls defined in each module and which of them are active in any given module. This would deviate from Python semantics and could be a little confusing. However, there would no 'spooky action at a distance'.A.register(B)
declarations in the entire program. This would get run before type checking. We can probably make this reasonably fast in incremental mode by being clever with caching etc. [This would probably result in mypy having to keep the entire program in memory when doing a non-incremental run, which would be unfortunate. Currently we could strip most nodes in ASTs of already processed modules to lower RAM use.]It seems likely that mypy will support structural subtyping before it supports registering ABCs. Structural subtyping/protocols can be used for a similar effect, but that should be easier to implement. See https://github.com/python/typing/issues/11 for more information.
Alternatively, instead of doing something very complicated to make sure A.register(B)
works everywhere, just make A.register(B)
a type error in modules that don't define either A or B, because you really shouldn't be doing that. (And it's pretty common to disallow such things, see for example Rust's rules against orphaned impls.) Solution 3 also seems potentially reasonable. Spooky action at a distance rules out all the other potential solutions IMO.
I just ran into a use case where support for A.register(B)
would be the only way to make things work:
from typing import *
class A('Iterable[A]'):
_a: 'List[A]'
def __iter__(self) -> 'Iterator[A]':
return iter(self._a)
a = A()
reveal_type(a) # Revealed type is 'test.A'
for x in a:
reveal_type(x) # Revealed type is 'test.A*'
mypy
is happy with this - surprisingly given that we don't support recursive types yet (#731). Of course, python runtime isn't too happy that we inherit from a string literal.
The basic problem is that we hacked the class inheritance syntax to do typing-related stuff. But this means we are limited in what we can do - and in particular, we can't use forward declarations there (as well as other constructs that would mess up inheritance; at present, I'm not aware of any specific situations where this comes up).
Since forwarding syntax is already set in stone, the only solution I see is to make this work:
class A:
_a: 'List[A]'
def __iter__(self) -> 'Iterator[A]':
return iter(self._a)
Iterable[A].register(A)```
It looks a little ugly, but at least it's logical.
True, this particular use case will disappear once Protocols (#3132) is merged. But it won't fully solve the problem because sometimes the user might want to rely on nominal rather than structural typing.
Incidentally, this situation already appears in builtins.pyi
:
class str(Sequence[str]):
# ...
It works there because it's a stub, not the original source code, so python runtime doesn't need to look at it.
(Btw, we should probably make mypy catch inheritance from a string literal before the runtime complains.)
@pkch
class A(Iterable['A']):
...
works both in mypy and at runtime.
Btw, what are the cases where you think people will want nominal checks instead of protocols?
Duh 😆
Nominal checks: if the method naming is too generic (validate
, save
, etc.) so the user has (or is afraid to have) a bunch of unrelated classes that would look compatible based on purely structural checks.
Any news/updates on this? I would have really liked support for .register()
in https://github.com/mesonbuild/meson/pull/8885.
@ilevkivskyi
Btw, what are the cases where you think people will want nominal checks instead of protocols?
Numeric tower inter-operating with the builtin types requires nominal typing and requires this functionality. People wanting to create nominal subtypes that do not want inherit their parent's implementation also requires this functionality. I have a need for the second reason.
class Logic:
"""4-value logic type as seen in Verilog and VHDL: 0, 1, X, and Z"""
class Bit(Logic):
"""2-value logic type: 0 and 1"""
class LogicArray(Sequence[Logic]):
"""Array of Logics that constructs literals or subtypes to enforce the invariant"""
class Sfixed(Sequence[Bit]):
"""Two's complement arbitrarily-bounded fixed point number as seen in VHDL"""
# can't inherit from LogicArray since the generic base class needs to be re-parameterized
# don't want to anyway since Sfixed is implemented with an integer since that's a whole lot faster than a list of user-defined types
LogicArray.register(Sfixed)
Of course the real issue here is that inheritance is a sucky mechanism since it violates separation of concerns. I'd much rather say "Sfixed
is a subtype of LogicArray
without implementation inheritance". Even if that concern is addressed ABC.register
would still be necessary since there is occasionally a need to add supertypes post-hoc. But of course ABC.register
is a bad mechanism to achieve that since it totally destroys modularity (similarly to what is stated here https://github.com/python/mypy/issues/11727#issuecomment-1086662376).
Most of the use cases I'm seeing here are for cases where people are using ABCMeta.register
on classes they control instead of trying to register externally-defined classes (usually because they want to register a class with an ABC without implementing everything). It would probably be easier to support that case than handling full support for ABCMeta.register
.
Special casing uses of register
as a class decorator would most likely be the easiest to implement since that automatically ensures that a class is registered as soon as it's defined and we already have some code for special casing decorators. We'd then probably:
Special casing uses of register as a class decorator would most likely be the easiest to implement
Isn't that the least important case since anyone who wants to support this can simply make the ABC a base class?
The biggest use case, I think, for this feature would be the numeric tower working as intended.
Isn't that the least important case since anyone who wants to support this can simply make the ABC a base class?
The important difference between using register
and making the ABC a base class is when an error about the class not properly implementing all abstract methods gets emitted. If the ABC is a base class, then that happens when the class is actually instantiated, instead of when you initially subclass the ABC, because mypy has to assume that you might be creating an ABC which would implement those methods later in a subclass. However, if you use register
, we can emit the error at that register
call itself because the registered class doesn't become an ABC, which means that there's no real reason to register a class with an ABC and then later implement any unimplemented attributes instead of just registering the final class when it has all the methods implemented.
The biggest use case, I think, for this feature would be the numeric tower working as intended.
My understanding of #3186 is that the builtin number types don't properly implement the numeric tower types that they claim to implement (like in https://github.com/python/typeshed/issues/3195#issuecomment-540946000), and runtime works around that by registering the classes with the correct ABCs.
In that context, having those builtin number types directly subclass the relevant ABCs isn't a realistic option because that would most likely cause errors about unimplemented methods everywhere that those number types are used, even though essentially all of that code is safe.
Using register
in the way that I mentioned, however, would be an option because we'd emit an error only once when the class was first registered with the ABC. That means that typeshed could register each builtin number type with the correct ABC by adding register
as a decorator, type: ignore
the error generated, and anyone who wants to use those builtin number types use them without getting an error.
the builtin number types don't properly implement the numeric tower types that they claim to implement (like in https://github.com/python/typeshed/issues/3195#issuecomment-540946000),
Is this still true? float.__ceil__
exists in Python 3.10. Are there any other examples?
t there's no real reason to register a class with an ABC and then later implement any unimplemented attributes instead of just registering the final class when it has all the methods implemented.
Are you sure about this? You maybe defining a class hierarchy and want to define an ABC, which you will then have your final classes inherit from it.
Using register in the way that I mentioned, however, would be an option because we'd emit an error only once when the class was first registered with the ABC.
So you want to change the semantics of register
so that it does the checks that happen at class instantiation time? Won't that break anyone's code who's using register to define ABCs?
@ktbarrett
Of course the real issue here is that inheritance is a sucky mechanism since it violates separation of concerns. I'd much rather say "Sfixed is a subtype of LogicArray without implementation inheritance".
I think you can just add in the appropriate interface class, and your example will work:
class Logic:
"""4-value logic type as seen in Verilog and VHDL: 0, 1, X, and Z"""
class Bit(Logic):
"""2-value logic type: 0 and 1"""
class LogicArrayInterface(Sequence[Logic]): ... # define any members you want to share.
class LogicArray(LogicArrayInterface):
"""Array of Logics that constructs literals or subtypes to enforce the invariant"""
class Sfixed(LogicArrayInterface, Sequence[Bit]):
"""Two's complement arbitrarily-bounded fixed point number as seen in VHDL"""
No need for ABC registration. Ordinary inheritance works and the concerns are perfectly separated.
@NeilGirdhar That's not equivalent. Sfixed is a subtype of LogicArray by its nature. Not only is it intuitively not correct, it will create more work.
For example, lets say you have a bitwise OR operation that should work on all Sequence[Logic]
types. The example above shows only two classes, in reality there are 10. What's the resulting type of a bitwise OR operation between a LogicArray and an Sfixed? Intuitively, it would be the less derived type, essentially doing implicit upcasting on the more derived type, so it would return a LogicArray. How would you write that?
class LogicArray(LogicArrayInterface):
"""Array of Logics that constructs literals or subtypes to enforce the invariant"""
# I don't know the concrete type of the other argument so I don't know the correct type to return.
def __or__(self, other: LogicArrayInterface) -> ???:
...
# This would work, but it requires that Sfixed be a subtype of LogicArray, that is no longer the case.
# I can get this to work at runtime using `ABC.register`, but it isn't supported statically.
def __or__(self: Self, other: Self) -> Self:
...
def __ror__(self: Self, other: Self) -> Self:
...
# Oh dear... this does work though.
@overload
def __or__(self, other: LogicArray) -> LogicArray: ...
@overload
def __or__(self, other: X01ZArray) -> LogicArray: ...
@overload
def __or__(self, other: BitArray) -> LogicArray: ...
@overload
def __or__(self, other: Signed) -> LogicArray: ...
@overload
def __or__(self, other: Unsigned) -> LogicArray: ...
@overload
def __or__(self, other: Sfixed) -> LogicArray: ...
@overload
def __or__(self, other: Ufixed) -> LogicArray: ...
@overload
def __or__(self, other: Float) -> LogicArray: ...
# "let's add another type!"
# *audible groans*
# *someone posts something about the expression problem*
Also, there's a problem in Sfixed with the inheritance hierarchy you have set, Sequence[Logic]
comes before Sequence[Bit]
, so those methods will not be correct. If you swap the order of the classes in the inheritance list, you get an ordering issue. You've found another issue with the type system I have also seen playing around with this stuff, where you can't "re-bind" generic classes in subtypes.
For example, lets say you have a bitwise OR operation that should work on all Sequence[Logic] types. The example above shows only two classes, in reality there are 10. What's the resulting type of a bitwise OR operation between a LogicArray and an Sfixed? Intuitively, it would be the less derived type, essentially doing implicit upcasting on the more derived type, so it would return a LogicArray. How would you write that?
I see now. Your problem is analogous to numpy arrays, which are generic on the "dtype". Essentially, LogicArray
is like numpy.ndarray
and should be generic. Something like,
X = TypeVar('X', bound=Logic)
class LogicArray(Sequence[X], Generic[X]):
@overload
def __or__(self, other: Self) -> Self:
...
@overload
def __or__(self, other: LogicArray[Any]) -> LogicArray[Any]:
...
If you need more clever type promotion, then I'd look at how numpy
did it.
Summarizing the state of the world + conversation here:
Jukka's comment here still very much applies: https://github.com/python/mypy/issues/2922#issuecomment-283038234. There is not a great way to support register
performantly
If you control your type, you can simply inherit from the ABC you want:
class MyInt(numbers.Integral): ...
If you can't actually inherit at runtime, you can lie to the type checker:
if TYPE_CHECKING:
MyIntBase = numbers.Integral
else:
MyIntBase = object
class MyInt(MyIntBase): ...
If you want mypy to check that your MyInt is actually a conformant implementation, you can simply instantiate it:
class MyInt(numbers.Integral): ...
if TYPE_CHECKING:
_ = MyInt()
Curious if that covers all the use cases where you control your type, or if there's additional functionality that register
as a class decorator provides (beyond ergonomics).
@NeilGirdhar
LogicArray is like numpy.ndarray and should be generic
That will also not work, unfortunately. Subtypes have more operations on them. For example, Sfixed + int
is defined, LogicArray + int
is not. My situation is exactly the same as Python's numeric tower and number types (in fact, if I cared I could integrate my types into the numeric tower). If there was a workaround beyond hacking it into type system like all type checkers currently do with int
and float
, or adding support for ABC.register
no one has figured that out yet.
@hauntsaninja If one of Jukka's solutions is on the table, it's fairly obvious that 3 is the safest choice and would not have an appreciable effect on performance. The type checker being more strict than runtime already has precedent in typeshed.
If one of Jukka's solutions is on the table, it's fairly obvious that 3 is the safest choice and would not have an appreciable effect on performance.
Totally agree with shooting for number 3.
That will also not work, unfortunately. Subtypes have more operations on them. For example, Sfixed + int is defined, LogicArray + int is not. My situation is exactly the same as Python's numeric tower and number types
Yes, okay, that makes sense!
Mypy could support cases like this by special casing
ABCMeta.register
calls.__subclasshook__
,__subclasscheck__
, and__instancecheck__
are more troublesome and could be left ignored.