Open tek opened 6 years ago
I think this came up a few times in other discussions; one example use case is https://github.com/python/mypy/issues/4395. But TBH this is low priority, since such use cases are quite rare.
damn, I searched very thoroughly but did not find this one! :smile: So, consider this my +1!
Adding your +1 is all fine, but who's going to do the (probably pretty complicated) implementation work?
are you approving the feature?
I am neither approving nor disapproving. Just observing that it may be a lot of work for marginal benefits in most codebases.
awfully pragmatic. Where's your sense of adventure? :smile: anyways, I'll work on it, though it's gonna take a while to get into the project.
Similar to https://github.com/Microsoft/TypeScript/issues/1213
Not sure if the discussion over there provides any useful insights to the effort over here.
Hi @tek. I'm also very interested in this, so I'd like to ask if you had any progress with this and volunteer to help if you want.
@rcalsaverini sorry, I've been migrating my legacy code to haskell and am abandoning python altogether. but I wish you great success!
Oh, sad to hear but I see your point. Thanks.
Just to add another use case (which I think relates to this issue):
Using Literal types along with overloading __new__, along with higher-kinded typevars, could allow implementing a generic "nullable" ORM Field class, using a descriptor to provide access to the appropriate nullable-or-not field values. The descriptor wouldn't have to be reimplemented in subclasses.
It is one step closer to being possible due to the most recent mypy release's support for honoring the return type of __new__ (https://github.com/python/mypy/issues/1020).
Note: this is basically a stripped-down version of Django's Field class:
# in stub file
from typing import Generic, Optional, TypeVar, Union, overload, Type
from typing_extensions import Literal

_T = TypeVar("_T", bound="Field")
_GT = TypeVar("_GT")

class Field(Generic[_GT]):
    # on the line after the overload: error: Type variable "_T" used with arguments
    @overload
    def __new__(cls: Type[_T], null: Literal[False] = False, *args, **kwargs) -> _T[_GT]: ...
    @overload
    def __new__(cls: Type[_T], null: Literal[True], *args, **kwargs) -> _T[Optional[_GT]]: ...
    def __get__(self, instance, owner) -> _GT: ...

class CharField(Field[str]): ...
class IntegerField(Field[int]): ...
# etc...

# in code
class User:
    f1 = CharField(null=False)
    f2 = CharField(null=True)

reveal_type(User().f1)  # Expected: str
reveal_type(User().f2)  # Expected: Union[str, None]
I wonder if this is what I need or if there's currently a work around for my (slightly simpler) case?:
I'm building an async redis client with proper type hints. I have a "Commands" class with methods for all redis commands (get, set, exists, strlen ... and hundreds more). Normally each of those methods should return a future (actually coroutine) to the result, but in pipeline mode they should all return None - the commands are added to the pipeline to be executed later.
This is easy enough to implement in python, but not so easy to type hint correctly.
Basic example:
class Redis:
    def execute(self, command) -> Coroutine[Any, Any, Union[None, str, int, float]]:
        return self.connection.execute(...)

    def get(self, *args) -> Coroutine[Any, Any, str]:
        ...
        return self.execute(command)

    def set(self, *args) -> Coroutine[Any, Any, None]:
        ...
        return self.execute(command)

    def exists(self, *args) -> Coroutine[Any, Any, bool]:
        ...
        return self.execute(command)

    # ... and many MANY more ...

class RedisPipeline(Redis):
    def execute(self, command) -> None:
        self.pipeline.append(command)
I tried numerous options to make Coroutine[Any, Any, xxx] generic, but nothing seems to work.
Is there any way around this with python 3.8 and latest mypy? If not, a solution would be wonderful - as far as I can think, my only other route for proper types is a script which copies and pastes the entire class and changes the return types in code.
@samuelcolvin I don't think this question belongs in this issue. The reason for the failure (knowing nothing about Redis but going purely by the code you posted) is that in order to make this work, the base class needs to switch to an Optional result, i.e.
    def execute(self, command) -> Optional[Coroutine[Any, Any, Union[None, str, int, float]]]:
I get that, but I need all the public methods to definitely return a coroutine. Otherwise, if it returned an optional coroutine, it would be extremely annoying to use.
What I'm trying to do is modify the return type of many methods on the sub-classes, including "higher kind" types which are parameterised.
Hence thinking it related to this issue.
Honestly I have no idea what higher-kinded type vars are -- my eyes glaze over when I hear that kind of talk. :-)
I have one more suggestion, then you're on your own. Use a common base class that has an Optional[Coroutine[...]] return type and derive both the regular Redis class and the RedisPipeline class from it.
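The common-base-class suggestion above can be sketched roughly as follows. This is only an illustration: the name BaseCommands, the string-based commands, and the fake coroutine are assumptions made for the example, not code from the actual redis client.

```python
import asyncio
from typing import Any, Coroutine, List, Optional


class BaseCommands:
    """Hypothetical shared base: a command may or may not return a coroutine."""

    def execute(self, command: str) -> Optional[Coroutine[Any, Any, Any]]:
        raise NotImplementedError

    def get(self, key: str) -> Optional[Coroutine[Any, Any, Any]]:
        # Every public command simply forwards whatever execute() returns.
        return self.execute(f"GET {key}")


class Redis(BaseCommands):
    def execute(self, command: str) -> Coroutine[Any, Any, Any]:
        # Stand-in for a real network round trip.
        async def run() -> str:
            return f"result of {command}"

        return run()


class RedisPipeline(BaseCommands):
    def __init__(self) -> None:
        self.pipeline: List[str] = []

    def execute(self, command: str) -> None:
        # Queue the command instead of running it.
        self.pipeline.append(command)
```

Both overrides of execute narrow the return type, which is allowed. The cost is that get (and every other shared method) is still annotated as Optional on Redis, so callers have to check for None or cast before awaiting.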
Okay, so the simple answer is that what I'm trying to do isn't possible with python types right now.
Thanks for helping - at least I can stop my search.
I suspect that the reason is that it simply isn't type-safe, and you couldn't do it (using subclassing) in any other typed language either.
humm, but the example above under "Basic example" I would argue IS type-safe.
All the methods which end with return self.execute(...) return what execute returns - either a Coroutine or None.
Thus I don't see how this is any more "unsafe" than normal use of generics.
@gvanrossum, I can relate!
I wonder if bidict provides a practical example of how this issue prevents expressing a type that you can actually imagine yourself needing.
>>> element_by_atomicnum = bidict({0: "hydrogen", 1: "helium"})
>>> reveal_type(element_by_atomicnum) # bidict[int, str]
# So far so good, but now consider the inverse:
>>> element_by_atomicnum.inverse
bidict({"hydrogen": 0, "helium": 1})
What we want is for mypy to know this:
>>> reveal_type(element_by_atomicnum.inverse) # bidict[str, int]
merely from a type hint that we could add to a super class. It would parameterize not just the key type and the value type, but also the self type. In other words, something like:
KT = TypeVar('KT')
VT = TypeVar('VT')

class BidirectionalMapping(Mapping[KT, VT]):
    ...
    def inverse(self) -> $SELF_TYPE[VT, KT]:
        ...

where $SELF_TYPE would of course use some actually legal syntax that allowed composing the self type with the other parameterized types.
Okay, I think that example is helpful. I recreated it somewhat simpler (skipping the inheritance from Mapping and the property decorators):
from abc import abstractmethod
from typing import *

T = TypeVar('T')
KT = TypeVar('KT')
VT = TypeVar('VT')

class BidirectionalMapping(Generic[KT, VT]):
    @abstractmethod
    def inverse(self) -> BidirectionalMapping[VT, KT]:
        ...

class bidict(BidirectionalMapping[KT, VT]):
    def __init__(self, key: KT, val: VT):
        self.key = key
        self.val = val

    def inverse(self) -> bidict[VT, KT]:
        return bidict(self.val, self.key)

b = bidict(3, "abc")
reveal_type(b)  # bidict[int, str]
reveal_type(b.inverse())  # bidict[str, int]
This passes but IIUC you want the ABC to have a more powerful type. I guess here we might want to write it as
def inverse(self: T) -> T[VT, KT]: # E: Type variable "T" used with arguments
Have I got that?
Exactly! It should be possible to e.g. subclass bidict (without overriding inverse), and have mypy realize that calling inverse on the subclass gives an instance of the subclass (with the key and value types swapped as well).
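Until something like T[VT, KT] is expressible, the usual fallback is to re-annotate inverse in each subclass. A minimal sketch, assuming toy classes (the frozenbidict below is a hypothetical stand-in that only shares a name with the real bidict library class):

```python
from typing import Generic, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


class bidict(Generic[KT, VT]):
    def __init__(self, key: KT, val: VT) -> None:
        self.key = key
        self.val = val

    def inverse(self) -> "bidict[VT, KT]":
        # The annotation can only name bidict itself, not "the class of self".
        return bidict(self.val, self.key)


class frozenbidict(bidict[KT, VT]):
    # Workaround: repeat the method just to get the more precise return type.
    def inverse(self) -> "frozenbidict[VT, KT]":
        return frozenbidict(self.val, self.key)
```

This type-checks, but every subclass has to carry its own copy of the method, which is exactly the duplication the feature request wants to avoid.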
This isn’t only hypothetically useful, it’d really be useful in practice for the various subclasses in the bidict library where this actually happens (frozenbidict, OrderedBidict, etc.).
Glad this example was helpful! Please let me know if there’s anything further I can do to help here, and (can’t help myself) thanks for creating Python, it’s such a joy to use.
Ah, so the @abstractmethod is also a red herring.
And now I finally get the connection with the comment that started this issue.
But I still don't get the connection with @samuelcolvin's RedisPipeline class. :-(
I would also say that this example is really simple, common, but not supported:
def create(klass: Type[T], value: K) -> T[K]:
    return klass(value)

We use quite a lot of similar constructs in dry-python/returns.
As a workaround I am trying to build a plugin with emulated HKT, just like in some other languages where support of it is limited. Like:
bow: https://bow-swift.io/docs/fp-concepts/higher-kinded-types/
fp-ts: https://github.com/gcanti/fp-ts/blob/master/src/HKT.ts
Paper on "Lightweight higher-kinded polymorphism": https://www.cl.cam.ac.uk/~jdy22/papers/lightweight-higher-kinded-polymorphism.pdf
TLDR: So, instead of writing T[K] we can emulate this by using HKT[T, K], where HKT is a basic generic instance processed by a custom mypy plugin. I have been working on this plugin for some time now, but there's still nothing to show. You can track the progress here: https://pypi.org/project/kinds/ (part of dry-python libraries)
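To make the HKT[T, K] idea concrete, here is a rough sketch of the encoding without any plugin. All names here (HKT, Box, fmap) are made up for the illustration and are not the API of kinds or returns; without plugin support a type checker sees the result only as HKT[F, B] rather than as the concrete container.

```python
from typing import Callable, Generic, TypeVar

F = TypeVar("F")  # the "brand" that stands for the container type
A = TypeVar("A")
B = TypeVar("B")


class HKT(Generic[F, A]):
    """Stand-in for the inexpressible F[A]."""

    def map(self, func: Callable[[A], B]) -> "HKT[F, B]":
        raise NotImplementedError


class Box(HKT["Box", A]):
    """A trivial container that brands itself as 'Box'."""

    def __init__(self, value: A) -> None:
        self.value = value

    def map(self, func: Callable[[A], B]) -> "Box[B]":
        return Box(func(self.value))


def fmap(func: Callable[[A], B], fa: HKT[F, A]) -> HKT[F, B]:
    # Written once for every branded container; F is carried through unchanged.
    return fa.map(func)
```

The brand F is what lets the signature of fmap say "same container, different element type". Turning HKT[F, B] back into F[B] for the caller is the part that needs a plugin or real language support.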
But I still don't get the connection with @samuelcolvin's RedisPipeline class. :-(
Sorry if I wasn't clear. I'll try again to explain:
I have a class with many (~200) methods, they all return coroutines with different result types (None, bytes, str, int or float). I want a subclass with the same internal logic but where all those methods return None - I can do this in python, but not with type hints currently (here's the actual code if it helps)
So roughly I want:
T = TypeVar('T', bytes, str, int, float, 'None')
Result = Coroutine[Any, Any, T]

class Foo:
    def method_1(self, *args) -> Result[str]:
        ...

    def method_2(self, *args) -> Result[None]:
        ...

    def method_3(self, *args) -> Result[bool]:
        ...

    ...
    ...

    def method_200(self, *args) -> Result[int]:
        ...

class Bar:
    def method_1(self, *args) -> None:
        ...

    def method_2(self, *args) -> None:
        ...

    def method_3(self, *args) -> None:
        ...

    ...
    ...

    def method_200(self, *args) -> None:
        ...
Except I don't want to have to redefine all the methods on Bar. Assuming I could create my own AlwaysNone type which even when parameterised told mypy the result would always be None:
class AlwaysNoneMeta(type):
    def __getitem__(self, item) -> None:
        return None

class AlwaysNone(metaclass=AlwaysNoneMeta):
    pass
I think the feature requested here could solve my problem.
In other words if mypy could understand
def generate_cls(OuterType):
    class TheClass:
        def method_1(self, *args) -> OuterType[str]:
            ...
        ...
    return TheClass

Foo = generate_cls(Result)
Bar = generate_cls(AlwaysNone)
I'd be in the money, but quite understandably it can't.
I've currently got a working solution where I generate a .pyi stub file with definitions for all these methods but changing the return type to None (see here) so I'm not in immediate need of this anymore.
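The stub-generation approach can be sketched like this. It is not the script linked above; make_none_stub and the two-method Foo are hypothetical, and the sketch only handles plain functions without parameter annotations.

```python
import inspect
from typing import Any, Coroutine


class Foo:
    def method_1(self, *args) -> Coroutine[Any, Any, str]: ...

    def method_2(self, *args) -> Coroutine[Any, Any, int]: ...


def make_none_stub(cls: type, new_name: str) -> str:
    """Emit .pyi text for a class whose public methods all return None."""
    lines = [f"class {new_name}:"]
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and dunder methods
        # Keep the parameters, swap only the return annotation.
        sig = inspect.signature(func).replace(return_annotation=None)
        lines.append(f"    def {name}{sig}: ...")
    return "\n".join(lines)


stub = make_none_stub(Foo, "Bar")
print(stub)
```

Writing the returned text to a .pyi file next to the module gives the type checker the None-returning signatures without redefining the methods at runtime.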
It looks like you want completely different signatures whose likeness is limited to their names and arguments:
async def do_work(redis_client: Redis):
    await redis_client.get(...)

do_work(RedisPipeline())  # user expects type checker to warn about misuse
It's understandable you're looking to reuse some common "template", but I fail to see what it has to do with higher-kinded types.
Hi @Kentzo, that's incorrect, these are not completely different signatures.
Please review the code here; as you can see these are functions which may return an awaitable object, not coroutines. This approach and similar, while not necessarily obvious to beginners, is relatively common in libraries which make extensive use of asyncio and the standard library asyncio code.
If you're not sure how it works, feel free to submit an issue on that project and I'll endeavour to explain it to you.
In your initial example:
class Redis:
    def execute(self, command) -> Coroutine[Any, Any, Union[None, str, int, float]]:
        ...

class RedisPipeline(Redis):
    def execute(self, command) -> None:

These are different signatures. Functions annotated to accept Redis as an argument would expect execute to return a coroutine.
This is off-topic, please create an issue on async-redis if you want to discuss this more.
In short, yes they're different signatures, but I think my explanation above gives enough detail on what I'm doing and why I think it relates to this issue.
+1 on this. It's hard to say the use cases are rare when it's not even possible to express... There may be countless use cases hiding behind duck typing that simply can't be checked by the type system and therefore certainly aren't recognized as such. Having a type safe generic "Mappable" class as shown in the OP would be really useful.
It's hard to say the use cases are rare when it's not even possible to express
That makes no sense. You can express the OP's example just fine in untyped Python, you just can't express the types correctly. So if there were "countless" real examples we would encounter frequent user requests for type system extensions to allow typing this correctly. I haven't seen a lot of those.
Do you have an actual (non-toy) example?
You actually can have Higher Kinded Types in Python with full mypy support.
Here's how Mappable from the OP looks:
Mappable aka Functor: https://github.com/dry-python/returns/blob/master/returns/interfaces/mappable.py
@sobolevn That's awesome! Thank you!
@gvanrossum Inexpressible in the type system is what I was referring to. My point was that writing this off as rare seems premature since Python's type system is not a hard stop on your ability to keep going. So it might be actually quite common and how would we know? Certainly you're right that it's not a common request though, which is a good point and slightly different than mine. However, it's all moot now because it can be done! Awesome!
Do you have an actual (non-toy) example?
An example would be the dict.fromkeys(...) method, as the output type is variable with respect to both the dict (sub-)class as well as the key type (see https://github.com/python/typeshed/issues/3800). This example can, admittedly, be more or less resolved by explicitly re-annotating the aforementioned method in every single subclass.
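A small runtime illustration of why dict.fromkeys needs both pieces of information; TaggedDict is a hypothetical subclass used only for this example:

```python
from typing import Dict, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class TaggedDict(Dict[K, V]):
    """Hypothetical dict subclass that does not override fromkeys."""


# The result class follows the class fromkeys was called on...
by_name = TaggedDict.fromkeys(["a", "b"], 0)
# ...while the key type follows the iterable that was passed in.
by_id = TaggedDict.fromkeys([1, 2], 0)

print(type(by_name).__name__, list(by_name))
print(type(by_id).__name__, list(by_id))
```

A fully precise annotation would have to say "the calling class, applied to the new key and value types", which is the cls[T, S] shape that a type variable cannot currently express.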
Another, less easily resolved example would be the np.asanyarray(a, dtype=..., ...) function in numpy. Per the documentation: "Convert the input to an ndarray, but pass ndarray subclasses through.".
What this means is that, similar to dict.fromkeys(...), the output is variable with respect to both any passed ndarray subclasses (the a parameter) as well as the type of the embedded scalars (the dtype parameter).
In [1]: import numpy as np
In [2]: class TestArray(np.ndarray):
...: ...
...:
In [3]: array: TestArray = np.array([0, 1, 2]).view(TestArray)
In [4]: array # array of integers
Out[4]: TestArray([0, 1, 2])
In [5]: np.asanyarray(array, dtype=float) # array of floats
Out[5]: TestArray([0., 1., 2.])
Thanks, that was helpful.
Ran into this today:
from typing import Any, Generic, Mapping, TypeVar

ProblemState = TypeVar('ProblemState')
Action = TypeVar('Action')

class Problem(Generic[ProblemState, Action]):
    pass

ProblemT = TypeVar('ProblemT', bound=Problem[ProblemState, Action])

class Solution(Generic[ProblemT, ProblemState, Action]):
    pass
I could have worked around it with an intersection operator:
ProblemU = TypeVar('ProblemU', bound=Problem[Any, Any])

class Solution(Generic[ProblemU, ProblemState, Action]):
    ProblemT = Intersection[ProblemU, Problem[ProblemState, Action]]
Would intersection help in the general case without requiring as much effort?
Just ran into this limitation when trying to type methods of a generic class that return an instance of the same generic class but with other arguments. This is actually very common for classes that can easily have some properties changed, like numpy arrays or torch tensors and their device (gpu or cpu) or dtype (int, float, ...):
DeviceTransferableT = TypeVar("DeviceTransferableT", bound="DeviceTransferable")

class DeviceTransferable(Generic[DeviceT]):
    def to_device(self: DeviceTransferableT, device: Type[DeviceT]) -> DeviceTransferableT[DeviceT]:  # error!
        ...  # some great implementation

class SomeOtherClass(DeviceTransferable):
    """Would prefer not to re-implement `to_device` in every inheriting class."""
Is there any other solution?
@florensacc Can you expand the example so that I can actually run it by mypy? A few things are missing and I can’t guess what their definitions should be.
Here's a full example, which throws two errors: error: Type variable "DeviceTransferableT" used with arguments and Argument 1 to "f" has incompatible type "SomeOtherClass[Device0]"; expected "SomeOtherClass[Device1]".
import attr
import abc
from typing import Type, TypeVar, Generic, ClassVar

class Device(abc.ABC):
    device_id: ClassVar[int]

class Device0(Device):
    device_id = 0

class Device1(Device):
    device_id = 1

DeviceT = TypeVar("DeviceT", bound=Device)
AnotherDeviceT = TypeVar("AnotherDeviceT", bound=Device)
DeviceTransferableT = TypeVar("DeviceTransferableT", bound="DeviceTransferable")

@attr.s()
class DeviceTransferable(Generic[DeviceT]):
    device: DeviceT = attr.ib()

    def to_device(self: DeviceTransferableT, device: Type[AnotherDeviceT]) -> DeviceTransferableT[AnotherDeviceT]:
        return attr.evolve(self, device=device)

@attr.s()
class SomeOtherClass(DeviceTransferable[DeviceT]):
    """Would prefer not to re-implement `to_device` in every inheriting class."""
    pass

C_in0 = SomeOtherClass(device=Device0())
C_in1 = C_in0.to_device(Device1)

def f(c: SomeOtherClass[Device1]) -> None:
    print(f"The device of c is {c.device}, but its type is still SomeOtherClass[Device0]!")

f(C_in1)
Oh, @attr.s :-(
Can you do it without?
sure, you can remove all attr.s stuff and use this class definition:
class DeviceTransferable(Generic[DeviceT]):
    def __init__(self, device: DeviceT):
        self.device = device

    def to_device(self: DeviceTransferableT, device: Type[AnotherDeviceT]) -> DeviceTransferableT[AnotherDeviceT]:
        return type(self)(device=device)

without the "argument" to the TypeVar this code runs happily and prints my message:
The device of c is <class '__main__.Device1'>, but its type is still SomeOtherClass[Device0]!
Okay, I get it. If/when we fix this we'll make sure to have a test like that. Thanks!
We have released a first prototype of Higher Kinded Types emulation for Python.
It is available for everyone to try! Quick demo: https://gist.github.com/sobolevn/7f8ffd885aec70e55dd47928a1fb3e61
Here's what a function's signature will look like:
from returns.primitives.hkt import Kind1, kinded

_InstanceKind = TypeVar('_InstanceKind', bound='HasValue')

@kinded
def apply_function(
    instance: Kind1[_InstanceKind, _ValueType],
    callback: Callable[[_ValueType], _NewValueType],
) -> Kind1[_InstanceKind, _NewValueType]:
    ...
Source code: https://github.com/dry-python/returns Docs: https://returns.readthedocs.io/en/latest/pages/hkt.html and https://sobolevn.me/2020/10/higher-kinded-types-in-python
See https://github.com/ceph/ceph/pull/38953 for a real world use case
Finally found this issue after running into TypeError: 'TypeVar' object is not subscriptable and searching around for the right way to do this. Turns out there isn't. Has there been any change in status for the implementation? The use-case might be limited for the annotations themselves, but if utilized in a couple core libraries, the end-user experience could be greatly improved by the resulting hints in IDEs.
I empathize with @tek, but unfortunately can't leave Python for the strongly-typed shores of Scala or Haskell, due to Python's ML ecosystem and relatively easy learning curve.
Hm, I believe I've finally run into this. Here's my example, let me know if I'm misunderstanding something.
Let's say you're adding static types to a Mongo ODM/ORM-type framework. You define your model using a class.
@dataclass
class User:
    __collection__ = "users"  # This is the "table", or collection in Mongo. Class variable, not instance
    _id: ObjectId  # Every Mongo document has an ID, most often ObjectId, but can be maybe str or something else
    username: str  # A random property
Now you write a function to query for this model. We write a generic protocol for models, parametrizing by the ID type:
ID = TypeVar("ID")

class MongoModel(Protocol[ID]):
    __collection__: str

    @property
    def _id(self) -> ID:
        ...

Our User class is thus a MongoModel. Now the actual querying function. We can query just passing in the ID.
C = TypeVar("C", bound=MongoModel)
def find_one(model_cls: Type[C], id: ???) -> C: ...
There's nothing to put instead of ???, is there?
My first instinct was to try:
C = TypeVar("C", bound=MongoModel)
TID = TypeVar("TID")
def find_one(model_cls: Type[C[TID]], id: TID) -> C: ...
But no dice, obviously.
Second approach: I hoped Mypy would realize MongoModel already references ID (since it's defined with it), so I can write this:
C = TypeVar("C", bound=MongoModel)
def find_one(model_cls: Type[C], id: ID) -> C: ...
But no, Mypy treats them as two separate type variables (in other words, I can pass in an int and it doesn't catch it).
And C cannot be changed to TypeVar("C", bound=MongoModel[ID]) either.
Just to extend the list of people encountering it, I also ran into this. My example is similar to the ones above in spirit but aggravated by extraction of the generic class at runtime, which is possible since python 3.8. I am not sure whether it is officially recommended but I really like this mechanism :). Unfortunately, using it will very often require subscriptable TypeVars, like in the sketch below.
For simplicity I slightly changed the real signature. In real use cases I load the registry from a file and not from prepared list, so polymorphism makes a bit more sense there.
class SettingsRegistry(Generic[TSettings], ABC):
    @classmethod
    def from_serialized_settings(cls: Type[TSettingsRegistry], serialized_settings: List[str]) -> TSettingsRegistry[TSettings]:
        settings_class = get_args(cls.__orig_bases__[0])[0]
        deserialized_settings = [settings_class.deserialize(setting) for setting in serialized_settings]
        return cls(deserialized_settings)
This way I can avoid code duplication but typing TSettingsRegistry[TSettings] is currently disallowed, of course.
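For readers unfamiliar with the runtime extraction trick, here is a self-contained sketch of it with the return annotation left off. IntSetting, Registry and IntRegistry are hypothetical names, and the sketch assumes the generic base is the first entry of __orig_bases__, as in the snippet above.

```python
from typing import Generic, List, TypeVar, get_args

TSettings = TypeVar("TSettings")


class IntSetting:
    """Hypothetical settings type with a deserialize() classmethod."""

    def __init__(self, value: int) -> None:
        self.value = value

    @classmethod
    def deserialize(cls, raw: str) -> "IntSetting":
        return cls(int(raw))


class Registry(Generic[TSettings]):
    def __init__(self, settings: List[TSettings]) -> None:
        self.settings = settings

    @classmethod
    def from_serialized_settings(cls, serialized_settings: List[str]):
        # Registry[IntSetting] is recorded in __orig_bases__ of the subclass,
        # so the concrete settings class can be recovered at runtime
        # (requires Python 3.8+ for typing.get_args).
        settings_class = get_args(cls.__orig_bases__[0])[0]
        return cls([settings_class.deserialize(s) for s in serialized_settings])


class IntRegistry(Registry[IntSetting]):
    pass


registry = IntRegistry.from_serialized_settings(["1", "2"])
print(type(registry).__name__, [s.value for s in registry.settings])
```

The runtime behaviour is fine; what cannot be written is a return annotation that says "cls, parameterised by its own TSettings".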
Just another use for this: use the new TypeGuard to check the types of a collection:
T = TypeVar("T", bound=Collection)
V = TypeVar("V")

def all_elements_type(
    collection: T[object],
    element_type: Type[V],
) -> TypeGuard[T[V]]:
    return all(isinstance(t, element_type) for t in collection)
This is very useful and currently not possible AFAIK.
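What can be written today is a weaker version that narrows to Collection[V] and forgets the concrete container type. A sketch, assuming Python 3.10+ for typing.TypeGuard (with typing_extensions as a fallback on older versions):

```python
from typing import Collection, Type, TypeVar

try:
    from typing import TypeGuard  # Python 3.10+
except ImportError:
    from typing_extensions import TypeGuard  # assumed installed on older versions

V = TypeVar("V")


def all_elements_type(
    collection: Collection[object],
    element_type: Type[V],
) -> TypeGuard[Collection[V]]:
    # Narrows the element type only; a list comes out typed as Collection[V].
    return all(isinstance(t, element_type) for t in collection)


values: Collection[object] = [1, 2, 3]
if all_elements_type(values, int):
    print(sum(values))  # values is Collection[int] inside this branch
```

After the guard a list is no longer known to be a list, so list-specific methods need a cast. That loss is what TypeGuard[T[V]] would avoid.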
Another potential use case example for HKTs:
https://toolz.readthedocs.io/en/latest/_modules/toolz/dicttoolz.html#valmap
I've made my own typed version of it, like so:
K = TypeVar("K")
V = TypeVar("V")
M = TypeVar("M")

def valmap(
    func: Callable[[V], M],
    d: Mapping[K, V],
    factory: Type[MutableMapping[K, M]] = dict,
) -> MutableMapping[K, M]:
    rv = factory()
    rv.update(zip(d.keys(), map(func, d.values())))
    return rv
but in order to preserve type information, I would much rather want to type it as:
K = TypeVar("K")
V = TypeVar("V")
M = TypeVar("M")
F = TypeVar("F", bound=MutableMapping)

def valmap(
    func: Callable[[V], M],
    d: Mapping[K, V],
    factory: Type[F] = dict,
) -> F[K, M]:
    rv = factory()
    rv.update(zip(d.keys(), map(func, d.values())))
    return rv
@gvanrossum makes the point that this isn't a commonly requested feature. I think that is correct to some extent, although the level of activity in this thread is making me slightly reconsider. I think it isn't commonly requested because it is a fairly advanced technique. But, I think my example, as well as many other examples in this thread are showing that fundamental library code in fairly prevalent use would be able to benefit from HKT in the sense that annotations would more accurately describe actual behavior. Often times this library code is maintained by skilled programmers that could leverage HKT if available. This is a benefit that then trickles down to the wider python ecosystem, including users that have never heard of HKT, but might leverage type checkers. You don't need to be a wizard to enjoy more accurate lib function call return types.
@harahu - That is an excellent example. I think you described the situation perfectly. I've been in the exact same spot of wanting to write a transformation over a map-like collection, and resorting to some zany nested type signatures, when really all I wanted to do was turn some D[K, V] into a F[K, M] based on some provided type F and a transform func(v: V) -> M.
HKT is a fairly sophisticated tool that many folks do not even know to reach for, but when you are in a situation which benefits it, many doors are unlocked. HKTs often greatly simplify the type signature, like in the example above. It shows up most frequently when you want to "transmute" some object (typically a container type) into a target (container) type, but you want the internal types to be preserved.
This pattern does not seem to show up frequently when writing application code. Most software engineers probably do not need to know about it if they don't want to. However, when writing library code for others to consume, the pattern shows up quite a lot. Probably because it's a fairly high level abstraction, and that kind of abstraction lends itself to reuse.
In my opinion, HKTs would push the envelope for maintainability and ergonomics for maintainers of libraries that embrace the static typing discipline. In the same vein, it would not add any cognitive overhead for those looking to implement libraries, nor those looking to start writing libraries with static types. If anything, I think it has the potential to shorten the learning curve for those with medium experience with type systems (such as myself). I have definitely gotten myself into some really convoluted type signatures trying to express the transformation I wanted. Desire to use HKTs seems to scale with familiarity with type systems and abstracting over generics.
I think it has the potential to shorten the learning curve for those with medium experience with type systems (such as myself).
💯 When you have generics but not HKT, you end up having to twist yourself into knots trying to express what would otherwise be simple. HKT is merely generics that works everywhere instead of generics that only works in some places.
aka type constructors, generic TypeVars
Has there already been discussion about those? I do a lot of FP that results in impossible situations because of this. Consider an example:
I haven't found a way to make this work, does anyone know a trick or is it impossible? If not, consider the syntax as a proposal. Reference implementations would be Haskell, Scala. optimally, the HK's type param would be indexable as well, allowing for
F[X[X, X], X[X]]
Summary of current status (by @smheidrich, 2024-02-08):
peps repo. The stub PEP draft so far contains a few examples of the proposed syntax.