Open ggoretkin-bdai opened 1 year ago
Hey @ggoretkin-bdai,
I'm afraid that there might not be a perfect solution for this. One way is to automatically apply `Val`, like this:
```python
from functools import wraps

from plum import Val, dispatch


class Type1:
    def __init__(self, data):
        self.data = data


class Type2:
    def __init__(self, data):
        self.data = data


def makeval(f):
    @wraps(f)
    def f_wrapped(*args, **kw_args):
        return f(Val(args[0]), *args[1:], **kw_args)

    return f_wrapped


@makeval
@dispatch
def convert(to: Val[Type2], from_: Type1) -> Type2:
    return Type2(from_.data)


x = convert(Type2, Type1("arguments"))
assert isinstance(x, Type2)
assert x.data == "arguments"
```
Note also that Plum implements a `convert` function, which roughly works in this way.
I missed the `convert` function. Thanks!
Suppose I want to avoid having to choose a name like `rational_to_number` (in the example from the documentation, copied below), especially because I do not plan to call it directly (and if I do, I will do it with plum's `invoke`).
```python
from operator import truediv

from plum import conversion_method


@conversion_method(type_from=Rational, type_to=Number)
def rational_to_number(q):
    return truediv(q.num, q.denom)
```
Do you have a recommendation? Can I add a dispatch directly to https://github.com/beartype/plum/blob/80b614f3c820f3107b0a67aed63d066e6d8f0b25/plum/promotion.py#L41-L46 ? If I'm following along correctly, `@_dispatch` exists to introduce a separate namespace, specific to Plum.
@ggoretkin-bdai, the name `rational_to_number` actually does not matter. You should be able to use a generic name in this way:
```python
@conversion_method(type_from=Rational, type_to=Number)
def perform_conversion(x):
    return truediv(x.num, x.denom)


@conversion_method(type_from=Real, type_to=Number)
def perform_conversion(x):
    ...  # Do something here.
```
In fact, you can even give it a lambda: `conversion_method(Rational, Number)(lambda x: truediv(x.num, x.denom))`.
After having defined these, you should be able to use `convert` as in the example.
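The reason the generic name works is that dispatch registers conversions by signature rather than by function name. A toy, plum-free sketch of that mechanism (a hypothetical registry, not plum's actual implementation) might look like:

```python
# Hypothetical toy registry, NOT plum's implementation: conversions are
# keyed on the (type_from, type_to) pair, so the function's __name__ is
# irrelevant and may be reused freely.
registry = {}


def conversion_method(type_from, type_to):
    def register(f):
        registry[(type_from, type_to)] = f
        return f  # The name bound at the call site is never looked up again.
    return register


def convert(obj, type_to):
    # Look up the conversion by the runtime type of obj and the target type.
    return registry[(type(obj), type_to)](obj)


class Rational:
    def __init__(self, num, denom):
        self.num, self.denom = num, denom


@conversion_method(Rational, float)
def perform_conversion(q):
    return q.num / q.denom


assert convert(Rational(1, 2), float) == 0.5
```

Because the registry key is the type pair, registering a second `perform_conversion` for a different pair simply adds another entry rather than clobbering the first.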
@ggoretkin-bdai I'll report back once I can test this, but I'm pretty sure this is exactly the use case of `typing.Type`. I've linked to the mypy docs there, but it's based off of a PEP, so beartype (and consequently, plum) should work with these kinds of annotations out-of-the-box.
@rtbs-dev thanks for pointing this out! I gave it a try here and it seems to work. Unfortunately, the performance is worse than using `plum.convert`, but I have not dug in.

Profiling reveals that `is_bearable` (`beartype/door/_doorcheck.py:317`) is the hot spot.
I believe it would be ideal for `plum.convert` to be implemented in terms of `typing.Type`, if the performance can match.
Unfortunately, the SVG output from `py-spy` is not interactive once embedded into a GitHub comment. Here it is as a zip file to block embedding:

profile.zip
@rtbs-dev, indeed thanks for pointing this out! Really nice that it seems to work!!
@ggoretkin-bdai It's unfortunate that the performance is not great. I think that getting the performance on par with `plum.convert` is tricky, and might even be impossible. I'm thinking the following.

Currently, caching happens based on the types of the arguments. Perhaps we could extend this hashing to values. I.e., if `f(x)` is called, caching is currently based on `type(x)`, but we could extend this to `id(x)` or `hash(x)`. The former should always be cheap, whereas the latter may be more expensive. Clearly, if `f` is called a lot, this would create a massive cache, so something like an LRU cache with limited size seems like a good option here.

How would something like that sound?
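To make the idea concrete, here is a minimal sketch of such a bounded, value-aware cache. This is purely illustrative: the `ValueAwareCache` class and its key scheme are invented for this comment and are not plum code.

```python
from collections import OrderedDict


class ValueAwareCache:
    """Toy LRU cache whose key includes id(x) for type-valued arguments,
    so f(TypeA) and f(TypeB) get distinct entries while the cache size
    stays bounded."""

    def __init__(self, maxsize=128):
        self._store = OrderedDict()
        self._maxsize = maxsize

    def _key(self, args):
        # Include the identity of the value only when it is a class;
        # for ordinary instances, the runtime type suffices.
        return tuple(
            (type(a), id(a)) if isinstance(a, type) else (type(a), None)
            for a in args
        )

    def get(self, args):
        k = self._key(args)
        if k in self._store:
            self._store.move_to_end(k)  # Mark as recently used.
            return self._store[k]
        return None

    def put(self, args, value):
        self._store[self._key(args)] = value
        if len(self._store) > self._maxsize:
            self._store.popitem(last=False)  # Evict the least recently used.


cache = ValueAwareCache(maxsize=2)
cache.put((int,), "method-for-int")
cache.put((str,), "method-for-str")
assert cache.get((int,)) == "method-for-int"   # Refreshes (int,).
cache.put((float,), "method-for-float")        # Evicts the LRU entry, (str,).
assert cache.get((str,)) is None
```

Using `id` keeps the key cheap to compute, at the cost of pinning cache entries to object identity rather than equality, which is fine for class objects since they are effectively singletons.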
> I believe it would be ideal for `plum.convert` to be implemented in terms of `typing.Type`, if the performance can match.
@ggoretkin-bdai Just so I'm clear, you are trying to add a dispatch rule to the base `convert` method (which is supplied by plum) and want to specify it via the `typing.Type` annotation?
Like @wesselb said, the current decorator `@conversion_method` expects a `type_from: Type[from_T]` and a `type_to: Type[to_T]`, where `from_T` and `to_T` could be, e.g., type variables with bounds. This reflects the mypy style, so that if you didn't have plum and needed to replicate it, the convert function could look like this:
```python
from_T = TypeVar("from_T", bound=Number)
to_T = TypeVar("to_T", bound=Number)


def convert_to_rational(
    from_class: type[from_T],
    to_class: type[to_T],
) -> Callable[[from_T], to_T]:
    ...


@convert_to_rational(from_class=Rational, to_class=Number)
def rational_to_number(num: Rational) -> Number:
    ...
```
This matches the Julia syntax and is something that plum's `convert` takes care of for you. You should never have to write out the `type[from_T]` or `type[to_T]`, because it's implicit when you call the `@add_conversion_method` decorator. It wouldn't quite be right to send a `type[]` annotation to a conversion function definition, however. This is because the type of `type[Number]` is `object`, which beartype would say is wrong for a convert function expecting a `type[Number]`. Note that my `rational_to_number` definition does NOT use `type[Rational]` but `Rational`, because that function expects stuff whose type is `Rational`, not stuff whose type is `type[Rational]`. It's the decorator that expects a `type[T]`, which in this case is already defined for you.

AFAICT, it's the right thing for the type signatures in the dispatcher to use the `type[T]`, while the dispatched functions use the types themselves, without the `type[]` wrapping. This:
```python
@dispatch
def convert(to: Val[Type2], from_: Type1) -> Type2:
    return Type2(from_.data)
```
shouldn't ever really be necessary.
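The `type[T]`-versus-`T` distinction above can be shown with plain annotations (toy functions, purely illustrative, no plum involved): the dispatcher-facing callable receives a class object, while the conversion function receives an instance.

```python
from typing import Type, TypeVar

T = TypeVar("T")


def expects_a_class(cls: Type[T]) -> str:
    # A Type[T] parameter is a class object, e.g. `int` itself.
    assert isinstance(cls, type)
    return cls.__name__


def expects_an_instance(x: int) -> int:
    # A plain parameter annotated with a type (here int) is an instance, e.g. 3.
    return x + 1


assert expects_a_class(int) == "int"
assert expects_an_instance(3) == 4
```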
@wesselb On the other hand, it may be useful for folks to be able to specify type bounds on a "custom" conversion dispatcher (I noticed your conversion/promotion code uses a global dispatcher specific to that file).
I see on line 33 that you're looking to avoid `type()` calls, and while I don't know if it's possible in general, it's the purpose that the new beartype plugin API will fit: the `__beartype_hint__` protocol will allow users to specify directly what the type hint of a "thing X" that they defined should be, a priori. So you'd be able to access the "type" of a thing immediately without running the `type()` function, if the object passed conforms to the `__beartype_hint__` protocol.

...I realize now that this is kinda what we've been talking about for ages, since my original proposal back pre-plum-2, so it's fun to see this lovely `type()` function popping up again! :sweat_smile:
> @ggoretkin-bdai Just so I'm clear, you are trying to add a dispatch rule to the base `convert` method (which is supplied by plum) and want to specify it via the `typing.Type` annotation?
Yes, and furthermore, remove the special-case behavior here: https://github.com/beartype/plum/blob/80b614f3c820f3107b0a67aed63d066e6d8f0b25/plum/promotion.py#L33-L34

It should be possible to call `convert` directly, instead of invoking the specific method with `type_to`.
To define a new convert method:

```python
from typing import Type

from plum import dispatch, convert


@dispatch
def convert(from_: Type1, to: Type[Type2]) -> Type2:
    return Type2(from_.data)
```
To invoke it:

```python
convert(Type1("blah"), Type2)
```
No need for any additional decorators, or to invoke methods directly.

(This matches Julia, except it swaps the order of the arguments.)
> Currently, caching happens based on the types of the arguments. Perhaps we could extend this hashing to values. I.e., if `f(x)` is called, caching is currently based on `type(x)`, but we could extend this to `id(x)` or `hash(x)`. The former should always be cheap, whereas the latter may be more expensive. Clearly, if `f` is called a lot, this would create a massive cache, so something like an LRU cache with limited size seems like a good option here.
I think the caching you're discussing is this one: https://github.com/beartype/plum/blob/80b614f3c820f3107b0a67aed63d066e6d8f0b25/plum/function.py#L374, and you're proposing caching on `self._cache[target]` instead of `self._cache[types]`.
I think we want the following two situations to appear identical to plum and its caching:
```python
from plum import dispatch


@dispatch
def f(x: int):
    return 2 * x


for i in range(1000):
    f(i)
```
and
```python
for i in range(1000):
    f(42)
```
One idea: use something like
```python
def argkey(x):
    if isinstance(x, type):
        return (type(x), x)
    else:
        return (type(x), None)
```
but I don't know if it would work with `invoke`. There is no `x`/value to use in https://github.com/beartype/plum/blob/80b614f3c820f3107b0a67aed63d066e6d8f0b25/plum/function.py#L409. We could make it so that it is not possible to use the existing `invoke` interface to invoke methods that dispatch on `typing.Type`.
After reading the Julia docs a bit, I have a few notes:
```julia
Base.convert(::Type{CompoundPeriod}, x::Period) = CompoundPeriod(Period[x])
```

is, AFAICT, the way that multiple dispatch occurs in Julia; i.e., we are telling the dispatcher to assign the `CompoundPeriod` constructor, called with argument `Period[x]`, to the `Base.convert` function whenever a `::Type{CompoundPeriod}, x::Period` type signature is encountered in its arguments.
In the future we would be able to call convert with `convert(CompoundPeriod, x=myPeriodInstance)`. (NB: we don't use `Type{CompoundPeriod}` when actually calling the `convert` method, but we need to use it when defining a new dispatch rule.)
So the equivalent statement to the Julia above, in plum, would be

```python
plum.add_conversion_method(type_from=Period, type_to=CompoundPeriod, f=CompoundPeriod)
```
However, if we wanted to be more Julia-esque, we could ignore this convenience function and modify the dispatcher directly, like this:

```python
from plum import convert


@convert.dispatch
def convert(x: Period, to_type: type[CompoundPeriod]):  # No unnamed variable type annotations!
    return CompoundPeriod(x)
```
which is exactly the type of signature we used in Julia, with the result being that we call `convert` like `convert(myPeriodInstance, CompoundPeriod)`, not using `type[CompoundPeriod]`.
A few thoughts:
Personally I find `add_conversion_method` to be fairly nice, in that I don't need to interact with the dispatcher directly, and I can use plum's conversion via its own conversion dispatcher without interfering with my own namespace's dispatcher, because `promotion.py` uses a `_dispatch = Dispatcher()`, which is a locally-scoped dispatcher! Consequently, I don't quite have enough understanding of how the `@myfunc.dispatch` syntax works, except to literally copy exactly what `add_conversion_method` does already:

```python
@_convert.dispatch
def perform_conversion(obj: type_from, _: type_to):
    return f(obj)
```

All that's missing here is the type annotations, which are worked around in the wrapped function using `resolve_type_hint`.
@wesselb you can probably get significant speed-ups in your conversion dispatch by making the base definition take advantage of beartype checks on `Type`, or, failing that, the `beartype.vale.IsSubclass` API, rather than an `if is_bearable: ...` style for this function:
```python
T = TypeVar("T", bound=Type)


@_dispatch
@beartype
def _convert(
    obj: Annotated[object, IsInstance[T]],
    type_to: Annotated[object, IsSubclass[T]],
) -> T:
    # _convert(obj: T, type_to: Type[T]) -> T
    return obj
```
(This being the trivial case where `obj` is already a `type_to`.)
@leycec am I right on that? I haven't used the `IsSubclass[Type]` pattern before, so I'm not 100% on this... I just remember if/else being a lot slower in the beartype backend than the Validator API, as a rule. Also... I don't remember if type variables work inside an `Annotated` like that, lol. :sweat_smile:
In terms of syntax, I think that something like
```python
@convert.dispatch
def convert(to: Type[A], b: B):
    ...
```
and then calling it as `convert(A, b)` would be most similar to how Julia does it.
> (This matches Julia, except it swaps the order of the arguments.)
Yeah, this is very painful. :( I've been wanting to swap around the arguments for a long time, but changing that now seems like a terrible idea.
> @wesselb On the other hand, it may be useful for folks to be able to specify type bounds on a "custom" conversion dispatcher (I noticed your conversion/promotion code uses a global dispatcher specific to that file).
@rtbs-dev, what do you mean by this? Could you give an example?
> I see on line 33 that you're looking to avoid `type()` calls, and while I don't know if it's possible in general,
Ahh! That's very interesting. I will have a closer look at this API!
> One idea: use something like
@ggoretkin-bdai, yes, that's not a bad idea! I think something like that could work. You're right that it wouldn't work with `invoke`, so we'd have to redesign that part.
> Personally I find the `add_conversion_method` to be fairly nice,
@rtbs-dev, whatever we do, it should be possible to keep this interface, so a user could use it if they want to.
> @wesselb you can probably get significant speed-ups in your conversion dispatch by making the base definition take advantage of beartype checks on `Type`,
Interesting! I agree that there might be something to gain there. However, I suspect that if we want to get this really fast, we really have to cache the whole dispatch process.
@ggoretkin-bdai I took a minute to hack together `id`-based caching.

Consider the following benchmark script:
```python
import timeit
from typing import Type

from plum import dispatch, convert, add_conversion_method


class A:
    def __init__(self, x):
        self.x = x


class B:
    def __init__(self, x):
        self.x = x


@dispatch
def my_convert(to: Type[A], b: B):
    return A(b.x)


add_conversion_method(B, A, lambda b: A(b.x))


def convert_no_overhead(to, b):
    return A(b.x)


b = B(1)

print(timeit.timeit(lambda: my_convert(A, b), number=10000) / 10000 * 1e6, "us")
print(timeit.timeit(lambda: convert(b, A), number=10000) / 10000 * 1e6, "us")
print(timeit.timeit(lambda: convert_no_overhead(A, b), number=10000) / 10000 * 1e6, "us")
```
This gives the output:
```
2.728277499999998 us
7.6240096000000035 us
0.5349711999999895 us
```
Hence, I think something like the `argkey` which you're proposing might be the nicest way to go about this. I'm just wondering if the `isinstance(x, type)` is truly robust and the most correct way to go about it.
I want to check if we're aligned in the high-level philosophy here, so I'm going to try to explain my understanding of the situation.
It's clear to me that plum should have a `convert` function, and related promotion functionality, the same way that Julia has `Base.convert`. The multiple-dispatch paradigm is hampered if each Python package defines its own `convert` in its own namespace.
It's also clear to me that, up until now, plum has required special treatment for these conversion methods, because there has not been a clear and efficient way to dispatch on `convert(x, TypeFoo)` and `convert(x, TypeBar)`. I see the "# TODO: Can we implement this without using `type`?!" comment as recognition of this special treatment, and the desire to avoid it.
If we are able to find a satisfactory solution to the general problem of dispatching on `convert(x, TypeFoo)` and `convert(x, TypeBar)` (which, to be clear, is different from the problem of dispatching on `convert(x, my_foo)` and `convert(x, my_bar)`), then we remove the need for special treatment.
To reiterate, plum should still define a convert function (because any Python package doing multiple dispatch will have access to the shared plum namespace), but there will be no need for special handling. Right now, a user needs to use `plum.add_conversion_method`, and the user should not add methods to `plum._convert` or call `plum._convert`. Furthermore, I don't think it's correct, in the current system, to add any dispatches to `plum.convert` ("modify the dispatcher directly" in https://github.com/beartype/plum/issues/85#issuecomment-1625573211), because then there are two levels of dispatching that happen, one at `convert` and the other at `_convert`.
It is amazing that there is a `typing.Type`, that beartype handles it correctly, and that plum therefore works "out-of-the-box" with this. I don't think this was known when `promotion.py` was being written, because I think it would have changed the design to avoid the special treatment giving different semantics to method invocation. My notional model for the standard semantics is something like

```python
some_function.invoke(type(x), type(y))(x, y)
```

and the alternate semantics are what we see here: https://github.com/beartype/plum/blob/80b614f3c820f3107b0a67aed63d066e6d8f0b25/plum/promotion.py#L33-L34C12
Note that my notional model is not what happens in Julia. You can't just take the `typeof` each of the arguments and dispatch on that information; otherwise you would not be able to dispatch on `convert(Int, x)` vs `convert(String, x)`.
Note also that Julia has the following situation:
```julia
julia> typeof(Int)
DataType

julia> Int isa DataType
true

julia> Int isa Type{Int}
true

julia> Type{Int} <: DataType  # subtyping relation
true
```
Trying to do the analogous thing in Python:
```python
In [1]: import typing

In [2]: type(int)
Out[2]: type

In [3]: isinstance(int, type)
Out[3]: True

In [4]: isinstance(int, typing.Type[int])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 isinstance(int, typing.Type[int])

File /usr/lib/python3.10/typing.py:994, in _BaseGenericAlias.__instancecheck__(self, obj)
    993 def __instancecheck__(self, obj):
--> 994     return self.__subclasscheck__(type(obj))

File /usr/lib/python3.10/typing.py:997, in _BaseGenericAlias.__subclasscheck__(self, cls)
    996 def __subclasscheck__(self, cls):
--> 997     raise TypeError("Subscripted generics cannot be used with"
    998                     " class and instance checks")

TypeError: Subscripted generics cannot be used with class and instance checks

In [5]: issubclass(typing.Type[int], type)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 issubclass(typing.Type[int], type)

TypeError: issubclass() arg 1 must be a class
```
That's all to say that in Julia, the notional model is simpler because, given some arguments, you are looking for the method with the most specific "types" match for the "arguments" via the Julia `arg isa type` relation (i.e., `Int isa Type{Int}`). In Python, this doesn't work, because `typing.Type` is only for annotation purposes. Perhaps beartype defines, or should define, something like `isinstanceof` that would return true in `isinstanceof(int, typing.Type[int])`.
I have been summoned... but have no idea what I'm doing! I'm painfully out of my intellectual depth in deep academic discourse like this. That said...
> Profiling reveals that `is_bearable` (`beartype/door/_doorcheck.py:317`) is the hot spot.
ohgods

`is_bearable()` is actually fairly well-optimized. So, this saddens me. I can't quite recall if there's any additional low-hanging efficiency fruit I can pluck for that function. Until my life erupts in copious free time, we might be momentarily "stuck" with that hot spot. :hot_face:
> @wesselb you can probably get significant speed-ups in your conversion dispatch by making the base definition take advantage of beartype checks on `Type`, or, failing that, the `beartype.vale.IsSubclass` API, rather than an `if is_bearable: ...` style for this function:
Indeed! @beartype already deeply supports `typing.Type[...]` type hints in O(1) time with negligible constants. That's good. Sadly, this...
```python
T = TypeVar("T", bound=Type)


@_dispatch
@beartype
def _convert(
    obj: Annotated[object, IsInstance[T]],
    type_to: Annotated[object, IsSubclass[T]],
) -> T:
    # _convert(obj: T, type_to: Type[T]) -> T
    return obj
```
...is probably less good. Neither the `IsInstance[...]` nor `IsSubclass[...]` validators currently support type variables (i.e., `TypeVar` objects), which sorta scuttles that. The original function signature `def _convert(obj: T, type_to: Type[T]) -> T:` is still probably the fastest and most accurate means of expressing those type constraints.

Probably. That said, I wear a bear hat while walking about the house. I am not to be trusted.
@ggoretkin-bdai I agree with your message on all points! :)
I also think that Plum should define a `convert` function, and ideally this function should be defined without any `invoke`-hacks, which thus far have been required. With slightly more clever caching, the IMO ideal and much simpler construction with `typing.Type` might just work. I indeed wasn't aware of `typing.Type` when I wrote `promotion.py`.
> Trying to do the analogous thing in Python:
@ggoretkin-bdai, using `beartype.door.TypeHint`, things actually do work analogously:
```python
>>> from beartype.door import TypeHint, is_bearable
>>> import typing
>>> is_bearable(int, type)
True
>>> is_bearable(int, typing.Type[int])
True
>>> TypeHint(typing.Type[int]) <= TypeHint(type)
True
```
@leycec, it's the bearman! Thanks for confirming what the fastest way of doing this is. In this case, IMO the issue lies with Plum, which doesn't yet implement the optimal caching behaviour.
To summarise, to move forward, I would propose the following two steps:
1. Agree on a way to implement caching that makes `typing.Type` fast. I think that @ggoretkin-bdai's `argkey` approach sounds very sensible, but the situation is slightly more complicated due to the notion of "faithfulness". I'll have to think about this more carefully.
2. Rework `convert` to the much simpler approach of using `typing.Type`, but keep the `add_conversion_method` decorator as an interface (@rtbs-dev).
Idiomatic Julia defines methods such as the following:

To dispatch to this method, the invocation looks like:

What is the equivalent of `Type` (in the definition) that is available in plum? I considered two attempts, using `plum.Val` and `typing.Literal`. I roughly understand `Val` to be a stopgap until `Literal` was available. But it's not possible to instantiate a `typing.Literal`.

Attempt 1 error:

Attempt 2 error:

Related: https://github.com/beartype/plum/issues/70