beartype / plum

Multiple dispatch in Python
https://beartype.github.io/plum
MIT License

Is there a way to make pycharm understand what plum does? #89

Open gabrieldemarmiesse opened 1 year ago

gabrieldemarmiesse commented 1 year ago

From the readme's example:

from numbers import Number

from plum import dispatch

@dispatch
def f(x: str):
    return "This is a string!"

@dispatch
def f(x: int):
    return "This is an integer!"

@dispatch
def f(x: Number):
    return "This is a general number, but I don't know which type."

PyCharm doesn't understand the signature of the function f. It also can't jump to the source of the method that matches the argument type. Is it possible to make PyCharm work better with plum?

wesselb commented 1 year ago

Hey @gabrieldemarmiesse!

Unfortunately I'm not aware of a way to make PyCharm play nice with Plum. :( Multiple dispatch is a programming pattern that's unfortunately poorly supported by existing Python tooling, not just PyCharm but also e.g. mypy.

The next big step forward for the package would be to write plugins for e.g. mypy that support the multiple dispatch pattern. I'm not sure how difficult this would be, but I'm suspecting that it won't be trivial.

I'm sorry to reply without a solution. I'd love to see support for Plum in PyCharm.
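
To illustrate why static tools struggle with this pattern, here is a minimal sketch of a single-argument dispatcher. It is a toy stand-in and not Plum's implementation; the names toy_dispatch and _methods are hypothetical. The point is that the name f is rebound by every decorated definition and the choice of method only happens at call time, so a tool reading the source has little to go on beyond the last definition.

```python
import inspect
from numbers import Number

_methods = {}  # hypothetical registry: function name -> list of (type, implementation)


def toy_dispatch(func):
    """Toy stand-in for a dispatch decorator (an assumption, not Plum's code)."""
    first_param = next(iter(inspect.signature(func).parameters.values()))
    methods = _methods.setdefault(func.__qualname__, [])
    methods.append((first_param.annotation, func))

    def dispatcher(x):
        matches = [(t, impl) for t, impl in methods if isinstance(x, t)]
        # Pick the match whose type is a subclass of every other matching type.
        for t, impl in matches:
            if all(issubclass(t, other) for other, _ in matches):
                return impl(x)
        raise TypeError(f"no unambiguous method for {type(x).__name__}")

    return dispatcher


@toy_dispatch
def f(x: str):
    return "string"


@toy_dispatch
def f(x: int):  # rebinds the name f; the str method lives on only in the registry
    return "integer"


@toy_dispatch
def f(x: Number):
    return "number"


results = [f("a"), f(1), f(1.5)]
```

Here f(1) matches both the int and the Number method, and the sketch picks int because it is the more specific type.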

gabrieldemarmiesse commented 1 year ago

No worries, I like plum a lot, but the lack of IDE integration prevents us from using it in our company, so yeah. But I can see the potential of this package. I think this was a missing feature of Python.

wesselb commented 1 year ago

It's a shame, albeit completely reasonable of course, to hear that the lack of IDE integration is what's preventing adoption. I've been meaning to look into how difficult some reasonable degree of integration would be. Once I find some time to look into this, I'll update you in this issue!

machow commented 1 year ago

I don't know how nuts this is in practice, but what about something like this?

from typing import TYPE_CHECKING
from plum import dispatch

if TYPE_CHECKING:
    from typing import overload
else:
    def overload(f):
        return dispatch(f)

@overload
def f(x: int) -> int:
    return x

@overload
def f(x: str) -> str:
    return x

If plum defined overload like above (simply re-exporting typing.overload if type checking), I wonder if a user could do...

from plum import overload

@overload
def f(x: int) -> int:
    return x

@overload
def f(x: str) -> str:
    return x

And then use it not just as a type stub, but also to implement the dispatch. I tested a bit in VS Code and was surprised it showed the overloaded signatures. One downside is that the signature for overload is the one from typing.overload.

machow commented 1 year ago

@wch gave some really helpful feedback on this approach. In python 3.11, overload sets the wrapped function in a registry. In this case, I wonder if plum could use typing.get_overloads() to get all the original functions back...

from typing import overload
from plum import dispatch

@overload
def f(x: int) -> int:
    return x

@overload
def f(x: str) -> str:
    return x

# here, dispatch could use typing.get_overloads to retrieve the functions above
@dispatch
def f(x):
    raise NotImplementedError()

Edit: for context, it looks like registering functions in overload was added to allow help() to show overloads.
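
A small sketch of the registry behaviour described above, assuming Python 3.11+ (with a fallback to typing_extensions, which also ships overload and get_overloads). It only shows that the functions decorated with @overload can be retrieved again; how a dispatcher would use them is a separate question.

```python
import inspect

try:
    from typing import get_overloads, overload  # Python 3.11+
except ImportError:  # older Pythons: assumes typing_extensions is installed
    from typing_extensions import get_overloads, overload


@overload
def f(x: int) -> int: ...


@overload
def f(x: str) -> str: ...


def f(x):
    return x


# get_overloads looks the function up by module and qualified name and
# returns the original @overload-decorated functions in definition order.
signatures = [str(inspect.signature(o)) for o in get_overloads(f)]
```

Note that the two overload functions must come from the same module (typing or typing_extensions) as the get_overloads that is used to look them up.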

wesselb commented 1 year ago

@machow and @wch, ahhhh that's a super interesting approach. How does the below snippet work for you in VS Code? mypy seems to be happy with it on my side!

from typing import Callable, TypeVar, get_overloads, overload

from plum import Function

T = TypeVar("T", bound=Callable)

def dispatch(f: T) -> T:
    f_plum = Function(f)
    for method in get_overloads(f):
        f_plum.dispatch(method)
    return f_plum

@overload
def f(x: int) -> int:
    return x + 1

@overload
def f(x: str) -> str:
    return x

@dispatch
def f(x):
    raise NotImplementedError

print(f(1))
print(f("hey"))
# print(f(1.0))  # Wrong type!

gabrieldemarmiesse commented 1 year ago

This is some galaxy-size brain stuff right here. I like that a lot. I'll experiment with that on some real-world project :)

wesselb commented 1 year ago

@gabrieldemarmiesse Let us know how that works for you!!

gabrieldemarmiesse commented 1 year ago

I tried it and I am very impressed. PyCharm understands the input and output types. It can't jump directly to the right overload, but that's not much of an issue. The only small problem is that we need Python 3.11 to use this. I wonder if there is a backport of get_overloads() somewhere in typing_extensions.

I don't even need to write the type hints for the function that raises NotImplementedError. So the UX is great!

I wonder if this pattern can become part of plum? It's a game changer!

wesselb commented 1 year ago

@gabrieldemarmiesse It seems that typing_extensions actually does provide a get_overloads (but which requires the user to use typing_extensions.overload), so that might just work!

In particular, I think I've managed to add plum.overload such that the following works across all Python versions:

from plum.overload import dispatch, overload

@overload
def f(x: int) -> int:
    return x + 1

@overload
def f(x: str) -> str:
    return x

@dispatch
def f(x):
    raise NotImplementedError

print(f(1))
print(f("hey"))

Re: "I wonder if this pattern can become part of plum? It's a game changer!"

I think so!! I've long been looking for a way to write multiple dispatch that plays nice with type checkers such as mypy, and this appears to be the first successful attempt.

Although this pattern goes a long way, I think it will be challenging to accommodate the more dynamic use cases of multiple dispatch. For example, a very common way is to import a dispatched function from another file and to extend it with another method. I've played around a bit, but I've not yet been able to make mypy happy in such a scenario. But perhaps this might be possible with a bit more work...

gabrieldemarmiesse commented 1 year ago

I think we can go there progressively. Document and implement what works now, and we can always support more use cases later on

machow commented 1 year ago

I'm working on a couple tools using plum (e.g. https://github.com/machow/quartodoc, which implements a visitor using @dispatch), and they benefit a lot from type hints being picked up by static tools, so can experiment with this more!

Re: "I think it will be challenging to accommodate the more dynamic use cases of multiple dispatch. For example, a very common way is to import a dispatched function from another file and to extend it with another method."

AFAICT the only way guaranteed across static tools is to add the overload stubs for cases registered in other packages back into where you define the generic function. I think you should be able to do this without actually importing the other packages (using e.g. typing.TYPE_CHECKING).
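
A rough sketch of that idea, using only the standard library. The registry, the register decorator and the function names are hypothetical, and fractions.Fraction merely stands in for a type owned by another package. The overload stub for the external type sits next to the generic function, while the import it needs is guarded by TYPE_CHECKING so that nothing is imported at runtime.

```python
from typing import TYPE_CHECKING, Any, Callable, Dict, overload

if TYPE_CHECKING:
    # Seen only by static tools; stands in for a type from another package.
    from fractions import Fraction

_registry: Dict[type, Callable[[Any], str]] = {}  # hypothetical toy registry


def register(cls: type):
    """Hypothetical helper: register an implementation for one type."""
    def decorator(func):
        _registry[cls] = func
        return func
    return decorator


@overload
def describe(x: int) -> str: ...
@overload
def describe(x: "Fraction") -> str: ...  # stub for the externally registered case
def describe(x: Any) -> str:
    for cls in type(x).__mro__:
        if cls in _registry:
            return _registry[cls](x)
    raise NotImplementedError(type(x).__name__)


@register(int)
def _describe_int(x: int) -> str:
    return f"int {x}"


def plugin_setup() -> None:
    """Simulates the other package registering its method when it is imported."""
    from fractions import Fraction

    @register(Fraction)
    def _describe_fraction(x: Fraction) -> str:
        return f"fraction {x}"


plugin_setup()
```

The obvious cost is that the stub list has to be kept in sync by hand with whatever the other packages register.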

I think there's a relevant issue on improving the mypy extension for functools.singledispatch, but the main concern there is that you can override registrations (e.g. use @my_func.register(int) in multiple places; see this comment). Another issue is that, afaik, there isn't really a similar workaround for pyright.

I'm a big fan of this pattern though, so would love to find a way to make it work across packages 😭

wesselb commented 1 year ago

@gabrieldemarmiesse Alright, that sounds reasonable! Let's start out with a module plum.overload, which would work like in the previous example.

@machow, quartodoc looks super neat!! If you think quartodoc would be a good place to try this pattern, to see if it would play nice with a linter, then that would be a fantastic experiment.

Thanks for linking the mypy thread. The concerns there are very reasonable and I largely agree. I suspect that if one wants mypy and other type checkers to truly play nice with multiple dispatch, then that would require rather fundamental changes to these programs. In this light, it is good to identify and collect patterns that are supported, such as the one by @machow and @wch.

wesselb commented 1 year ago

I've taken a stab at adding limited type checking support in #93.

githubpsyche commented 1 year ago

Hi, I'm here because I noticed the new page in the documentation. It's more than a little mysterious on first and second read what the overload decorator is doing in the provided example, where specialized code should go, and what limitations there might be to the pattern. For example, does the overload pattern work for multi-argument functions?

I feel like a fuller code example or fuller explanation of how plum.overload works would help a lot with accessibility.
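
On the multi-argument question, here is a toy sketch of the pattern with two arguments. It does not use plum: toy_dispatch is a hypothetical stand-in built on get_overloads (Python 3.11+, or typing_extensions), and it simply takes the first overload whose annotations match, which is simpler than real multiple dispatch. The specialised code goes in the bodies of the @overload functions, and the final undecorated signature only serves as the fallback.

```python
import inspect

try:
    from typing import get_overloads, overload  # Python 3.11+
except ImportError:  # older Pythons: assumes typing_extensions is installed
    from typing_extensions import get_overloads, overload


def toy_dispatch(fallback):
    """Hypothetical stand-in for a dispatch decorator (not plum's code)."""
    methods = []
    for method in get_overloads(fallback):
        params = inspect.signature(method).parameters.values()
        methods.append((tuple(p.annotation for p in params), method))

    def dispatcher(*args):
        # First-match resolution: a simplification of real multiple dispatch.
        for types, method in methods:
            if len(types) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, types)
            ):
                return method(*args)
        return fallback(*args)

    return dispatcher


@overload
def combine(x: int, y: int) -> int:
    return x + y  # specialised code lives in the overload body


@overload
def combine(x: str, y: int) -> str:
    return x * y


@toy_dispatch
def combine(x, y):
    raise NotImplementedError("no overload matches these argument types")


results = [combine(1, 2), combine("ab", 2)]
```

A call such as combine(1.0, 2) matches neither overload and falls through to the NotImplementedError in the final definition.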

wesselb commented 1 year ago

Hey @githubpsyche! Thanks for the feedback. It would certainly be possible to elaborate on how the overload pattern works and to expand the code example to multiple arguments. I’ll soon put something together. :)

cjalmeida commented 1 year ago

Hi @wesselb, first, thanks very much for plum! A few years ago I had the urge to write my own configuration library, and as plum was in its early stages, I ended up writing my own multiple-dispatch functionality heavily inspired by plum. Recently, I replaced my home-baked solution with plum quite effortlessly.

Now back to the topic: since supporting multiple dispatch touches something so fundamental in how type checkers work, don't you think it's worth proposing a PEP? In Python 3.11, a number of changes to PEP-484 were made, so as long as it's quite self-contained (e.g. not proposing full multiple-dispatch support, just adding more flexibility to @overload semantics), it should be palatable to core devs, justify upstream changes to mypy/pyright, and avoid a number of hacks.

As for rationale (besides plum):

Major scientific libraries such as numpy and PyTorch already hack their way into implementing multiple-dispatch and spend time maintaining @overloads definitions. Stdlib already has functools.singledispatch but with very poor developer ergonomics. The PEP would propose allowing libraries to more cleanly implement dispatch semantics without having to maintain unneeded @overload typesheds.

This would allow at least a cleaner and mypy-compliant use of functools.singledispatch (pending implementation changes). Using the same example from the stdlib docs:

from functools import singledispatch
from typing import Union

@singledispatch
def fun(arg, verbose=False):
    if verbose:
        print("Let me just say,", end=" ")
    print(arg)

@singledispatch
def fun(arg: int, verbose=False):
    if verbose:
        print("Strength in numbers, eh?", end=" ")
    print(arg)

@singledispatch
def fun(arg: list, verbose=False):
    if verbose:
        print("Enumerate this:")
    for i, elem in enumerate(arg):
        print(i, elem)

@singledispatch
def fun(arg: int | float, verbose=False):
    if verbose:
        print("Strength in numbers, eh?", end=" ")
    print(arg)

@singledispatch
def fun(arg: Union[list, set], verbose=False):
    if verbose:
        print("Enumerate this:")
    for i, elem in enumerate(arg):
        print(i, elem)
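
For comparison, this is roughly how the first three definitions look with functools.singledispatch as it works today, where each additional method is attached through fun.register rather than by repeating @singledispatch. The functions return strings instead of printing, purely so the result is easy to check.

```python
from functools import singledispatch


@singledispatch
def fun(arg, verbose=False):
    prefix = "Let me just say, " if verbose else ""
    return f"{prefix}{arg}"


@fun.register
def _(arg: int, verbose=False):
    # The dispatch type is taken from the annotation of the first argument.
    prefix = "Strength in numbers, eh? " if verbose else ""
    return f"{prefix}{arg}"


@fun.register
def _(arg: list, verbose=False):
    prefix = "Enumerate this: " if verbose else ""
    return prefix + ", ".join(f"{i} {elem}" for i, elem in enumerate(arg))
```

The registered methods carry the throwaway name _ and are reached only through fun, so their signatures stay tucked away behind the registry.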

As for implementation, similar to typing.get_overloads, we could have a set_overload(...) that @overload-replacing decorators could call to register a new overload for type checkers.

gabrieldemarmiesse commented 1 year ago

I made a draft for a PEP :) It's very early stages, I'm looking for feedback! PEP: https://github.com/gabrieldemarmiesse/PEP-draft-multiple-dispatcher/tree/master Discussion: https://discuss.python.org/t/multiple-dispatch-based-on-typing-overload/26197/1

wesselb commented 1 year ago

@githubpsyche I've elaborated in the docs and added a more complete example. I hope things are more clear now! :)

wesselb commented 1 year ago

@cjalmeida It's super nice to hear that you managed to swap in Plum without too much trouble! :) Also very cool that you're working on gamma-config. I think configuration is a difficult problem that's far from being solved, so any innovation in that space is very welcome.

I think a PEP would be incredibly cool. @gabrieldemarmiesse, your proposal is very interesting. Perhaps it is worthwhile to list the current problems, to find consensus on what a PEP could address.

Off the top of my head, in no particular order, some problems with the current state of type checking and multiple dispatch are the following:

  1. For overload-based multiple dispatch implementations, overload methods are not intended to have an implementation. It would be a simple change to allow overload methods to have an implementation, but I'm wondering if that is a change that people would be willing to accept. An alternative would be to add typing.dispatch, which would work like typing.overload but with the semantics that typing.dispatch methods should have an implementation.

  2. For overload-based multiple dispatch implementations, it is slightly troublesome that first all overloads need to come and then the function needs to be implemented. For example, you cannot add additional overloads after the "implementation":

@overload
def f(x: int) -> int:
    return x

def f(x):
    ... # The implementation

@overload
def f(x: float) -> float:  # This is not allowed, but we would really like to do so...
    return x

  3. Related to the above point, I'm not sure that there currently is a mypy-compliant way to import a function from another file and "extend it with additional overloads". I think this is a very necessary capability, but also drastically increases the complexity, because now one needs to ensure that the type checker is aware of all relevant overloads defined in other files and perhaps even other packages.

  4. Perhaps the biggest problem is that, to make type checking work, the type checker will need to implement multiple dispatch. Here's an example:

@overload
def add(x: int, y: Number) -> Number:
    return x + y

@overload
def add(x: Number, y: int) -> Number:
    return x + y

@overload
def add(x: int, y: int) -> int:
    return x + y

add(1, 2)

mypy will determine that all three overloads are possible. Ideally, we'd like mypy to know that the (int, int) -> int method is correct, but obviously this is the mechanism of multiple dispatch. The problem is that, by not being able to narrow down the list of possible methods using the principle of multiple dispatch, the return type will be the union of all possible return types, and that will likely be too broad to be useful. In the above example, the return type will hence be int | Number == Number, which is broader than necessary, because we'd like the return type to be int.
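
As a rough illustration of the resolution step a type checker would need, the sketch below picks the most specific matching signature for the three add methods above. It works on runtime values with isinstance and issubclass, and the names SIGNATURES and resolve are hypothetical; a type checker would have to do the equivalent on static types.

```python
from numbers import Number

# The three signatures from the example above, as tuples of argument types.
SIGNATURES = [(int, Number), (Number, int), (int, int)]


def matches(signature, args):
    return len(signature) == len(args) and all(
        isinstance(a, t) for a, t in zip(args, signature)
    )


def at_least_as_specific(sig_a, sig_b):
    # sig_a is at least as specific as sig_b if every type in it is a subclass
    # of the corresponding type in sig_b.
    return all(issubclass(a, b) for a, b in zip(sig_a, sig_b))


def resolve(args, signatures=SIGNATURES):
    candidates = [s for s in signatures if matches(s, args)]
    best = [
        s for s in candidates
        if all(at_least_as_specific(s, other) for other in candidates)
    ]
    if len(best) != 1:
        raise LookupError(f"no unique most specific method for {args!r}")
    return best[0]


chosen = resolve((1, 2))      # all three match; (int, int) is the most specific
partial = resolve((1, 2.5))   # only (int, Number) matches
```

Without the (int, int) method, the call add(1, 2) would be ambiguous between the first two signatures, which the sketch reports as a LookupError.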

Reflecting on all these points, I come to two conclusions:

  1. Should we try to coerce overload into doing multiple dispatch? Could it be simpler to propose a new overload-like decorator, e.g. typing.dispatch, with relaxed semantics?

  2. For type checking to be truly useful, mypy will need to do multiple dispatch. I unfortunately can't see a way around this.

I'd be curious to hear your thoughts on the above points, @cjalmeida and @gabrieldemarmiesse.

gabrieldemarmiesse commented 1 year ago

@wesselb that is awesome feedback! I'll work on it and clarify the PEP. Could you post (copy) your message about the PEP here https://discuss.python.org/t/multiple-dispatch-based-on-typing-overload/26197 if you don't mind? When making a PEP, the normal process is to discuss it in discuss.python.org. I don't want to spread the discussion across multiple forums :)

wesselb commented 1 year ago

@gabrieldemarmiesse definitely! I've posted my message on the thread. Thanks :)