Improve help() by making typing.overload() information accessible at runtime

rhettinger commented 3 years ago

BPO	45100
Nosy	@gvanrossum, @rhettinger, @ronaldoussoren, @JelleZijlstra, @TeamSpen210, @sobolevn, @Fidget-Spinner, @AlexWaygood, @DiddiLeija
PRs	python/cpython#31716

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['3.11']
title = 'Improve help() by making typing.overload() information accessible at runtime'
updated_at =
user = 'https://github.com/rhettinger'
```

bugs.python.org fields:

```python
activity =
actor = 'gvanrossum'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'rhettinger'
dependencies = []
files = []
hgrepos = []
issue_num = 45100
keywords = ['patch']
message_count = 14.0
messages = ['401052', '401074', '401080', '401094', '401097', '413677', '413678', '413684', '413685', '413695', '413751', '414561', '414566', '416226']
nosy_count = 9.0
nosy_names = ['gvanrossum', 'rhettinger', 'ronaldoussoren', 'JelleZijlstra', 'Spencer Brown', 'sobolevn', 'kj', 'AlexWaygood', 'DiddiLeija']
pr_nums = ['31716']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue45100'
versions = ['Python 3.11']
```

rhettinger commented 3 years ago

Python's help() function does not display overloaded function signatures.

For example, this code:

    from typing import Union, overload

    class Smudge(str):

        @overload
        def __getitem__(self, index: int) -> str:
            ...

        @overload
        def __getitem__(self, index: slice) -> 'Smudge':
            ...

        def __getitem__(self, index: Union[int, slice]) -> Union[str, 'Smudge']:
            'Return a smudged character or characters.' 
            if isinstance(index, slice):
                start, stop, step = index.indices(len(self))
                values = [self[i] for i in range(start, stop, step)]
                return Smudge(''.join(values))
            c = super().__getitem__(index)
            return chr(ord(c) ^ 1)

Currently gives this help:

__getitem__(self, index: Union[int, slice]) -> Union[str, ForwardRef('Smudge')]
    Return a smudged character or characters.

What is desired is:

__getitem__(self, index: int) -> str
__getitem__(self, index: slice) -> ForwardRef('Smudge')
    Return a smudged character or characters.

The overload() decorator is sufficient for informing a static type checker but insufficient for informing a user or editing tool.

ronaldoussoren commented 3 years ago

I agree that this would be nice to have, but I wonder how help() could access that information. The two @overload definitions will be overwritten by the non-overload one at runtime, and hence will never be seen by help().
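A minimal sketch of the rebinding described above. The function name `double` is made up for illustration; the point is only that each `def` rebinds the same name, so an introspection tool that starts from the final function object sees just the broad signature.

```python
from typing import Union, overload


@overload
def double(x: int) -> int: ...


@overload
def double(x: str) -> str: ...


def double(x: Union[int, str]) -> Union[int, str]:
    """Return x multiplied (or repeated) by two."""
    return x * 2


# Only the last definition is reachable through the name `double`,
# so its annotations are the Union-based ones, not the overloads.
remaining = double.__annotations__
```

Anything that wants the narrower signatures therefore has to find them somewhere other than the function object's own annotations.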

AlexWaygood commented 3 years ago

There is a similar issue with functools.singledispatch:

>>> from functools import singledispatch
>>> @singledispatch
... def flip(x: str) -> int:
...     """Signature when given a string"""
...     return int(x)
... 
>>> @flip.register
... def _(x: int) -> str:
...     """Signature when given an int"""
...     return str(x)
... 
>>> flip(5)
'5'
>>> flip('5')
5
>>> help(flip)
Help on function flip in module __main__:
flip(x: str) -> int
    Signature when given a string
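For singledispatch the per-type implementations are not lost: they stay reachable through the documented `registry` attribute. The sketch below shows how a tool might list their signatures; the helper name `variant_signatures` is hypothetical and not part of functools or pydoc.

```python
import inspect
from functools import singledispatch


@singledispatch
def flip(x: str) -> int:
    """Signature when given a string"""
    return int(x)


@flip.register
def _(x: int) -> str:
    """Signature when given an int"""
    return str(x)


def variant_signatures(func):
    """Return one inspect.Signature per registered implementation."""
    # `registry` maps each dispatch type to the function registered for it;
    # the default implementation is stored under `object`.
    return [inspect.signature(impl) for impl in func.registry.values()]


signatures = [str(sig) for sig in variant_signatures(flip)]
```

This only works because singledispatch keeps its own registry; plain @overload had no equivalent at the time of this comment.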

rhettinger commented 3 years ago

> The two @overload definitions will be overwritten by the non-overload one at runtime, and hence will never be seen by help().

We can fix this by adding an __overloads__ attribute. The overload decorator can accumulate the chain in an external namespace and function creation can move that accumulation into the new attribute.

----- Proof of concept -----

from typing import Union, _overload_dummy

def create_function(func):
    namespace = func.__globals__
    key = f'__overload__{func.__qualname__}__'
    func.__overloads__ = namespace.pop(key, [])
    return func

def overload(func):
    namespace = func.__globals__
    key = f'__overload__{func.__qualname__}__'
    namespace[key] = func.__overloads__ + [func.__annotations__]
    return _overload_dummy

class Smudge(str):

    @overload
    @create_function
    def __getitem__(self, index: int) -> str:
        ...

    @overload
    @create_function
    def __getitem__(self, index: slice) -> 'Smudge':
        ...

    @create_function
    def __getitem__(self, index: Union[int, slice]) -> Union[str, 'Smudge']:
        'Return a smudged character or characters.' 
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            values = [self[i] for i in range(start, stop, step)]
            return Smudge(''.join(values))
        c = super().__getitem__(index)
        return chr(ord(c) ^ 1)

    @create_function
    def other_method(self, x:str) -> tuple:
        pass

print(f'{Smudge.__getitem__.__annotations__=}')
print(f'{Smudge.__getitem__.__overloads__=}')
print(f'{Smudge.other_method.__annotations__=}') 
print(f'{Smudge.other_method.__overloads__=}')

rhettinger commented 3 years ago

Note, I'm not proposing a create_function() decorator. That is just for the proof of concept. The actual logic would go into normal function creation, the same place that __annotations__ gets added.

Also, there may be a better place than func.__globals__ to accumulate the overloads. For the proof-of-concept, it was just the easiest way to go.

JelleZijlstra commented 2 years ago

I made a similar suggestion in bpo-46821 (thanks Alex for pointing me to this older issue):

Currently, the implementation of @overload (https://github.com/python/cpython/blob/59585d6b2ea50d7bc3a9b336da5bde61367f527c/Lib/typing.py#L2211) simply returns a dummy function and throws away the decorated function. This makes it virtually impossible for type checkers using the runtime function object to find overloads specified at runtime.

In pyanalyze, I worked around this by providing a custom @overload decorator, working something like this:

from typing import Callable, _overload_dummy

_overloads: dict[str, list[Callable]] = {}

def _get_key(func: Callable) -> str:
    return f"{func.__module__}.{func.__qualname__}"

def overload(func):
    key = _get_key(func)
    _overloads.setdefault(key, []).append(func)
    return _overload_dummy

def get_overloads_for(func):
    key = _get_key(func)
    return _overloads.get(key, [])

A full implementation will need more error handling.

I'd like to add something like this to typing.py so that other tools can also use this information.

---

With my suggested solution, help() would need to call typing.get_overloads_for() to get any overloads for the function. Unlike Raymond's suggestion, we would not need to change the function creation machinery.
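As a rough sketch of how a help()-like tool could consume such a registry: the registry functions below mirror the snippet above, while `signature_lines`, `scale`, and the local `_overload_dummy` stand-in are hypothetical names used only for illustration.

```python
import inspect
from typing import Callable

_overloads: dict[str, list[Callable]] = {}


def _overload_dummy(*args, **kwargs):
    # Stand-in for typing's private dummy; overload stubs are never called.
    raise NotImplementedError("overload stubs are not meant to be called")


def _get_key(func: Callable) -> str:
    return f"{func.__module__}.{func.__qualname__}"


def overload(func):
    _overloads.setdefault(_get_key(func), []).append(func)
    return _overload_dummy


def get_overloads_for(func):
    return _overloads.get(_get_key(func), [])


def signature_lines(func) -> list[str]:
    """One line per overload, or the plain signature if there are none."""
    variants = get_overloads_for(func) or [func]
    return [f"{func.__name__}{inspect.signature(v)}" for v in variants]


@overload
def scale(x: int) -> int: ...


@overload
def scale(x: str) -> str: ...


def scale(x):
    return x * 2
```

Calling `signature_lines(scale)` yields one entry per registered overload, which is the shape of output requested at the top of this issue.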

gvanrossum commented 2 years ago

Sounds good to me. (I don’t care what happens at runtime but I want to support the folks who do.)

8620fd7d-208a-4c5d-884a-ece73071c881 commented 2 years ago

I'm not sure a get_overloads() function potentially called after the fact would fully work - there's the tricky case of nested functions, where the overload list would need to be somehow cleared to ensure every instantiation doesn't endlessly append to the same list. It's probably also desirable to weakref it (or make it an attribute) so they can be decrefed if the function isn't being used.

JelleZijlstra commented 2 years ago

I'm OK with not fully supporting overloads created in nested functions; that's a pretty marginal use case. But it's true that my proposed implementation would create a memory leak if someone does do that. I don't immediately see a way to fix that with weakrefs. Maybe we need to put something in the defining namespace, as Raymond suggested.

8620fd7d-208a-4c5d-884a-ece73071c881 commented 2 years ago

Had a potential thought. Since the only situation we care about is overload being used on function definitions in lexical order, the only valid calls are those on definitions with ascending co_firstlineno values. Expanding on Jelle's solution, the overload() decorator could compare the current function's line number to that of the first function in the list, and if it's <= that, clear out the list (we're re-defining). Then repeated re-definitions wouldn't duplicate overloads.

The other change I'd suggest is to make get_overloads_for() first check __overloads__, and only if that is not present pop from the _overloads dict and assign the result to that attribute. That way, if code calls get_overloads_for() at least once, the function will refer to the actual overloads created at the same time. The overloads would then also be garbage collected when the function dies. It also means you could manually assign to the attribute to add overloads to any callable.

AlexWaygood commented 2 years ago

I'd dearly like better introspection tools for functions decorated with @overload, but I'd rather have a solution where:

- inspect.signature doesn't have to import typing. That doesn't feel worth it for users who aren't using typing.overload, but inspect.signature would have to import typing whether or not @overload was being used, in order to *check* whether @overload was being used.
- The solution could be reused by, and generalised to, other kinds of functions that have multiple signatures.

If we create an __overloads__ dunder that stores the signatures of multi-signature functions, as Raymond suggests, inspect.signature could check that dunder to see whether the function has multiple signatures, and change its representation of the function accordingly. This kind of solution could be easily reused by other parts of the stdlib, like @functools.singledispatch, and by third-party packages such as plum-dispatch, multipledispatch, and Nikita's dry-python/classes library.

So, while it would undoubtedly be more complex to implement, I much prefer Raymond's suggested solution.

JelleZijlstra commented 2 years ago

We could make my proposed overload registry more reusable by putting it in a different module, probably functools. (Another candidate is inspect, but inspect.py imports functools.py, so that would make it difficult to use the registry for functools.singledispatch.)

We could then bill it as a "variant registry", with an API like this:

def register_variant(key: str, variant: Callable) -> None: ...
def get_variants(key: str) -> list[Callable]: ...
def get_key_for_callable(callable: Callable) -> str | None: ...

@overload could then call register_variant() to register each overload, and code that wants a list of overloads (pydoc, inspect.signature, runtime type checkers) could call get_variants().

get_key_for_callable() essentially does f"{callable.__qualname__}.{callable.__name__}", but returns None for objects it can't handle. It will also support at least classmethods and staticmethods.
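One possible shape for such a registry, as a sketch only. The three function names come from the API outline above; everything else is an assumption, including the key format, which here follows the module-plus-qualname scheme of the earlier pyanalyze snippet, and the use of `__func__` to unwrap classmethods, staticmethods, and bound methods.

```python
from typing import Callable, Optional

_variants: dict[str, list[Callable]] = {}


def get_key_for_callable(obj) -> Optional[str]:
    """Return a registry key for obj, or None if it cannot be keyed."""
    # classmethod, staticmethod, and bound method objects expose the
    # underlying function as __func__; plain functions are used as-is.
    func = getattr(obj, "__func__", obj)
    qualname = getattr(func, "__qualname__", None)
    if not isinstance(qualname, str):
        return None
    return f"{getattr(func, '__module__', None)}.{qualname}"


def register_variant(key: str, variant: Callable) -> None:
    _variants.setdefault(key, []).append(variant)


def get_variants(key: str) -> list[Callable]:
    # Return a copy so callers cannot mutate the registry by accident.
    return list(_variants.get(key, []))


class Box:
    @staticmethod
    def make(x: int) -> "Box": ...


raw_staticmethod = Box.__dict__["make"]
key = get_key_for_callable(raw_staticmethod)
register_variant(key, raw_staticmethod.__func__)
```

Keying by string keeps the registry independent of typing, which is what would let functools.singledispatch share it.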

I will prepare a PR implementing this idea.

AlexWaygood commented 2 years ago

The latest plan sounds good to me. I have some Thoughts on the proposed API, but it will be easier to express those as part of a PR review. Looking forward to seeing the PR!

gvanrossum commented 2 years ago

Looks like there may be a new plan where we solve a smaller problem (overloads) in the context of typing only.

AlexWaygood commented 2 years ago

After #31716 is merged, I'd like to have a stab at writing a PR to have overloads shown in the output of help(). For now, it will probably be easiest to do that without tinkering with inspect.signature(), and instead only making changes to pydoc -- I have a few ideas of how to do that.

AlexWaygood commented 2 years ago

I was just playing around with the new get_overloads function, and it appears that it doesn't work for functions defined in the interactive shell. This seems unfortunate -- it certainly took me by surprise.

C:\Users\alexw\coding\cpython>python
Running Debug|x64 interpreter...
Python 3.11.0a7+ (main, Apr 14 2022, 10:41:31) [MSC v.1931 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import *
>>> @overload
... def foo(arg: int) -> str: ...
...
>>> @overload
... def foo(arg: str) -> int: ...
...
>>> def foo(arg: int | str) -> str | int: ...
...
>>> get_overloads(foo)
[<function foo at 0x000001D030B0C9F0>]
>>> import inspect
>>> inspect.signature(_[0])
<Signature (arg: str) -> int>

AlexWaygood commented 2 years ago

This is because all functions defined in the interactive shell have the same line number set on their code object, and the new overloads registry uses the co_firstlineno attribute of the function's code object to register overloads.

>>> def foo(): ...
...
>>> def bar(): ...
...
>>> foo.__code__.co_firstlineno == bar.__code__.co_firstlineno == 1
True

I initially thought that this would cause problems for using get_overloads to improve the output of help(). Maybe it's not so much of a problem, though, since most people won't be calling help() on functions defined in the interactive shell -- they'll be calling it on functions that they import from within the interactive shell. Moreover, using @overload for a function defined in the interactive shell is pretty weird, since a type checker isn't much use to you in the interactive shell.

JelleZijlstra commented 2 years ago

It's a little unfortunate because it makes it harder for people to experiment with get_overloads() in the REPL. But I agree that it's not a big problem for people using help() to introspect library functions.

gvanrossum commented 2 years ago

Since @overload is a static typing feature, why would you be typing it in the REPL in the first place?

JelleZijlstra commented 2 years ago

> Since @overload is a static typing feature, why would you be typing it in the REPL in the first place?

If you read about get_overloads() in the What's New and want to try it out for yourself, you might write some overloads in the REPL to see it in action. Looks like that's essentially what happened to Alex.

AlexWaygood commented 2 years ago

> Since @overload is a static typing feature, why would you be typing it in the REPL in the first place?

Yeah, this is why I think it might actually not be that much of a big deal, as I said above.

> If you read about get_overloads() in the What's New and want to try it out for yourself, you might write some overloads in the REPL to see it in action. Looks like that's essentially what happened to Alex.

Yup. Experimenting with stuff in the REPL is generally a part of my process when I write patches for CPython. I was tinkering with modifying pydoc to use get_overloads, and was trying things out in the REPL. I was surprised when things didn't work as expected, and it took me a minute or two to figure out that the reason for the unexpected behaviour was that I was in the interactive shell.

furkanonder commented 1 year ago

@JelleZijlstra The issue seems to be solved. We can close the issue.

JelleZijlstra commented 1 year ago

We added typing.get_overloads, but I don't think we ended up making help() look at this information yet.

Improve help() by making typing.overload() information accessible at runtime #89263