@overload outside stubs

pludemann commented 9 years ago

PEP 484 says that @overload is only allowed in stubs.

But if I have a function like this:

    def plus1(x):
        return x + 1

then the most precise type is an overload:

    def plus1(x: int) -> int: ...
    def plus1(x: float) -> float: ...
    def plus1(x: complex) -> complex: ...

This particular case could, of course, be handled by

    Number = TypeVar('Number', int, float, complex)
    def plus1(x: Number) -> Number: ...

But I can easily construct other examples that can't be solved by type variables.

I don't think that @overload requires any additional dispatch machinery in Python (despite what the PEP says) -- it merely describes what the function can do in a more precise way than this:

    def plus1(x: Union[int, float, complex]) -> Union[int, float, complex]: ...

gvanrossum commented 9 years ago

I don't understand your claim. Can you show me exactly what the implementation of plus1() should look like? Where do I put the @overload?

pludemann commented 9 years ago

So the problem with @overload outside of stubs is purely one of syntax? (Which I admit looks to be a rather nasty problem if the goal is to stay within the current Python3 syntax)

(I've fixed the markup ... github's markdown docs don't tell the truth)

gvanrossum commented 9 years ago

Yes, the problem is exactly to stay within the current Python3 syntax.

(You can preview the markup easily by clicking on the Preview tab.)

pludemann commented 9 years ago

How about this?

    @overload
    def foo(x: int, y: int) -> int: pass

    @overload
    def foo(x: str, y: str) -> str: pass

    def foo(x, y):  # The actual definition
        return x + y

This allows as many @overload as desired; the type checker should insist that the actual definition exists and has no type annotations. There's a slight cost of the stubs, but probably acceptable. Or the stubs could be put into a separate .pyi if load time is a concern.
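A minimal, runnable sketch of this layout, assuming the `overload` decorator from the `typing` module as it behaves in Python releases that accept `@overload` outside stubs. `foo` is the illustrative function from the example above, with `...` bodies used in place of `pass`:

```python
from typing import overload


@overload
def foo(x: int, y: int) -> int: ...
@overload
def foo(x: str, y: str) -> str: ...


def foo(x, y):
    # The actual definition: the only version that runs when foo() is called.
    return x + y


print(foo(1, 2))      # 3
print(foo("a", "b"))  # ab
```

The overload variants carry only signatures; the final unannotated definition rebinds the name and is what callers actually reach.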

This technique can also be applied to any stub annotation, so for example (if we want to support exceptions):

    def bar(x: MyType) -> AnotherType:
        raise Union[SomeError, AnotherError]

    def bar(x):  # The actual definition
       ....

To avoid mistakes, we might want to have a @stub decorator for stubs that aren't overloaded.

gvanrossum commented 9 years ago

Well, if we allow @overload in the actual implementation module, it should let you put the implementation for each overloaded version in that @overload. Not only is this the way overloading works in all other languages that have it, but this also often leads to clearer code if the signatures are not so similar, e.g.

    class bytes:
        @overload
        def __getitem__(self, i: int) -> int:
            ...  # return byte at index i
        @overload
        def __getitem__(self, s: slice) -> bytes:
            ...  # return a slice of bytes directed by s

But making this work with the given syntax (@overload) is hard and would require using sys._getframe(). So instead of holding up the PEP while we figure out how to implement that or what a more easily implementable syntax for it would be, I'm proposing to allow @overload only in stub files.

Is there any particular reason you want @overload to work outside stub files?

pludemann commented 9 years ago

You seem to be describing @overload as a multi-dispatch whereas I'm thinking of it only as a type annotation. I have no problem with multi-dispatch, but I think that's out-of-scope for PEP 484.

My only point is that it's trivial to define an overload decorator that allows putting all the annotations with the definition, rather than requiring that @overload annotations go into a separate .pyi "stub" file but non-@overload annotations can go into either the .py file or the .pyi file. (This would be just a "no-op" decorator, which the type-checker could treat specially — no need for sys._getframe() craziness because the type-checker is working with the AST anyway.)
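As a rough sketch of what such a no-op decorator could look like (this local `overload` is hypothetical and is not the one shipped in `typing`; `plus1` is the example from the start of the thread):

```python
def overload(func):
    # Hypothetical no-op decorator: returns the function unchanged.
    # A type checker would read the annotations from the AST instead.
    return func


@overload
def plus1(x: int) -> int: ...
@overload
def plus1(x: float) -> float: ...


def plus1(x):
    # Later definitions rebind the name, so this one wins at runtime.
    return x + 1


print(plus1(1))    # 2
print(plus1(1.5))  # 2.5
```

With this design the decorator has no runtime role at all; the annotated variants exist only so that a static tool can see them.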

It's a matter of taste: do we want a separate "header" file (like C++ .h and .cc); or do we want everything together (like Java)? It seems that the "modern" tendency is to minimize the number of files and to not use separate header files, relying on other tools (like javadoc) to create nice documentation from the extracted information — and that's feedback I've received from some people (how representative this feedback is, I don't know). Does this break or enhance TOOWTDI? — we've already decided that we'll allow stub files, so I think it enhances TOOWTDI by not enforcing what might seem to be an arbitrary rule that applies only to @overload and instead reserving stub files for situations where the original .py file cannot be modified with type annotations ... and we might want to modify PEP8 to say that type annotations should go in the .py file by preference.

I'm not proposing that we get rid of stub files (they have legitimate reasons for existing); just pointing out that there's no reason to require @overload to go into only stub files (unless we want to preserve @overload for some future PEP that allows multiple dispatch, such as Guido's proposal for @multimethod).

gvanrossum commented 9 years ago

I think you hit the nail on the head. For inline use I would like to have a single multiple-dispatch mechanism that also acts as an overloaded type. But coming up with a good design and implementation for that is out of scope (Łukasz wants to work on it for a separate PEP). In the meantime having @overload in stubs is fine, because there it means only an overloaded type; in implementation files it would be confusing to have it as a type notation, because of expectations users might have. Also note that your particular example can be written without @overload, using TypeVar('X', int, str). My observation is that more sophisticated uses of overloading (that you cannot rewrite like that) are more common for builtins than for user-defined code, which is why having @overload in stubs is more important than having it in user code. But by the time 3.6 rolls along I assume we will have a solid proposal for multi-dispatch that can also be understood by type checkers.
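For reference, the constrained type variable form of the earlier `foo` example could look roughly like this (a sketch only; the name `X` is taken from the comment above):

```python
from typing import TypeVar

# X may be bound to int or to str, but to the same one for both arguments.
X = TypeVar("X", int, str)


def foo(x: X, y: X) -> X:
    return x + y


print(foo(1, 2))      # 3
print(foo("a", "b"))  # ab
```

A type checker should accept both calls shown and reject a mixed call such as `foo(1, "b")`, which is the same precision the two overloads gave.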

pludemann commented 9 years ago

SGTM

I hope to have some answers in the next few months about what kinds of signatures we see in ordinary production code (outside the core libraries). 3.6 time-frame seems reasonable for the next iteration of the PEP.

JukkaL commented 9 years ago

I've considered the approach proposed by @pludemann (using @overload to declare an overloaded signature, with a separate function implementation), and even though inelegant, it would probably be useful at least occasionally. I think TypeScript uses a fairly similar approach.

I don't have enough data to estimate how often this would be useful in production code. All the instances of overloading in mypy codebase (from the time when overloading implied multiple dispatch) were easy to refactor to use union types or multiple functions (with different names) instead. We can always fall back to Any types if a function signature is too complex to represent otherwise.

o11c commented 9 years ago

__getitem__ is an illustrative example here.

I'm currently working around it by using a .pyi file with the correct overloads and a .py file that actually dispatches to _get_item and _get_slice methods, and I use a C++-style CRTP to refactor that out to the base class in a separate module. But this is really ugly, and Python isn't supposed to be.

I'm perfectly happy writing the dispatcher function myself. A rule of "a series of @overload functions must be followed by one non-@overload function" would suffice.
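A sketch of that rule applied to `__getitem__`, assuming `typing.overload` is usable outside stubs. The class `ByteBox` is hypothetical and exists only for illustration:

```python
from typing import overload


class ByteBox:
    """Hypothetical container, used only to illustrate the rule."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    @overload
    def __getitem__(self, i: int) -> int: ...
    @overload
    def __getitem__(self, s: slice) -> "ByteBox": ...

    def __getitem__(self, key):
        # Hand-written dispatcher: the one non-@overload definition.
        if isinstance(key, slice):
            return ByteBox(self._data[key])
        if isinstance(key, int):
            return self._data[key]
        raise TypeError("index must be int or slice")


box = ByteBox(b"abc")
print(box[0])         # 97
print(box[1:]._data)  # b'bc'
```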

I do think that, in the long run (Python 4?), generics and overloaded functions are important enough language features that they should have their own syntax. But we're not there yet.

gvanrossum commented 9 years ago

The rule "a series of @overload functions must be followed by one non-@overload function" looks reasonable. @JukkaL would that be easy to implement in mypy? We could rig the runtime typing.py such that if you manage to forget the non-@overload, calling it (which will call the last @overload variant) will always fail with a clear error (probably TypeError):

    import functools

    def overload(func):
        @functools.wraps(func)
        def wrapper(*args, **kwds):
            raise TypeError("Called an overloaded function/method without a non-@overload fallback")
        return wrapper
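To illustrate the intended runtime behaviour, here is a small self-contained sketch built around the same decorator idea. The functions `forgot` and `ok` are hypothetical names introduced only for this demonstration:

```python
import functools


def overload(func):
    # Same idea as the snippet above: the decorated variant is replaced
    # by a wrapper that always fails when called.
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        raise TypeError("Called an overloaded function/method "
                        "without a non-@overload fallback")
    return wrapper


@overload
def forgot(x: int) -> int: ...  # no fallback follows, so calling it fails


@overload
def ok(x: int) -> int: ...
def ok(x):  # the fallback rebinds the name
    return x + 1


try:
    forgot(1)
except TypeError as exc:
    print("forgot:", exc)

print(ok(1))  # 2
```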

JukkaL commented 9 years ago

I don't think that it would be difficult to implement in mypy -- maybe at most a day's work.

gvanrossum commented 9 years ago

I've thought this over and I don't think it's worth it. We need to wait for a working proposal for multi-dispatch first. Otherwise we'll just end up having to support this interim syntax and whatever the new multi-dispatch is. Keeping @overload restricted to stub files makes it much more tractable.

gvanrossum commented 8 years ago

This came up again in the context of Tornado. Reopening.

gvanrossum commented 8 years ago

So I am now thinking that we should implement this proposal. The Tornado utf8() function could look like this:

    @overload
    def utf8(value: None) -> None: ...
    @overload
    def utf8(value: bytes) -> bytes: ...
    @overload
    def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
    def utf8(value):
        # Real implementation goes here.
        ...

At runtime the @overload decorator would no longer raise an exception when the decorator is run; instead it would return a dummy function that runs when the decorated function is called. I propose this:

    def overload(func):
        def overload_dummy(*args, **kwds):
            raise NotImplemented("You should not call an overloaded function. "
                                 "A series of @overload-decorated functions "
                                 "outside a stub module should always be followed "
                                 "by an implementation that is not @overloaded.")
        return overload_dummy

UPDATE: That should be NotImplementedError.

gvanrossum commented 8 years ago

BTW that proposal would require a change to PEP 484 (that's possible, it's provisional) and a change to typing.py in Python 3.5.2 (that's also possible, but we may need to act somewhat quickly).

gvanrossum commented 8 years ago

I've posted a link to this issue to python-ideas, so hopefully we can move quickly if it's uncontroversial, or we'll get a better proposal soon.

Carreau commented 8 years ago

Just +1 (and nitpicking, but I think you meant NotImplementedError if it goes into PEP 484).

ncoghlan commented 8 years ago

Given the prior discussion regarding function annotations in Python 2/3 compatible code, my proposed near term workaround for this problem was to allow multiple comments in that style in order to indicate signature overloads inline:

    def utf8(value):
        # type: (None) -> None
        # type: (bytes) -> bytes
        # type: (unicode) -> bytes
        ...

gvanrossum commented 8 years ago

I appreciate that you're trying to avoid fixing a syntax that we might want to change again in the future. However your proposal also has to be changed in the future. So from the POV of early adopters of whatever syntax we end up agreeing on here there's little difference -- either way they will eventually have to rewrite it.

But I still really don't like that it requires you to use the Python-2 style "fallback annotations" (in comments) even when using Python 3, nor that it's so different from the PEP 484 syntax for stubs. In the case of Tornado's utf8(), it would stick out like a sore thumb because Ben's plan there is to release a version that has inline annotations (for use with PY3 only).

As to the confusion between @overload and @functools.singledispatch, I think that both are pretty esoteric, and people will just have to look up working examples rather than trying to guess their purpose from just looking at the name.
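To make the distinction concrete: `functools.singledispatch` picks an implementation at runtime from the type of the first argument, whereas `@overload` as discussed here only describes signatures. A small sketch, in which `describe` is a hypothetical function:

```python
import functools


@functools.singledispatch
def describe(value):
    # Default implementation, used when no registered type matches.
    return "object"


@describe.register(int)
def _(value):
    return "int"


@describe.register(str)
def _(value):
    return "str"


print(describe(1))    # int
print(describe("a"))  # str
print(describe(1.5))  # object
```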

gvanrossum commented 8 years ago

I forgot one more thing. @overload can also be used with signatures of different length. Take this constructor for built-in range() that I found in typeshed/3/stdlib/builtins.pyi:

    @overload
    def __init__(self, stop: int) -> None: ...
    @overload
    def __init__(self, start: int, stop: int, step: int = 1) -> None: ...

Using my proposal we could move this to a .py file and add an implementation, like this:

    def __init__(self, *args):
        if len(args) == 1: ...
        elif len(args) == 2: ...
        elif len(args) == 3: ...
        else: raise TypeError(...)
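Filled in as a runnable sketch, this could look as follows. `MyRange` is a hypothetical stand-in for `range()`, and the attribute names and error message are assumptions made for the example:

```python
from typing import overload


class MyRange:
    """Hypothetical stand-in for range(), for illustration only."""

    @overload
    def __init__(self, stop: int) -> None: ...
    @overload
    def __init__(self, start: int, stop: int, step: int = 1) -> None: ...

    def __init__(self, *args):
        # One implementation handles every overloaded arity.
        if len(args) == 1:
            self.start, self.stop, self.step = 0, args[0], 1
        elif len(args) == 2:
            self.start, self.stop = args
            self.step = 1
        elif len(args) == 3:
            self.start, self.stop, self.step = args
        else:
            raise TypeError("expected 1 to 3 arguments, got %d" % len(args))


r = MyRange(1, 10, 2)
print(r.start, r.stop, r.step)  # 1 10 2
```

This shows only the runtime side; a type checker may additionally require the implementation signature to be compatible with every overload, for example when arguments are passed by keyword.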

If we were to rewrite this using your proposal, we'd lose two things:

- The arg names (giving useful hints about their meanings in various overloads) are gone from the signatures.
- The type checker cannot complain if the signature in a comment doesn't match the arguments in the 'def' line (or at least it can't when there are overloads).

Anyway, while I totally agree that my proposal isn't ideal (note that this issue was closed once before without action and the PEP specifically forbids overloading in .py files), I disagree that your proposal is better.

JukkaL commented 8 years ago

For a long time we weren't even sure whether this would be a useful feature to have, so the elegance and compactness of the syntax isn't as important as intuitiveness and ease of use -- most users would use this only rarely, if ever. If we used almost the same syntax in .py files and .pyi files, usability would likely be better than with a newly invented syntax, as there would be less to learn and remember.

gvanrossum commented 8 years ago

Also, @overload can use keyword args to pick the right signature, e.g. (extreme example):

    @overload
    def foo(*, a: int) -> int: ...
    @overload
    def foo(*, b: int) -> int: ...
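One possible implementation behind such keyword-only overloads, as a sketch. The body is hypothetical: it simply returns whichever argument was supplied and rejects every other combination:

```python
from typing import Optional, overload


@overload
def foo(*, a: int) -> int: ...
@overload
def foo(*, b: int) -> int: ...


def foo(*, a: Optional[int] = None, b: Optional[int] = None) -> int:
    # Hypothetical implementation: exactly one of a or b must be given.
    if a is not None and b is None:
        return a
    if b is not None and a is None:
        return b
    raise TypeError("pass exactly one of 'a' or 'b'")


print(foo(a=1))  # 1
print(foo(b=2))  # 2
```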

ncoghlan commented 8 years ago

The variable signature + keyword arg examples are persuasive, so I'm happy to withdraw the idea of using the comment based fallback notation.

gvanrossum commented 8 years ago

OK, let's do it!

bdarnell commented 8 years ago

So in py2-compatible mode (using comments), it would look like this?

    @overload
    def utf8(value):
        # type: (None) -> None
        pass
    @overload
    def utf8(value):
        # type: (bytes) -> bytes
        pass
    @overload
    def utf8(value):
        # type: (unicode_type) -> bytes
        pass
    def utf8(value):
        ...  # real implementation goes here

That's clunky, but it works and I can live with it for the few cases where this comes up.

One concern I have with this change to @overload is that if I start using it outside of stubs, my package will no longer work on Python 3.5.0; it will require a newer release in the 3.5 series (and PyPI metadata leaves me with no effective way to communicate this to users). That's an unfortunate compatibility break for a feature that is not supposed to have a runtime effect. Could the overload declarations be put in an if block so the type checker can see them but they don't actually get executed at runtime? There used to be a MYPY variable but I don't see a more generic replacement for it in PEP 484.

gvanrossum commented 8 years ago

Yes, that's what it would look like.

The 3.5.0 (and 3.5.1) problem is inherent to provisional features. But yes, it is unfortunate. AFAIK mypy doesn't mind if you have something inside "if False", it will still type-check it.
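A sketch of that guard, showing only the runtime side. Whether a given type checker accepts overloads placed inside such a block is an assumption that rests on the remark above, and the `utf8()` body is a simplified stand-in, not Tornado's actual code:

```python
if False:
    # Never executed, so nothing here needs to work at runtime.
    from typing import overload

    @overload
    def utf8(value: None) -> None: ...
    @overload
    def utf8(value: bytes) -> bytes: ...
    @overload
    def utf8(value: str) -> bytes: ...


def utf8(value):
    # Simplified stand-in implementation (an assumption for this sketch).
    if value is None or isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    raise TypeError("Expected bytes, str, or None; got %r" % type(value))


print(utf8("abc"))  # b'abc'
```

Because the guarded block is skipped entirely, the module imports cleanly even where the runtime `@overload` would misbehave outside a stub.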

(Anyway, we haven't implemented this yet. But I'm pretty sure this is what it will look like -- the python-dev/ideas discussion seems to have settled.)

gvanrossum commented 8 years ago

If you like this we should (again) push the discussion on python-dev.

gvanrossum commented 8 years ago

I've updated the PEP (both here and in the peps repo) to describe this. The typing.py module in this repo also supports it, but I haven't pushed it to PyPI (there's a queue of things to do still). mypy doesn't support it yet, we'll track that at https://github.com/python/mypy/issues/1136 .

python / typing

@overload outside stubs #72