python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.18k stars 2.77k forks source link

Respecting return type of __new__ does not extend to derived classes #8330

Open ikonst opened 4 years ago

ikonst commented 4 years ago

Following #7188, we can now have:

from typing import overload
from typing_extensions import Literal

class Cat:
    @overload
    def __new__(cls, name: Literal['Tabby']) -> 'Tabby': ...
    @overload
    def __new__(cls, name: Literal['Kitty']) -> 'Kitty': ...

    def __new__(cls, name: str) -> 'Cat': ...

class Kitty(Cat):
    pass

class Tabby(Cat):
    pass

reveal_type(Cat('Kitty'))
reveal_type(Cat('Tabby'))

result in:

Revealed type is 'test.Kitty'
Revealed type is 'test.Tabby'

(Meant as a trivial example.)

However, this doesn't extend to derived classes:

reveal_type(Tabby('Kitty'))
reveal_type(Tabby('Tabby'))

would result in:

Revealed type is 'test.Tabby'
Revealed type is 'test.Tabby'

I'm using mypy 0.761.

ilevkivskyi commented 4 years ago

This is actually intentional, because doing the opposite (as it was initially implemented) caused even more false positives.

I am not sure what to do here, cc @msullivan

ikonst commented 4 years ago

Curious, point to discussion / false positives?

sbdchd commented 3 years ago

I think I'm hitting this issue when trying to type an overloaded __new__ in a base class

My attempt overloading __new__ to match up with the untyped source:

from __future__ import annotations
from typing import Any, Dict, List, Type, TypeVar, overload
from typing_extensions import Literal

T = TypeVar("T", bound="MiniBaseSerializer")

class MiniBaseSerializer:
    @overload
    def __new__(cls, *, many: Literal[True]) -> MiniListSerializer:
        ...

    @overload
    def __new__(cls: Type[T], *, many: Literal[False] = ...) -> T:
        ...

    def __new__(cls: Type[T], many: bool = ...) -> Any:
        ...

class MiniListSerializer(MiniBaseSerializer):
    @property
    def validated_data(self) -> List[Dict[str, object]]:
        ...

class MiniSerializer(MiniBaseSerializer):
    @property
    def validated_data(self) -> Dict[str, object]:
        ...

class ExampleSerializer(MiniSerializer):
    # define some fields
    ...

example_many = ExampleSerializer(many=True)
reveal_type(example_many)
# pyright: Type of "example_many" is "MiniListSerializer"
# mypy: Revealed type is 'test_foo.ExampleSerializer'

example_single = ExampleSerializer()
reveal_type(example_single)
# pyright: Type of "example_many" is "MiniListSerializer"
# mypy: Revealed type is 'test_foo.ExampleSerializer*'

It works in pyright, but mypy forgets about the __new__ in the base class

I then tried overloading __init__ and adding a generic, which sort of works, but would require any users to provide a generic type param and it is also kind of lying about what is happening since the actual source code is returning a different type by overloading __new__

from __future__ import annotations
from typing import Dict, Generic, List, TypeVar, overload
from typing_extensions import Literal

T = TypeVar("T")

class MiniBaseSerializer(Generic[T]):
    @overload
    def __init__(self: MiniBaseSerializer[List[Dict[str, object]]], many: Literal[True]) -> None:
        ...

    @overload
    def __init__(self: MiniBaseSerializer[Dict[str, object]], many: Literal[False] = ...) -> None:
        ...

    def __init__(self, many: bool = False) -> None:
        ...

    @property
    def validated_data(self: MiniBaseSerializer[T]) -> T:
        ...

class MiniSerializer(Generic[T], MiniBaseSerializer[T]):
    ...

class ExampleSerializer(MiniSerializer[object]):
    ...

example_many = MiniBaseSerializer(many=True)
reveal_type(example_many)
# pyright - Type of "example_many" is "MiniBaseSerializer[List[Dict[str, object]]]"
# mypy - Revealed type is 'test_bar.MiniBaseSerializer[builtins.list[builtins.dict[builtins.str, builtins.object]]]'

example_single = MiniBaseSerializer()
reveal_type(example_single)
# pyright - Type of "example_single" is "MiniBaseSerializer[Dict[str, object]]"
# mypy - Revealed type is 'test_bar.MiniBaseSerializer[builtins.dict[builtins.str, builtins.object]]'

In search the issues, I also found https://github.com/python/mypy/issues/9482 which seemed related

Edit: Additionally, in the linked PR, there is:

https://github.com/python/mypy/blob/30efa34961e44675ccc405cb36b31745ef309660/test-data/unit/check-classes.test#L6276-L6288

which has the subclass maintaining the __new__ defined on the superclass

Viicos commented 1 year ago

As stated in https://github.com/typeddjango/djangorestframework-stubs/pull/315, django-rest-framework is a good example that would benefit from the requested behavior.

I'm curious to hear about the false positives that this caused

intgr commented 1 year ago

Finally found the PR and issue that introduced this restriction:

Since then, subprocess.Popen in typeshed no longer defines __new__ which solves the concrete example in #7597 AFAICT.

Also with PEP 673 Self type freshly merged in #11871, it should be easier to declare __new__ methods that correctly apply to subclasses as def __new__(...) -> Self, removing another reason for this restriction.

Returning non-subclasses from __new__() is a hack for sure, but as explained in https://github.com/python/mypy/issues/8330#issuecomment-876872608 and https://github.com/python/mypy/issues/14426 there are real-world libraries like Django REST Framework which are using this hack. And this restriction prevents accurately type hinting this library.

ikonst commented 1 year ago

Thanks for the thorough code archeology, @intgr!

I suppose we can try adjusting it once more and seeing what breaks in mypy-primer etc.

Viicos commented 1 year ago

I've tried reverting this specific check (https://github.com/Viicos/mypy/commit/016abc352090515ab28dafdb44f4be92c0c3fcac), but according to the mypy_primer output, this seems to break a lot of things (my fix is probably wrong)

intgr commented 1 year ago

I've tried reverting this specific check

Maybe you need to replace it with logic that retains the old behavior if __new__ return type was Any (e.g. untyped).

Where can I see the mypy_primer results?

Viicos commented 1 year ago

I ran mypy_primer locally, but I'll make the PR so that we can have the output

DougLeonard commented 1 year ago

Returning non-subclasses from \_\_new\_\_() is a hack for sure, but as explained in #8330 (comment) and #14426 there are real-world libraries like Django REST Framework which are using this hack. And this restriction prevents accurately type hinting this library.

Should it even matter if something is a hack or matter if it's been proven to be useful? Wouldn't that be a discussion for the language and/or typing standards (or possibly linter) more than for the implementation of the reference type checker? It seems to be true that arbitrary return values from __new__ are legal python, and intentionally so.

Viicos commented 1 year ago

Should it even matter if something is a hack or matter if it's been proven to be useful? Wouldn't that be a discussion for the language and/or typing standards (or possibly linter) more than for the implementation of the reference type checker? It seems to be true that arbitrary return values from new are legal python, and intentionally so.

This is the same idea as stated here: https://github.com/python/mypy/issues/1020#issuecomment-1318657466, and I do agree. I haven't got time to get back to the PR, but I'll hope it will be merged when ready in the future

DougLeonard commented 1 year ago

quoting intgr from PR thread #14471:

Please make sure that if the __new__ return is untyped (returns Any) then it uses the class type implicitly. Not sure if that is already covered, but I suspect not.

I'm sure I'm not as familiar with realities, but specs are desirable. From PEP484:

Any function without annotations should be treated as having the most general type possible, or ignored, by any type checker. "

How do you get from that to restricting an explicit user-provided __new__ function to a particular type by default?

Neither mypy nor pyright are perfect, but an example:


from typing import Any

class Wolf():
    def __new__(cls):
        return object.__new__(cls)

    def howl(self):
        print("howling")

class Sheep():
    def __new__(cls) -> Any:  
        return object.__new__(Wolf)

    def bleat(self):
        print("baaah")

bigbad = Wolf()
bigbad.howl()

shifty = Sheep()
shifty.howl()

Pyright accepts the above, with (case 1) or without(case 2) the ->Any , but mypy wrongly complains about shifty howling. And really, either seems to be legal python in spec and in implementation. As I understand, intgr's proposal would reject them (edit:) and would require legal code to have explicit type annotations to pass mypy Adding a typed parameter to __new__ changes nothing in both cases.

The rest may be beyond the PR, but strongly relate:

shifty.bleat()

Case 3: Still, fine for Pyright unless you remove the Any (Case 4). Then Pyright sees the object.new(Wolf) and knows shifty cannot bleat and complains. PEP484:

For a checked function, the default annotation for arguments and for the return type is Any.

Intgr also implies that -> Any and no annotation should be treated the same, so it seems Pyright misses the spec, at least in the case where there's a typed parameter guaranteeing that it's a "checked function." What about type narrowing with no type hints at all. With no return type hints, Pyright type narrows (deduces) Sheep.__new() to Wolf. Is narrowing part of "checking"? Is it legal to do type narrowing inside an un-"checked" function? If it's possible to determine with certainty that a runtime type error will definitely__ occur, should a checker ever be implicitly required to ignore that to be standard compliant? The first quote says "most general type OR ignored." A general type can be narrowed by other information.

On the other hand, when a value has type Any (default), the type checker will allow all operations on it,

Sure but can the Any return type be narrowed before/when returning?

Another example:

class mouse():
    pass

jerry = mouse()
mouse.howl()

(Case 5) Pyright and mypy both complain this time. But mouse() is now guaranteed to be a mouse, and and we have no wrapper around object.new('T') in which to ignore checking (or maybe narrowing).

How about something more challenging?

class MayBeSheep():
    def __new__(cls):
        a = input('^')
        if a == "wolf":
            return object.__new__(Wolf)
        else:
            return object.__new__(cls)

    def bleat(self):
        print("baaah")

supershifty = MayBeSheep()
supershifty.howl()
supershifty.bleat()

(Case 6) Supershifty certainly cannot both bleat and howl. No matter what, this a type error. Mypy as always just ignores our new and only complains about howl(). It would be happy if we only tried to bleat. Pyright again behaves differently with no return type (case 6) than with -> Any. With no return hint, (edit: it seems to assume a union type) and complains about both bleating and howling, referencing the respective problematic type for each. With -> Any (Case 7) it relents and gives no error. Should the first access be accepted but treated as narrowing, only creating an error on the conflicting one? Would it be different if the conditional used a random bool, or a flip flop instead of an input? Inputs can be constrained. How smart must the checker be and what assumptions can/should it make?

I find it hard to read the PEP's and have any expectations, especially correct ones, about what mypy may or may not do. Does the PEP need to be improved?

Viicos commented 1 year ago

@DougLeonard you are raising a lot of interesting questions here, and regarding the return object.__new__(Wolf) from the Sheep class, I'm wondering if mypy can infer that the return type is Wolf, as pyright is currently doing. We can't really know right now, as Wolf is not a subclass of Sheep hence why mypy ignores it (and this is what the PR is trying to fix).

Regarding the last example, maybe the type checker could infer a union of both types? My knowledge is limited here, but I can say that narrowing is part of type checking, and I'm still seeing some PRs implementing new kinds of type narrowing.

DougLeonard commented 1 year ago

@Viicos, Right but pyright is inferring Wolf even when there are no type annotations at all. Should it? You said narrowing is part of checking. I'm not sure if it's right for me to call inferring/deducing Wolf as a kind of narrowing on the implicit -> Any, but the spec says we should not do type checking on un-annotated functions, and it seems they should just be assumed to return Any. Personally, I'd rather see that it is always at least allowable to apply deductions that are guaranteed to be true, and it seems always a desirable feature. If the spec presently says that such deductions/narrowing are actually illegal to do on un-annotated functions, then maybe the spec should be changed. But I'm not sure I'd say (from a naive reading) that the wording must mean that no checking means no deducing. Anyway, presently mypy is of course deducing anyway, but is deducing wrongly.

For the last example pyright seems to infer a union (when ->Any isn't explicit), as it does complain about each of the types respectively. Yes, I think that's pretty good.

erictraut commented 1 year ago

For details about areas where pyright intentionally differs from mypy, you may find this documentation interesting.

DougLeonard commented 1 year ago

Thanks. I've seen it. And of course pyright can do whatever pyright wants to do. I believe mypy is meant to follow spec, but then neither really do.

ikonst commented 1 year ago

@erictraut, a very exhaustive list. Can recognize some mypy pet peeves right there. BTW,

S in D and S not in D (where S is a string literal and D is a final TypedDict)

should no longer be the case since #13838 😃