python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.2k stars 2.78k forks source link

Honor return type of `__new__` even if not a subclass #15182

Open oscarbenjamin opened 1 year ago

oscarbenjamin commented 1 year ago

This follows gh-1020 which was closed by gh-7188. This arises because I am adding type hints to sympy (https://github.com/sympy/sympy/pull/25103).

In the following code we have a subclass whose __new__ method is only guaranteed to return objects that are types of the superclass:

from __future__ import annotations

class A:
    def __new__(cls) -> A:
        return super().__new__(cls)

class B(A):
    def __new__(cls) -> A:
        return super().__new__(cls)

reveal_type(B())

With mypy this is rejected and the type of B() is inferred incorrectly but pyright accepts this and gets the inference for B() correct:

$ mypy q.py
q.py:8: error: Incompatible return type for "__new__" (returns "A", but must return a subtype of "B")  [misc]
q.py:11: note: Revealed type is "q.B"
Found 1 error in 1 file (checked 1 source file)
$ pyright q.py
...
  ./q.py:11:13 - information: Type of "B()" is "A"
0 errors, 0 warnings, 1 information 
Completed in 0.955sec

The fix for __new__ in gh-7188 was:

The issue here is about mixing the last two points. Here mypy produces an error but then does not respect __new__'s return type. The return type is not respected since mypy infers that it is of type B unlike pyright which infers type A as specified by the hint. I don't mind mypy reporting an error here but in context -> A is the accurate type hint and mypy should respect that because anything else is just not correct. I can add a type ignore to silence the mypy error but the inference will still be wrong in all downstream/user code and is also inconsistent with pyright.

oscarbenjamin commented 1 year ago

This is handled here where there is an explicit check to ignore the type given in __new__ if it is not a subtype of the class and also in a bunch of other cases: https://github.com/python/mypy/blob/13f35ad0915e70c2c299e2eb308968c86117132d/mypy/typeops.py#L184-L196 I'm not sure where all of those conditions come from but I don't see how any of them can override the hint given in __new__ because __new__ can return anything (this is generally the reason for using __new__ rather than __init__).

Viicos commented 1 year ago

Hi @oscarbenjamin, this seems to be a duplicate of https://github.com/python/mypy/issues/8330. I've opened https://github.com/python/mypy/pull/14471 since then but I'm kind of stuck on it. Hopefully if someone with the required knowledge could see what's wrong in the PR we could get it merged one day!

oscarbenjamin commented 1 year ago

this seems to be a duplicate of #8330

Yes, I saw that issue. I thought this was different but now that I've seen the code that causes this I can see that it causes both issues.

To be honest having looked at the code I don't really understand what any of it is doing. It doesn't bear any resemblance to Python's runtime semantics for type.__call__, __new__ and __init__. For example the code here chooses between taking a hint from __new__ or __init__ based on which comes earlier in the MRO: https://github.com/python/mypy/blob/13f35ad0915e70c2c299e2eb308968c86117132d/mypy/checkmember.py#L1198-L1199 That makes no sense to me:

I would expect the inference to work something like this. We have a class:

class A(B, metaclass=Meta):
    def __new__(cls, *args, **kwargs):
        ...
    def __init__(self, *args, **kwargs):
       ....

When we call A(*args, **kwargs) it will return Meta.__call__(A, *args, **kwargs) so we should determine what type that would return. By default Meta is type whose call method looks like:

class type:
    def __call__(A, *args, **kwargs):
        # Use __new__ to create the object
        if A.__new__ is object.__new__:
            obj = object.__new__(A)
        else:
            obj = A.__new__(A, *args, **kwargs)

        # Call __init__ with the created object (only if is an instance of A):
        if isinstance(obj, A):
            if type(obj).__init__ is not object.__init__:
                obj.__init__(*args, **kwargs)

        # Return the object obtained from __new__
        return obj

The final piece to know is that object.__new__(A) returns an object of type A. The call to A.__new__(A, ...) needs further analysis but usually it will eventually come down to object.__new__(A) (via calls like B.__new__(A) along the way).

We see here that the type of the object returned is just the return type of A.__new__(). If A.__new__() returns an object that is not an instance of A then A.__init__() will never be called meaning any hints in A.__init__() are irrelevant. Even if A.__init__() is called it does not change the type of obj (unless it reassigns obj.__class__ but that is very obscure).

NeilGirdhar commented 9 months ago

Respecting the metaclass' __call__ return type is required to type check nnx, which makes extensive use of this feature here.

Viicos commented 9 months ago

Respecting the metaclass' __call__ return type is required to type check nnx, which makes extensive use of this feature here.

As a workaround, I'm using the __new__ method instead (see https://github.com/python/mypy/pull/16020#issuecomment-1761244557, where I'm hitting the same use case as you)

jorenham commented 5 months ago

This currently makes it impossible to correctly type builtins.reversed. See https://github.com/python/typeshed/issues/11645 and https://github.com/python/typeshed/pull/11646

jorenham commented 5 months ago

Like @oscarbenjamin noted, this fails in the contravariant case, i.e. when __new__ returns an instance of its supertype.

But when __new__ is annotated to return something else entirely, mypy will simply ignore it, even if annotated explicitly:

class Other:
    def __new__(cls, arg: T, /) -> T:  # error: "__new__" must return a class instance (got "T")  [misc]
        return arg

assert_type(Other(42), 'int')  # error: Expression is of type "Other", not "int"  [assert-type]

For details, see https://mypy-play.net/?mypy=latest&python=3.12&flags=strict&gist=336602d390b5e6566e9fca93d7fa48a6

Viicos commented 5 months ago

The PR fixing this can be found here: https://github.com/python/mypy/pull/16020

jorenham commented 2 months ago

@Viicos #16020 does not fix this issue. In the PR description it explicit states:

We avoid fixing https://github.com/python/mypy/issues/15182 for now.

jorenham commented 2 months ago

The official typing specs now include a section on the __new__ method:

For most classes, the return type for the __new__ method is typically Self, but other types are also allowed. For example, the __new__ method may return an instance of a subclass or an instance of some completely unrelated class.

So this confirms that this is, in fact, a bug (and IMO a rather big one).

oscarbenjamin commented 2 months ago

The relevant parts of the code have not changed since I linked them above. Within mypy __new__ is treated as an alternate version of __init__. There is no analysis of the sequence I described above (which matches the updated typing spec):

Having seen the code it is not surprising that other examples don't work as well (gh-14122):

class A:
    pass

class Meta(type):
    def __call__(self, *args, **kwargs) -> A:
        return A()

class B(metaclass=Meta):
    pass

print(type(B())) # A
reveal_type(B()) # B
Viicos commented 2 months ago

@Viicos #16020 does not fix this issue. In the PR description it explicit states:

We avoid fixing #15182 for now.

Looking at the (updated) test cases, feels like it does? In particular:

https://github.com/python/mypy/blob/cdccffa7c009988a3f1677328965fa35e066889b/test-data/unit/check-classes.test#L6925-L6931

While a type ignore comment is needed, int is not a subclass of A and it type checks as expected.

jorenham commented 2 months ago

@Viicos Ah good, that's better than nothing I guess. But it still doesn't constitute a full fix, because that error is a false negative. I also wonder if it'll still "work" when __new__ is overloadaed.

oscarbenjamin commented 2 months ago

16020 does not fix this issue.

Looking at the (updated) test cases, feels like it does?

Maybe in some cases but likely not others. As noted in the comments the code uses a "hack" to make this work for some cases. It begins by checking is_new which I believe is already incorrect because __new__ and __init__ are searched in a "combined MRO" as noted above.

I would expect that it still fails for other cases like this:

from __future__ import annotations

class A:
    def __new__(cls) -> A:
        return super().__new__(cls)

class B(A):
    def __new__(cls) -> A:
        return super().__new__(cls)

class C(B):
    def __init__(self): # possibly use self:C here.
        pass

# C.__init__ is earlier in the "combined MRO" than B.__new__

reveal_type(B()) # should be A
reveal_type(C()) # should also be A

I think the overall logic in type_object_type and other places needs a rethink from first principles to be able to handle all cases. The presence of the is_new variable implies a confusion that __new__ and __init__ are somehow interchangeable so anything derived from it is likely incorrect for some cases.