python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.46k stars 2.83k forks source link

Implicit type aliases not recognised with PEP 604 + import cycles #11887

Open AlexWaygood opened 2 years ago

AlexWaygood commented 2 years ago

This bug was originally surfaced by https://github.com/python/typeshed/pull/6612. It is hard to tell if it is the only bug that was revealed by that PR, or if there are other PEP 604-related bugs in mypy.


Top-line summary

If a single line in typeshed/stdlib/_typeshed/__init__.pyi is changed from using Union to PEP 604-style syntax, a large number of errors are output by mypy:

--- a/stdlib/_typeshed/__init__.pyi
+++ b/stdlib/_typeshed/__init__.pyi
@@ -189,7 +189,7 @@ ReadOnlyBuffer = bytes  # stable
 # for it. Instead we have to list the most common stdlib buffer classes in a Union.
 WriteableBuffer = Union[bytearray, memoryview, array.array[Any], mmap.mmap, ctypes._CData]  # stable
 # Same as _WriteableBuffer, but also includes read-only buffer types (like bytes).
-ReadableBuffer = Union[ReadOnlyBuffer, WriteableBuffer]  # stable
+ReadableBuffer = ReadOnlyBuffer | WriteableBuffer  # stable
Mypy errors ``` stdlib\io.pyi:44: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\io.pyi:44: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\io.pyi:54: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\io.pyi:54: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\io.pyi:61: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\io.pyi:61: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\io.pyi:74: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\io.pyi:74: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\io.pyi:104: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\io.pyi:104: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\mmap.pyi:52: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\mmap.pyi:52: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\mmap.pyi:53: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\mmap.pyi:53: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\mmap.pyi:55: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\mmap.pyi:55: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\mmap.pyi:64: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\mmap.pyi:64: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\ctypes\__init__.pyi:89: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\ctypes\__init__.pyi:89: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\builtins.pyi:672: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\builtins.pyi:672: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:9: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:9: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:10: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:10: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:11: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:11: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:23: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:23: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:24: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:24: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\struct.pyi:25: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\struct.pyi:25: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\lzma.pyi:90: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\lzma.pyi:90: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hmac.pyi:19: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hmac.pyi:19: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hmac.pyi:30: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hmac.pyi:30: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hmac.pyi:31: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hmac.pyi:31: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hmac.pyi:37: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hmac.pyi:37: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hmac.pyi:42: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hmac.pyi:42: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:12: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:12: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:16: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:16: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:19: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:19: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:20: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:20: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:21: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:21: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:22: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:22: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:23: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:23: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:24: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:24: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:25: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:25: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:49: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:49: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:56: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:56: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:60: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:60: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:70: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:70: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:72: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:72: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:89: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:89: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:92: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:92: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:93: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:93: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\hashlib.pyi:94: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\hashlib.pyi:94: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\_socket.pyi:14: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\_socket.pyi:14: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\_socket.pyi:560: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\_socket.pyi:560: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\_socket.pyi:561: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\_socket.pyi:561: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\_socket.pyi:563: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\_socket.pyi:563: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\_socket.pyi:565: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\_socket.pyi:565: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\socket.pyi:603: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\socket.pyi:603: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\gzip.pyi:136: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\gzip.pyi:136: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\bz2.pyi:127: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\bz2.pyi:127: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\bz2.pyi:128: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\bz2.pyi:128: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\ssl.pyi:343: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\ssl.pyi:343: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\ssl.pyi:344: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\ssl.pyi:344: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\ssl.pyi:346: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\ssl.pyi:346: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases stdlib\ssl.pyi:348: error: Variable "_typeshed.ReadableBuffer" is not valid as a type stdlib\ssl.pyi:348: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases ```

Investigation

This bug is hard to pin down. I have not managed to reproduce it outside of the typeshed repo.

I have managed to reproduce the bug in a branch of typeshed in which the vast majority of the stdlib has been deleted. This indicates to me that the bug has something to do with an interaction between builtins, _typeshed and mmap. In this "slimmed-down" version of typeshed, I see the following errors from mypy:

stdlib\builtins.pyi:677: error: Name "ReadableBuffer" is not defined
stdlib\io.pyi:5: error: Variable "_typeshed.ReadableBuffer" is not valid as a type
stdlib\io.pyi:5: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases

Notes on the bug

The bug can be reproduced if ReadableBuffer is defined like this in _typeshed:

import mmap
WriteableBuffer = mmap.mmap
ReadableBuffer = bytes | WriteableBuffer

However...

(1) It cannot be reproduced if ReadableBuffer is defined like this in typeshed:

import mmap
ReadableBuffer = bytes | mmap.mmap

(2) Nor can it be reproduced if ReadableBuffer is defined like this in typeshed:

from mmap import mmap
WriteableBuffer = mmap
ReadableBuffer = bytes | WriteableBuffer

(3) If the definition of ReadableBuffer is moved to builtins, the errors in builtins and io that you see in "slimmed-down typeshed" continue to be reported by mypy.

(4) The errors in the io module that you see in "slimmed-down typeshed" do not exist if io defines its own version of ReadableBuffer, instead of importing the definition from _typeshed or builtins.


Expected behaviour

No error should be reported; the semantics of typing.Union and PEP 604 union-type expressions should be identical.


mypy version

0.930. Reproduced on many Python versions by the typeshed CI; local testing done on Python 3.10.


To reproduce

  1. Either clone the typeshed master branch, and apply the above diff manually, or clone my "slimmed-down typeshed branch".
  2. cd into your clone of the typeshed repo.
  3. Run mypy stdlib/abc.pyi --custom-typeshed-dir . (the precise file you run mypy on is irrelevant).

Cc. @sobolevn, @JukkaL, @Akuli

hauntsaninja commented 2 years ago

I think you miiiiight be able to use PEP 613 as a workaround. Also note, your slimmed down branch is a little broken, ReadableBuffer does seem actually undefined.

hauntsaninja commented 2 years ago

Import cycles involving builtins can do some scary things, here's a bug I ran into: https://github.com/python/mypy/issues/11535

hauntsaninja commented 2 years ago

I did some tracing of the code involved.

Looking at semanal's can_be_type_alias (called by check_and_set_up_type_alias, called by visit_assignment_stmt). https://github.com/python/mypy/blob/1beed9af48bed07033490f8cf2d4e08229450c5f/mypy/semanal.py#L2190

If we encounter ReadableBuffer = Union[...], we have rv is a IndexExpr, we go into is_type_ref and typing.Union is included in valid_refs = type_constructors so we return True and all is well. We don't even look at the contents of the Union.

If we encounter ReadableBuffer = ReadOnlyBuffer | WriteableBuffer we check to see if both sides of "|" can be a type alias.

This check succeeds for the left side ('NameExpr(ReadOnlyBuffer [_typeshed.ReadOnlyBuffer])'), but fails for the right side 'NameExpr(WriteableBuffer)'.

In : self.lookup(rv.left.name, rv).node
<mypy.nodes.TypeAlias object at 0x106f8c2c0>
In : self.lookup(rv.right.name, rv).node
<mypy.nodes.PlaceholderNode object at 0x106f87e20>

So mypy knows that ReadOnlyBuffer is a type alias and so is_type_ref is okay with it. But all it has for WriteableBuffer is a placeholder (likely because of the import cycles), and it is not okay with that -- if n and isinstance(n.node, PlaceholderNode) and n.node.becomes_typeinfo is False.

So two things that could mechanically fix this: 1) If WriteableBuffer wasn't a placeholder. It's WriteableBuffer = Union[...], so mypy could easily know at this point that it is a type alias, even if ... involves horrendous import cycles. Although obviously we'd want our solution to still work if WriteableBuffer used PEP 604. 2) It's true that WriteableBuffer is indeed not going to become a type info, but it is going to become a type alias and that should count for the purposes of is_type_ref. It looks like the placeholder's becomes_typeinfo is basically just used for this kind of thing, so I think we could repurpose it as becomes_type.

Unfortunately, I'm not yet sure what the best way to ensure either of those.

The secret to both lies back in visit_assignment_stmt: https://github.com/python/mypy/blob/1beed9af48bed07033490f8cf2d4e08229450c5f/mypy/semanal.py#L2070

I don't fully understand the reference tagging thing, but tldr the import cycle causes us to skip calling check_and_set_up_type_alias for WriteableBuffer for now, so all mypy has is a placeholder node.

Moreover, the placeholder's becomes_typeinfo is False, because we call mark_incomplete without specifying becomes_typeinfo=True https://github.com/python/mypy/blob/1beed9af48bed07033490f8cf2d4e08229450c5f/mypy/semanal.py#L2075 I'm not sure how we could know becomes_typeinfo=True at this point, although we could know could_become_type.

.... 🤷 So next step is probably to see the impact of changing the semantics of becomes_typeinfo to could_become_type. Unless you see a better fix.

Anyway, for typeshed purposes, maybe just use PEP 613 :-) (works because we don't need the result of can_be_type_alias)

AlexWaygood commented 2 years ago

Also note, your slimmed down branch is a little broken, ReadableBuffer does seem actually undefined.

It's defined here in my branch, no? 🙂 note that my branch doesn't make any kind of sense from a typing perspective, I was just trying to delete as many modules as possible while still being able to reproduce the bug!!

AlexWaygood commented 2 years ago

I did some tracing of the code involved.

Thanks for taking the time to look into it!! This is a great rundown.

Anyway, for typeshed purposes, maybe just use PEP 613 :-) (works because we don't need the result of can_be_type_alias)

Unfortunately, I think PEP 613 may still be unuseable for typeshed purposes — I'm not sure if pytype supports it yet 😕 (https://github.com/google/pytype/issues/787) but, worth a shot!

For this specific instance of the bug, it seems like it can be fixed just by doing from mmap import mmap in _typeshed (but my worry is that quite a few things are imported in typeshed/stdlib/builtins.pyi, so there might be other instances of this bug — haven't investigated that yet...)

AlexWaygood commented 2 years ago

Also note, your slimmed down branch is a little broken, ReadableBuffer does seem actually undefined.

Oh wait, yes, lol — I was experimenting to see if the io errors are still there if it is imported in io from _typeshed, but isn't imported in builtins from _typeshed. (Answer: yes, they are — which I think is because other things are still imported from _typeshed in builtins, meaning you still get cyclic references.)

I left that in by mistake when I pushed to my remote branch — sorry for the confusion!!

AlexWaygood commented 2 years ago

I left that in by mistake when I pushed to my remote branch — sorry for the confusion!!

"smol typeshed" should now be fixed by https://github.com/AlexWaygood/typeshed/commit/79f1d16dbf7f3ab0f23ee6db908816109142ace4 -- the only error from mypy is now:

stdlib\io.pyi:5: error: Variable "_typeshed.ReadableBuffer" is not valid as a type
stdlib\io.pyi:5: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
AlexWaygood commented 2 years ago

I left that in by mistake when I pushed to my remote branch — sorry for the confusion!!

"smol typeshed" should now be fixed by AlexWaygood/typeshed@79f1d16 -- the only error from mypy is now:

stdlib\io.pyi:5: error: Variable "_typeshed.ReadableBuffer" is not valid as a type
stdlib\io.pyi:5: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases

Okay, this is very strange. If I'm in the branch of typeshed with nearly all modules deleted, and ReadableBuffer only imported in io, and I run typeshed's stubtest_stdlib.py script, I get this output (the output I reported immediately above):

(.venv3) C:\Users\Alex\Desktop\Code dump\typeshed>python tests/stubtest_stdlib.py
C:\Users\Alex\Desktop\Code dump\typeshed\.venv3\Scripts\python.exe -m mypy.stubtest --check-typeshed --custom-typeshed-dir . --allowlist tests\stubtest_allowlists\py3_common.txt --allowlist tests\stubtest_allowlists\py310.txt --allowlist tests\stubtest_allowlists\win32.txt --allowlist tests\stubtest_allowlists\win32-py310.txt
error: not checking stubs due to mypy build errors:
stdlib\io.pyi:5: error: Variable "_typeshed.ReadableBuffer" is not valid as a type
stdlib\io.pyi:5: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases

But if I just run mypy, I get no errors at all in the branch with ReadableBuffer only imported in io:

(.venv3) C:\Users\Alex\Desktop\Code dump\typeshed>mypy --custom-typeshed-dir . stdlib/abc.pyi
Success: no issues found in 1 source file

Any idea why mypy 0.930 is succeeding, but stubtest 0.930 is failing with "mypy build errors"??

Akuli commented 2 years ago

mypy 0.930 is succeeding, but stubtest 0.930 is failing with "mypy build errors"??

Currently https://github.com/python/typeshed/pull/6819 has the same problem.

AlexWaygood commented 2 years ago

You should now be able to reproduce the bug in "smol typeshed" by simply running mypy instead of running stubtest after https://github.com/AlexWaygood/typeshed/commit/01496f26a36b9f81de5783e96ec72d87348dde74 ...

AlexWaygood commented 2 years ago

Okay, I think I have managed to create a repro for this issue that is only 11 lines of code, spread over 6 files. SO, to reproduce:

  1. Make sure mypy 0.931 is installed.
  2. Clone this repo: https://github.com/AlexWaygood/mypy-604-repro
  3. cd in to the repo
  4. Run mypy .

Here's what the minimal repro looks like:

# _typeshed.pyi
import mmap
WriteableBuffer = mmap.mmap
ReadableBuffer = str | WriteableBuffer

# abc.pyi: empty file

# builtins.pyi
import typing
from _typeshed import ReadableBuffer
class str: ...
class ellipsis: ...
Foo: ReadableBuffer

# collections.pyi: empty file

# mmap.pyi
class mmap: ...

# typing.pyi
import abc
import collections
hauntsaninja commented 2 years ago

To clarify current state, this issue is patched for stubs, but remains for py files (where we can't blindly treat x = y | z as always being a type alias). There's an xfail test case.

If you run into this, your best option is likely using an explicit type alias (x: TypeAlias = y | z).