python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.2k stars 2.78k forks source link

Wrong test result for default values of Mapping arguments #16716

Open LLyaudet opened 8 months ago

LLyaudet commented 8 months ago

Bug Report

It seems that the validation of

def foo(x: Mapping[str, str] = some_default) -> None:
    pass 

yields strange results depending on the declared type of some_default.

To Reproduce

use mypy --strict on the following file

"""
This file is part of python-none-objects library.

python-none-objects is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

python-none-objects is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with python-none-objects.
If not, see <http://www.gnu.org/licenses/>.

©Copyright 2023 Laurent Lyaudet
"""
"""
The empty tuple is really useful.
But it is implementation dependent for it to be a constant:
https://docs.python.org/3.12/reference/expressions.html#parenthesized-forms
https://stackoverflow.com/questions/41983180/is-the-empty-tuple-in-python-a-constant
https://stackoverflow.com/questions/8185776/compare-object-to-empty-tuple-with-the-is-operator-in-python-2-x
https://stackoverflow.com/questions/38328857/why-does-is-return-true-when-is-and-is-return-false
https://stackoverflow.com/questions/14135542/how-is-tuple-implemented-in-cpython
I'm wondering if there would be additional efficiency gains
to treat the empty tuple and the constants here differently
at execution of Python scripts.
"""
# from typing import Iterable, Container, Collection, Mapping
from types import MappingProxyType
from typing import Any, Mapping, Never

NoneIterable = ()
NoneContainer = NoneIterable
NoneCollection = NoneIterable
NoneMapping1: Mapping[Any, str] = {123: "abc"}  # MappingProxyType({})
NoneMapping2: Mapping[str, Any] = {"abc": 123}  # MappingProxyType({})
NoneMapping3: Mapping[Never, Any] = MappingProxyType({})
NoneMapping4: Mapping[Any, Never] = MappingProxyType({})
NoneMapping5: Mapping[Any, Any] = {"abc": 123}  # MappingProxyType({})
NoneMapping6: Mapping[Never, Never] = MappingProxyType({})

def foo1(x: Mapping[str, str] = NoneMapping1) -> None:
    # Pass typing but {123: "abc"} is a Mapping[Any, "str"]
    # and it should not be valid.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo2(x: Mapping[str, str] = NoneMapping2) -> None:
    # Pass typing but {"abc": 123} is a Mapping[str, Any]
    # and it should not be valid.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo3(x: Mapping[str, str] = NoneMapping3) -> None:
    # Fails typing but only an empty mapping is a Mapping[Never, Any]
    # and it is also a Mapping[str, str].
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo4(x: Mapping[str, str] = NoneMapping4) -> None:
    # Pass typing
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo5(x: Mapping[str, str] = NoneMapping5) -> None:
    # Pass typing but {"abc": 123} is a Mapping[Any, Any]
    # and it should not be valid.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo6(x: Mapping[str, str] = NoneMapping6) -> None:
    # Fails typing but only an empty mapping is a Mapping[Never, Never]
    # and it is also a Mapping[str, str].
    for y, z in x.items():
        print(f"foo {y} bar {z}")

# Or even better, a reproducible playground link https://mypy-play.net/ (use the "Gist" button)

Expected Behavior

I thought that cases 3, 4, 6 would pass since the only possible value for the default is an empty mapping. And I thought that cases 1, 2, 5 would fail.

Actual Behavior

$mypy --strict ./test_mypy_on_pno.py test_mypy_on_pno.py:60: error: Incompatible default for argument "x" (default has type "Mapping[NoReturn, Any]", argument has type "Mapping[str, str]") [assignment] test_mypy_on_pno.py:80: error: Incompatible default for argument "x" (default has type "Mapping[NoReturn, NoReturn]", argument has type "Mapping[str, str]") [assignment]

Only cases 3 and 6 fails, when both should pass.

Your Environment

LLyaudet commented 8 months ago

Here is the Gist and playground: Gist URL: https://gist.github.com/mypy-play/95a31f3efd9fa7d5336d53d4fd4b9749 Playground URL: https://mypy-play.net/?mypy=latest&python=3.12&gist=95a31f3efd9fa7d5336d53d4fd4b9749

erictraut commented 8 months ago

Mypy's behavior looks correct to me here. FWIW, pyright's output is the same.

The first type parameter for Mapping is invariant, and the second is covariant. Also, Any is compatible with any type (even in an invariant context). Those two rules explain all of the behaviors you're seeing.

LLyaudet commented 8 months ago

Hello :)

Thanks for your answer. I'm not an expert of type theory and I understand set theory better so I see types as sets of instances.

Hence, for me a mapping is like its graph: this is the set of tuples (in, out) of the mapping. When the mapping is empty, you can use whatever type for the in part and the out part. From the point of view of semantics it will always validate by emptiness. But the converse also holds for me: whenever in type is Never, or out type is Never, then the mapping must be empty, and for all type T, Mapping[Never, T] = Mapping[T, Never] = Mapping[Never, Never]. This is a semantics point of view, but it is probably impossible to deduce all semantics theorems on types from syntactic rules.

Regarding the fact that Any is always compatible, I think cases 1 and 2 clearly show that it is wrong and yields bad validation and bad semantics. From my point of view, there is an "Everything is potatoes" syndrom. Python is mixing semantics of types and pseudo-types for validation. Mentally, I distinguish between:

It could be a further design decision or an option at the discretion of the Python user to have Unknown behave like Any or like AlwaysCompatible.

Best regards, Laurent Lyaudet

erictraut commented 8 months ago

Any is not a top type. It's a gradual type that acts somewhat like both a top and a bottom type — i.e. it's compatible with all types. It's operates outside of the set theoretic framework of type theory. It's effectively the AlwaysCompatible concept that you describe above.

Never is a bottom type. It's a subtype of all other types.

There is no Unknown or AlwaysCompatible type officially defined in the Python type system. Pyright implements an Unknown type that is effectively how you've described it, but it acts like Any. Mypy doesn't implement any such concept.

The core type system has no notion of mappings, containers, or an "empty" container. It's more general than that.

The Mapping type in Python has two type parameters. The first type parameter is invariant. That means Mapping[T1, ...] is compatible with Mapping[T2, ...] only if T1 is equivalent to T2 or one (or both) of them is Any. The second type parameter is covariant, which means that Mapping[..., T1] is compatible with Mapping[..., T2] only if T1 is a subtype of T2 or one (or both) of them is Any.

As I said above, I think mypy and pyright are doing the correct thing here. This isn't a bug, and I think it can be closed.

If you want to understand the Python type system better, you may find this documentation useful. Also, PEP 483 contains some useful information about the type theory concepts that form the basis for the Python static type system.

LLyaudet commented 8 months ago

Any is not a top type. [...] It's effectively the AlwaysCompatible concept that you describe above.

Thanks, nice to know. Then I have a problem of naming. Because I think a true top type is needed. According to wikipedia, it is the object type that is the true top type in Python: https://en.wikipedia.org/wiki/Top_type I will replace Any in my code with object to see if it works.

Never is a bottom type. It's a subtype of all other types.

I know that.

There is no Unknown or AlwaysCompatible type officially defined in the Python type system. Pyright implements an Unknown type that is effectively how you've described it, but it acts like Any. Mypy doesn't implement any such concept.

Thanks, I will look in Pyright to see how it works there. It would be a small step forward to have Unknown work like object or like Any at the choice of the user. You start adding types in your app, you choose "Unknownwork likeAny". When you're almost done, you choose "Unknown work like object", and correct all the errors that pop up. Thus you can enhance you app progressively.

My intuitive comprehension of NOT Mapping[T, U], Function[T, U] is that it should be covariant on both T and U. Since in the sets of tuples of the graph of the mapping, there is no "preference" or special semantic for in or out. For example, if T and U can be N the natural numbers or R the real numbers, then obviously:

Everything is symmetric. But as I said, I'm not an expert of type theory and I need to read more on the topic to see where my intuition may be wrong. In mathematics a mapping is a total function. It means that you must have a tuple (in, out) for any value "in" in T. Then in that case, Mapping[N, N] is not included in Mapping[R, N] since a mapping with domain N is obviously not a mapping of domain R. Thus the problem must come from the fact that my code (and probably other code) abuse the mathematical concept of mapping. It also means that typing a finite Dict[str, str] as a subtype of Mapping[str, str] is plain wrong. Only a function that accepts any string can be typed Mapping[str, str]. I think there is a lot of confusion that stems from bad translation of naming and concepts between the mathematical theory world and the implementation.

"En plein dans le mille":

from typing import Dict, Mapping

a: Dict[str, str] = {"abc": "def"}

def foo(x: Mapping[str, str]):
    return x["toto"]

foo(a)

yields no error, when it should if it followed the mathematical theory.

Gist URL: https://gist.github.com/mypy-play/2cc82652abb2d8229f5ab28015ce6826 Playground URL: https://mypy-play.net/?mypy=latest&python=3.12&gist=2cc82652abb2d8229f5ab28015ce6826 Python abuses the Mapping name and mix its semantic between total function and collection with __get_item__.

LLyaudet commented 8 months ago

Python "Mappings" should be covariant in the first parameter.

LLyaudet commented 8 months ago

With object instead of Any, everything fails. So there is still a bug since cases 3, 4, and 6 should pass from a semantic point of view of "Python Mappings" (not from a point of view of mathematical mappings).

Gist URL: https://gist.github.com/mypy-play/8d6e68b0b3a772f0ece10aaea2318c1b Playground URL: https://mypy-play.net/?mypy=latest&python=3.12&gist=8d6e68b0b3a772f0ece10aaea2318c1b

"""
This file is part of python-none-objects library.

python-none-objects is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

python-none-objects is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with python-none-objects.
If not, see <http://www.gnu.org/licenses/>.

©Copyright 2023 Laurent Lyaudet
"""
"""
The empty tuple is really useful.
But it is implementation dependent for it to be a constant:
https://docs.python.org/3.12/reference/expressions.html#parenthesized-forms
https://stackoverflow.com/questions/41983180/is-the-empty-tuple-in-python-a-constant
https://stackoverflow.com/questions/8185776/compare-object-to-empty-tuple-with-the-is-operator-in-python-2-x
https://stackoverflow.com/questions/38328857/why-does-is-return-true-when-is-and-is-return-false
https://stackoverflow.com/questions/14135542/how-is-tuple-implemented-in-cpython
I'm wondering if there would be additional efficiency gains
to treat the empty tuple and the constants here differently
at execution of Python scripts.
"""
# from typing import Iterable, Container, Collection, Mapping
from types import MappingProxyType
from typing import Any, Mapping, Never

NoneIterable = ()
NoneContainer = NoneIterable
NoneCollection = NoneIterable
NoneMapping1: Mapping[object, str] = {123: "abc"}  # MappingProxyType({})
NoneMapping2: Mapping[str, object] = {"abc": 123}  # MappingProxyType({})
NoneMapping3: Mapping[Never, object] = MappingProxyType({})
NoneMapping4: Mapping[object, Never] = MappingProxyType({})
NoneMapping5: Mapping[object, object] = {"abc": 123}  # MappingProxyType({})
NoneMapping6: Mapping[Never, Never] = MappingProxyType({})

def foo1(x: Mapping[str, str] = NoneMapping1) -> None:
    # Fails typing OK.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo2(x: Mapping[str, str] = NoneMapping2) -> None:
    # Fails typing OK.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo3(x: Mapping[str, str] = NoneMapping3) -> None:
    # Fails typing but only an empty mapping is a Mapping[Never, object]
    # and it is also a Mapping[str, str].
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo4(x: Mapping[str, str] = NoneMapping4) -> None:
    # Fails typing but only an empty mapping is a Mapping[object, Never]
    # and it is also a Mapping[str, str].
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo5(x: Mapping[str, str] = NoneMapping5) -> None:
    # Fails typing OK.
    for y, z in x.items():
        print(f"foo {y} bar {z}")

def foo6(x: Mapping[str, str] = NoneMapping6) -> None:
    # Fails typing but only an empty mapping is a Mapping[Never, Never]
    # and it is also a Mapping[str, str].
    for y, z in x.items():
        print(f"foo {y} bar {z}")
erictraut commented 8 months ago

Python already has a top type: object. There's no need to add another one.

Any is specifically for gradual typing.

This is all spelled out in the Python typing spec and the historical PEPs related to typing. Mypy follows the typing spec, and so does pyright. If you're interested in proposing modifications or additions to that spec, the python.org typing forum is the recommended place to do that. The mypy issue tracker isn't the right place to suggest spec changes.

It's unlikely that the Python typing community will be interested in making changes to the spec that break backward compatibility. Many of your suggestions (including the suggestion that Mapping should be covariant in the first parameter) would break millions of lines of typed Python, so you will probably get little or no support for such suggestions.

So there is still a bug since cases 3, 4, and 6 should pass from a semantic point of view

No, mypy is working correctly here according to the typing spec. This isn't a bug. You can paste the same code into the pyright playground and see that it reports the same type violations.

There's an effort to formalize the typing spec and fill in missing and under-specified aspects. It's still missing big portions of content, but you can find it here. If you still have questions after reading the content I've linked to in this thread, this python/typing discussion forum is a good place to get answers and clarifications.

LLyaudet commented 8 months ago

I don't understand why you wrote this:

Python already has a top type: object. There's no need to add another one.

Any is specifically for gradual typing.

since in a previous comment I wrote:

According to wikipedia, it is the object type that is the true top type in Python: https://en.wikipedia.org/wiki/Top_type I will replace Any in my code with object to see if it works.

and I used object in a further comment before yours...

Thanks for the links. I already posted on discuss.python.org for questions on typing. I don't know if I will find in the links the reason behind making invariant the first parameter of Mapping. It might be a bug in the spec, it might be an use case I don't see, but my semantic analysis of use cases 3, 4, and 6 is valid whatever the spec. For now, I will use Mapping[Any, Never] for NoneMapping since it seems it is the only way to do it.