python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

Overhaul the python typing system to make typing a 'first class citizen' with a dedicated typing syntax, like match patterns #953

Open KotlinIsland opened 3 years ago

KotlinIsland commented 3 years ago

Python's implementation of typing at the language level has been very limited, with only the absolute minimal changes to the grammar being implemented:

Both of these syntaxes are just normal expressions that are evaluated (or are just strings if __future__.annotations is on).

That has meant that the entirety of the typing landscape has been implemented 'in the language' using existing syntax and functionality to try and emulate typing as a first class language feature.

This has led to modifications to the standard library to attempt to better support these usages (e.g. special casing type[...], defining __or__ on type, and the introduction of __class_getitem__, among others). So all these objects now have a dual nature: they have functions to support their 'normal' usages, and also a bunch of functions to support building type annotations.

The authors of PEP 586 have commented on this problem (thanks @sobolevn):

we feel overhauling the syntax of types in Python is not within the scope of this PEP: it would be best to have that discussion in a separate PEP

Typing being 'implemented in the language' instead of in the grammar/interpreter has led to a myriad of complexities, edge cases, limitations and confusion. I will list a few of these side effects here, also showing a more optimal syntax that would be possible if typing were to be overhauled:

Fake types and type alias

Fake types such as None and ForwardRefs lead to runtime errors (either in annotations or upon calling get_type_hints):

from typing import TypeAlias

Str: TypeAlias = "str"
Int: TypeAlias = "int"

a: Str | Int = 0  # SUS ALERT

There could be a dedicated typealias keyword, as in other languages:

typealias foo = int | str

(https://github.com/python/mypy/issues/11582)

Shout out to NewType

MyStr = NewType("MyStr", str)

Could be

newtype MyStr = str
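For reference, a short sketch of what the current NewType spelling does at runtime. The `value` name is illustrative.

```python
from typing import NewType

# The name has to be repeated as a string argument.
MyStr = NewType("MyStr", str)

# Calling the new type returns its argument unchanged, so no distinct
# runtime type is ever created; only type checkers see a difference.
value = MyStr("hello")
```

Here `value` is a plain `str`; the distinction between `MyStr` and `str` exists only for static analysis.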

Typing imports

A large number of types need to be imported from typing, leading to import boilerplate in every module:

from typing import NoReturn, TypedDict, TypeAlias, TypeVar, Generic, Protocol  # etc, etc, etc, etc

There could be a 'builtins' for type annotations that are usable by default.

Also there is no way to import a type statically, see #928

Type parameters

Are one of the worst incantations imaginable:

from typing import TypeVar, Generic

T_cont = TypeVar("T_cont", contravariant=True, bound=str)

class A(Generic[T_cont]):
    ...

Could be

class A[in T_cont: str]:
    ...
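As a sketch of why the current spelling is heavy, the TypeVar form builds ordinary runtime objects whose configuration lives in attributes. The `consume` method and the `specialised` name are made up for illustration.

```python
from typing import Generic, TypeVar

# The variable name must be repeated as a string, and variance and bound
# are passed as keyword arguments to a normal constructor call.
T_cont = TypeVar("T_cont", contravariant=True, bound=str)


class A(Generic[T_cont]):
    def consume(self, item: T_cont) -> None:
        ...


# Subscripting the class produces yet another runtime alias object.
specialised = A[str]
```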

Also you can't specify type parameters explicitly on function call sites:

foo_result: int | str = foo(1, "a")
bar(foo_result)

Could be:

bar(foo[int | str](1, "a"))

Literal syntax

a: Literal[1, 2] = 1

Could be

a: 1 | 2 = 1
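Note that the shorter spelling is already syntactically valid today; it just means something else. A minimal sketch, with illustrative class names:

```python
from typing import Literal, get_args


class Today:
    # `|` is integer bitwise-or in a normal expression, so this
    # annotation evaluates to the int 3.
    a: 1 | 2 = 1


class Spelled:
    a: Literal[1, 2] = 1


today_annotation = Today.__annotations__["a"]
literal_values = get_args(Spelled.__annotations__["a"])
```

So the annotation `1 | 2` silently becomes `3`, while `Literal[1, 2]` keeps both values.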

cast

Yet another import, and very clunky to boot. And why is the type first? Everywhere else (isinstance, issubclass, type annotations) types go second.

cast(int, a)

Could be an as operator or similar, as in other languages:

a as int
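For context, a small sketch of what cast does today: it is a regular function call that hands its second argument straight back.

```python
from typing import cast

a: object = "not an int"

# Type checkers treat `b` as int from here on, but nothing is checked
# or converted at runtime.
b = cast(int, a)
```

`b` is the very same string object as `a`, which is why a dedicated operator would not need any runtime behaviour either.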

Callable

In error messages mypy shows this as (int, str) -> bool

a: Callable[[int, str], bool]

Could be

a: (int, str) -> bool
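A sketch contrasting the two spellings as they stand today. `is_long` is a made-up function, and the `compile` call is only there to show that the arrow form is rejected by the current grammar.

```python
from typing import Callable, get_args


def is_long(count: int, text: str) -> bool:
    return len(text) > count


a: Callable[[int, str], bool] = is_long

# The nested-list spelling is unpacked again by introspection helpers.
params, result = get_args(Callable[[int, str], bool])

# The arrow spelling is not valid Python syntax in an annotation.
try:
    compile("a: (int, str) -> bool", "<sketch>", "exec")
except SyntaxError:
    arrow_is_valid_syntax = False
else:
    arrow_is_valid_syntax = True
```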

Confusion

Because all this type machinery exists at runtime, it creates very confusing situations where an annotation is used in a value position:

A = int | str
A()  # is this valid?
foo(A)  # is this valid?
B: type[int | str] = int | str  # is this valid?

https://github.com/microsoft/pyright/issues/2522 shows that it's very easy to overlook this situation.

Match Patterns

Enter match/case statements, which have their own unique language-level syntax for dealing with all the concepts needed in a comprehensive way. For example, matching a union of strings uses "a" | "b", which in a type annotation would just be evaluated as a normal expression and raise a TypeError.

match "a":
    case "a" | "b":  # 😳😳😳😳😳😳😳😳😳😳😳😳😳😳😳😳
        print("AMONGUS")

Instead of endlessly implementing special casing and workarounds, please overhaul the entire implementation of typing in the language from the ground up. Currently typing in Python is very rough around the edges and could be vastly improved based upon the experience and usage gathered over the years since its inception.

Proposal

Add the concept of a 'type context' that has its own unique semantic meaning, separating the type realm functionality from the value realm (value context) functionality. This would make things such as the type union | much simpler and more consistent (see issue where some types are not actually types at all, and don't have __or__). These operations would still produce type machinery representations:

class A:
    a: int | str
A.__annotations__  # {'a': UnionType[int, str]}

And this machinery could be constructed manually in a 'value context' when needed:

a = UnionType(int, str)

Example

Bar = 1 | 2
    # ^ value context
typealias Foo = 1 | 2
              # ^ type context

Evaluating type syntax

from __future__ import annotations
from typing import get_type_hints

class A:
    x: "a" | "b"

get_type_hints(A)  # Currently will raise a TypeError

Under the hood, get_type_hints simply evals the annotation as a module-level expression. I suggest adding an eval_type function that would evaluate it as a type context expression, not a value context expression.
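One possible shape for such a function, purely as a sketch: this `eval_type` is hypothetical, does not exist in the standard library, and handles only names, constants and `|`. It walks the annotation's AST instead of calling `eval`, so `|` is always read as a union and constants are read as literal types. The explicit string annotation stands in for what `from __future__ import annotations` would store.

```python
import ast
import builtins
from typing import Any, Literal, Optional, Union, get_type_hints


def eval_type(source: str, namespace: Optional[dict] = None) -> Any:
    """Hypothetical: evaluate an annotation string in a 'type context'."""
    names = {**vars(builtins), **(namespace or {})}

    def convert(node: ast.expr) -> Any:
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
            # `|` always means union here; the operands' __or__ is never called.
            return Union[convert(node.left), convert(node.right)]
        if isinstance(node, ast.Constant):
            # None stays None; any other constant is read as a literal type.
            return None if node.value is None else Literal[node.value]
        if isinstance(node, ast.Name):
            return names[node.id]
        raise NotImplementedError(ast.dump(node))

    return convert(ast.parse(source, mode="eval").body)


class A:
    # The string that would be stored for the annotation  x: "a" | "b"
    x: '"a" | "b"'


try:
    get_type_hints(A)  # value-context evaluation of "a" | "b"
except TypeError:
    value_context_failed = True
else:
    value_context_failed = False

type_context_hint = eval_type(A.__annotations__["x"])
```

A real implementation would need to cover subscripts, attribute access, forward references and much more; the point is only that the same source text can be given a different meaning without changing the grammar.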

sobolevn commented 3 years ago

Thanks for your ideas!

Unfortunately, this is not really how proposals work in Python.

First of all, there are historic, technical, and stylistic reasons behind your every "why not ...?" question. Most of the ideas you propose here were explicitly discussed during original PEPs. For example: "why Literal[1] and not just 1"? https://www.python.org/dev/peps/pep-0586/#adding-more-concise-syntax

If you really want to make a new proposal, take a look at how the PEP process works: https://www.python.org/dev/peps/pep-0001/ You can start with just the parts you think are the most important. For example, newtype or typealias keyword. Provide a sample runtime implementation and Python's grammar changes. Think about backwards compat: should this really be a keyword or a soft keyword? What problem does this solve? What are the alternatives? How hard would it be to implement in type checkers?

In this case, it would be a valuable and actionable piece of feedback.

Moreover, I kindly suggest not to use harsh personal assessments in technical discussions. It does not really help, only complicates things right from the start.

KotlinIsland commented 3 years ago

Thanks for the response!

Most of the ideas you propose here were explicitly discussed during original PEPs

I do understand why there are limitations like that; I'm just painting in broad strokes about the "bolted on" nature of Python's current typing landscape.

The PEP even acknowledges this problem:

we feel overhauling the syntax of types in Python is not within the scope of this PEP: it would be best to have that discussion in a separate PEP

If you really want to make a new proposal

I would love to make a pep addressing overhauling the entire python typing system, but I don't think I have the depth of knowledge and understanding to lead that charge, I'm just raising the issue here to draw attention to it.

For example, newtype or typealias keyword

I do see that those could be implemented in isolation to a radical redesign, I'm just trying to demonstrate that typing isn't a first class citizen in Python, it's bolted on and implemented in the language.

I kindly suggest not to use harsh personal assessments

Sorry! I don't mean to attack anyone or anything, I'm just trying to exemplify the frustrations of using Python's typing system.

JukkaL commented 3 years ago

I would love to make a pep addressing overhauling the entire python typing system, but I don't think I have the depth of knowledge and understanding to lead that charge, I'm just raising the issue here to draw attention to it.

These ideas aren't new, and the syntactic compromises were well understood when PEP 484 was written. At the time it didn't seem realistic for drastic changes to Python syntax and semantics to be accepted, just to facilitate type checking, so the changes to Python have been quite incremental and gradual. I still think that an overhaul of the entire Python typing syntax has very little chance of being accepted by the SC, and more incremental changes stand a better chance.

I agree with @sobolevn above -- the best way to get these improved is for somebody to write PEPs, implement prototypes, address feedback by the community, and so on. I personally don't expect that all of the suggestions above would be accepted as PEPs, but some of them sound feasible to me, and actually there has been discussion on typing-sig@ about a better syntax for callables recently, for example. (And don't let my expectations discourage you. I don't make the decisions!)

I also think that there would be more interest in the community if you'd create a separate issue for each new idea, and research previous work (e.g. about callable type syntax) to avoid repeating ideas that are already being discussed elsewhere.

KotlinIsland commented 3 years ago

Thanks for the response!

I really only have one proposal, 'type contexts' and those contexts having their own syntax. The examples are just demonstrating the consequences of the current design.

I get that a bunch of these ideas could also be 'implemented in the language' with the current design, but if that happened it would just make me sad that the designs aren't reaching their true potential due to being held back by the current design.

Tldr: I want a: int | str to not call any __or__ method, I want it to be understood by the interpreter as a type union.
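To illustrate what "calling `__or__`" means for an annotation today, here is a sketch with a deliberately misbehaving metaclass. `Meta`, `Weird` and `Holder` are made-up names.

```python
calls = []


class Meta(type):
    def __or__(cls, other):
        # Any class can hijack `|` and return something that is not a type.
        calls.append((cls, other))
        return "not a type at all"


class Weird(metaclass=Meta):
    pass


class Holder:
    field: Weird | int  # evaluated as a normal expression via Meta.__or__


stored = Holder.__annotations__["field"]
```

The annotation ends up holding whatever the method returned, which is the behaviour a dedicated type context would avoid.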

danmou commented 3 years ago

I like the idea of having dedicated typing syntax instead of trying to make it all fit into the existing Python syntax. I guess it would also lower the barrier to getting SC approval for further syntax extensions, such as the new callable syntax, since the changes would not affect "regular" python (and potentially block other Python syntax extensions). I guess this was considered already in the early days of typing, but maybe worth reevaluating now that typing is more mature?

I think @KotlinIsland's original post should be seen mainly as motivation for why to introduce a standalone typing syntax. If this turns into a PEP, it should just define the rules for which contexts this syntax is used in, plus a minimal syntax that can parse the currently used typing constructs. We don't want to break backwards compatibility so it's important to get that part right. Once this is in place, more syntax can be added down the line (and things like typing.TypeVar can be deprecated eventually).

gvanrossum commented 3 years ago

I like the idea of having dedicated typing syntax instead of trying to make it all fit into the existing Python syntax. I guess it would also lower the barrier to getting SC approval for further syntax extensions, such as the new callable syntax, since the changes would not affect "regular" python (and potentially block other Python syntax extensions).

Not to crush people's hopes, but the SC has explicitly declared that they don't want types to use dedicated syntax -- whatever is valid in an annotation syntactically must also be syntactically valid in an expression.

KotlinIsland commented 3 years ago

Not to crush people's hopes, but the SC has explicitly declared that they don't want types to use dedicated syntax -- whatever is valid in an annotation syntactically must also be syntactically valid in an expression.

Do you know if there are any discussions available to read regarding this decision? That would be greatly appreciated.

Even if this change were to be implemented, it could still meet the requirement of not changing the syntax, and instead change only the way that annotations are interpreted, their semantic meaning. That way a: 1 | 2 is still valid syntax, but the way it is interpreted is entirely different.

I really disagree with this decision by the SC. I feel that Python's typing is intensely janky and confusing. Python has had the reputation of being easy to learn, but given the way typing has been handled, I can't imagine someone who is learning being able to wrap their head around the differences between typing/types/typing_extensions, the difference between a 'type' and a builtins.type, or the value representation of a type annotation and how it differs in a static/runtime context. It's not clear, approachable or consistent in any way.

There is so much complexity and mental burden involved with the current solution that I have seen developers turning away from Python, often commenting that the types are a "mess" or a "joke".