TypeForm[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc)

davidfstr commented 3 years ago

(An earlier version of this post used TypeAnnotation rather than TypeForm as the initially proposed spelling for the concept described here)

Feature

A new special form TypeForm[T] which is conceptually similar to Type[T] but is inhabited by not only regular types like int and str, but also by anything "typelike" that can be used in the position of a type annotation at runtime, including special forms like Union[int, str], Literal['foo'], List[int], MyTypedDict, etc.

Pitch

Being able to represent something like TypeForm[T] enables writing type signatures for new kinds of functions that can operate on arbitrary type annotation objects at runtime. For example:

# Returns `value` if it conforms to the specified type annotation using typechecker subtyping rules.
def trycast(typelike: TypeForm[T], value: object) -> Optional[T]: ...

# Returns whether the specified value can be assigned to a variable with the specified type annotation using typechecker subtyping rules.
def isassignable(value: object, typelike: TypeForm[T]) -> bool: ...
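
To make the gap concrete, here is a minimal sketch of why Type[T] cannot describe these objects and how a function in the spirit of isassignable might inspect them at runtime. It is not trycast's actual implementation: the name isassignable_sketch is hypothetical, only Union, Optional, Literal and plain classes are handled, and `Any` stands in for the proposed TypeForm[T], which does not exist yet.

```python
from typing import Any, Literal, Optional, Union, get_args, get_origin

# Special forms are ordinary runtime objects, but they are not classes,
# which is why Type[T] does not describe them.
UNION_IS_A_CLASS = isinstance(Union[int, str], type)
LITERAL_IS_A_CLASS = isinstance(Literal["foo"], type)


def isassignable_sketch(value: object, form: Any) -> bool:
    """Hypothetical, heavily simplified check; `form` would be TypeForm[T]."""
    origin = get_origin(form)
    if origin is Union:
        # Optional[X] is Union[X, None], so it is covered here as well.
        return any(isassignable_sketch(value, member) for member in get_args(form))
    if origin is Literal:
        # Compare type and value so that True does not match Literal[1].
        return any(type(value) is type(arg) and value == arg for arg in get_args(form))
    if isinstance(form, type):
        return isinstance(value, form)
    raise TypeError(f"form not handled by this sketch: {form!r}")
```

With TypeForm[T] available, the `form` parameter could carry T through to the return type instead of being erased to `Any`.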

Several people have indicated interest in a way to spell this concept:

@ltworf: https://github.com/python/mypy/issues/9003
@glyph: https://github.com/python/mypy/issues/9003#issuecomment-653353318
@davidfstr (yours truly): https://github.com/python/mypy/issues/9003#issuecomment-734648129
@Nico-Kialo: https://github.com/python/mypy/issues/8992
@hmvp: https://github.com/python/mypy/issues/8992#issuecomment-647331625

For a more in-depth motivational example showing how I can use something like TypeForm[T] to greatly simplify parsing JSON objects received by Python web applications, see my recent thread on typing-sig:

Spelling the type of a runtime type annotation object (typing-sig thread)

If there is interest from the core mypy developers, I'm willing to do the related specification and implementation work in mypy.

davidfstr commented 2 years ago

Update: I am back from an extended hiatus.

@davidfstr commented on Jan 7, 2021:

Update: I've drafted an initial PEP for TypeForm.

So I have. It is currently in Google Docs form in case anyone wants a sneak peek.

I'm currently resuming my work in shepherding PEP 655 (Required[] and NotRequired[] in TypedDict) and plan to continue design work on TypeForm afterward.

freundTech commented 2 years ago

I recently ran into this problem while trying to validate TypedDicts and managed to adapt @hauntsaninja's workaround to allow access to the type at runtime. This is pretty hacky, so I'm looking forward to TypeForm. People who need a solution now, however, can try my workaround.

from typing import *

T = TypeVar("T")

class TypeAnnotation(Generic[T]):
    def __class_getitem__(cls, typ):
        class _TypeAnnotation(TypeAnnotation):
            _type = typ
            def trycast(self, value: object) -> T:
                # check value against _type here
                return cast(T, value)

        return _TypeAnnotation

    def trycast(self, value: object) -> T:
        pass

TypeAnnotation[Optional[int]]().trycast(object())

davidfstr commented 2 years ago

Update: Before continuing design work on TypeForm, I am continuing to shepherd PEP 655 (Required[] for TypedDict) which is now pending Steering Council review & final implementation prior to Python 3.11b1 (April 30, 2022).

I forecast that there will not be enough time to design/prototype/approve/implement TypeForm for Python 3.11, so I expect TypeForm will slip to (probably) Python 3.12.

Conchylicultor commented 2 years ago

What is the relation between TypeForm and TypeAlias ? It seems that both concepts have some overlap.

For me, TypeAlias is already the Type of a Typing annotation (a meta type). So I'm a little confused why a new TypeForm concept is required rather than extending the existing TypeAlias.

Let's take the following code:

x: TypeAlias = Union[A, B]

fn(x=Union[A, B])  # Why not TypeAlias here ?

Both x (in global scope and in fn) are assigned the same value (Union[A, B]), so it seems very natural that their type annotations should be identical too, no?

def fn(x: TypeAlias[T]):
  ...

It would be nice if somehow those 2 concepts could be unified.

erictraut commented 2 years ago

A type alias is a very different concept than a type form. There's really no overlap between the two.

A type alias defines a symbol that refers to another type. In most other languages, there is a keyword in the language that allows for the specification of a type alias. In Python, the TypeAlias was added to serve this purpose, but TypeAlias is not really a type annotation. It doesn't indicate the type of the alias; it rather indicates that the symbol should be treated as a type alias. TypeAlias can be used only to declare a type alias symbol. It can't be used to decorate the type of a parameter, function return, etc., it can't be used as a type argument for a generic type, and it can't be used as a bound when defining a TypeVar.

TypeForm, by contrast, is a type annotation. It's similar to Type except that it also covers various special forms that are not subtypes of Type. As proposed, it can be used in any place that other type annotations can be used — to annotate parameters, return types, etc.

davidfstr commented 1 year ago

Update: My life has gotten extremely busy this past year - moving cross country, getting a first house, other upcoming major life events - so I don't expect to have enough bandwidth to push forward TypeForm support in a timeline I can foresee.

Consider this a call to anyone who especially cares about TypeForm support to pick up the torch from me.

All the related documentation should already be attached to this thread. Notably the PEP draft in Google Docs.

Vlod-github commented 1 year ago

Personally I'm now leaning toward TypeForm (over TypeAnnotation) because it is consistent with prior documentation and is more succinct to type. It does sound a bit abstract but I expect only relatively advanced developers will be using this concept anyway.

There is an architectural pattern called Entity-Component-System. In this pattern, you look up a component by its entity. In code, this looks like get_component(type: Type[T], entity: int) -> T | None. Right now there is no clean way to express this in Python; I have to use # type: ignore.
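
For readers unfamiliar with the pattern, a minimal sketch of such a lookup follows. The storage layout and the names Position, Velocity, add_component and _components are assumptions made for illustration. The comment does not say which calls fail to type-check; with plain component classes as keys, Type[T] is enough, and one plausible reading is that the trouble starts once a key is not a plain class.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


# Hypothetical storage: component class -> entity id -> component instance.
_components: Dict[type, Dict[int, object]] = {}


def add_component(entity: int, component: object) -> None:
    _components.setdefault(type(component), {})[entity] = component


def get_component(kind: Type[T], entity: int) -> Optional[T]:
    found = _components.get(kind, {}).get(entity)
    # isinstance() lets a type checker narrow `found` to T.
    return found if isinstance(found, kind) else None
```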

ippeiukai commented 1 year ago

I think introducing TypeForm can significantly reduce the complexity around non-concrete type objects.

Currently, a non-concrete type object (e.g. a Protocol class) Proto is of type Type[Proto] but cannot be assigned to variables of type Type[Proto]. This is a special rule in PEP 544. If a non-concrete type object like Proto were of type TypeForm[Proto] but not Type[Proto], it would naturally follow that Proto cannot be assigned to variables of type Type[Proto]: no need for the special rule, intuitive to developers, reduced complexity in type checkers.
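
The PEP 544 rule mentioned above can be seen in a small sketch. The names Proto, Impl and make are illustrative only; the point is that a concrete class is accepted where Type[Proto] is expected, while the protocol class itself is rejected by type checkers and also cannot be instantiated at runtime.

```python
from typing import Optional, Protocol, Type


class Proto(Protocol):
    def run(self) -> str: ...


class Impl:
    # Structurally compatible with Proto; no explicit subclassing needed.
    def run(self) -> str:
        return "ran"


def make(cls: Type[Proto]) -> Proto:
    return cls()


result = make(Impl).run()

error: Optional[str] = None
try:
    # Type checkers reject this call under the PEP 544 special rule;
    # at runtime, instantiating a protocol class raises TypeError.
    make(Proto)  # type: ignore
except TypeError as exc:
    error = str(exc)
```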

Following are some related issues on this topic:

It would also be nice to know if abstract classes should be considered concrete or not. I found no spec / PEP on that and perhaps TypeForm PEP can clarify on that point for consistency among type checkers.

sg495 commented 8 months ago

@davidfstr has someone come forward to continue this work? If not, I'm happy to pick it up.

davidfstr commented 8 months ago

@davidfstr has someone come forward to continue this work? If not, I'm happy to pick it up.

Nobody else has come forward so far. Happy to have you pick up the torch @sg495 ^_^

sg495 commented 8 months ago

Fantastic! 🎉🥳

I'll familiarise myself fully with the current status over the coming days. If you have info which you think I should have—in addition to the PEP draft, this issue and the mailing list thread—please let me know.

KotlinIsland commented 8 months ago

I've been implementing it in basedmypy, it's not finished though

ViktorSky commented 7 months ago

From what I've tried, it is possible to reference an object of type SpecialForm, but it has a slight complication.

If you need a simple annotation, you should use bound=Any

import typing

S = typing.TypeVar('S', bound=typing.Any)

def check(val: S) -> S:
    return val

#hint = check(str)  # shows: type[str]  # If you add this the following breaks
hint = check(typing.Optional[str])  # shows: type[str] | type[None]
hint = check(typing.Union[str, int, bool])  # shows: type[str] | type[int] | type[bool]

but for some reason when you try to use it in overloads the typevar arguments change to Any, _SpecialForm

import typing

#S = typing.TypeVar('S', bound=typing.Any)  # fails
S = typing.TypeVar('S', typing.Any, typing._SpecialForm)  # ok
T = typing.TypeVar('T')

@typing.overload
def check(val: type[T]) -> T: ...
@typing.overload
def check(val: S) -> S: ...
def check(val: type[T] | S) -> T | S: ...  # type: ignore

hint = check(str)  # shows: str
hint = check(typing.Optional[str])  # shows: str | None
hint = check(typing.Union[str, int, bool])  # shows: str | int | bool

That is, the way to reference them without error in typevar is using Any, _SpecialForm as arguments

import typing

Any_Or_SpecialFormT = typing.TypeVar('Any_Or_SpecialFormT', typing.Any, typing._SpecialForm)  # type: ignore  # private attr

KholdStare commented 6 months ago

I'm running into the same issues as everyone else relating to Type[T] -> T. I'm in support of the TypeForm proposal. Any updates @sg495 @davidfstr ? 😄

davidfstr commented 6 months ago

@KholdStare Both sg495 (on Jan 17) and mdrissi (on Feb 1) have indicated an interest in moving the TypeForm proposal forward. Presumably they will update this thread when there are further updates.

mikeshardmind commented 6 months ago

Copied over (not verbatim) from a discourse thread:

TypeForm[TF, *Parameters], where TF must be the type form itself (or Any to indicate handling any type form) and Parameters must be what would be the type arguments to that form, would be a significant improvement, as it allows granular handling of type forms.

An example of this that would handle a Union (at least if Union[*Ts] also becomes allowed):

@overload
def try_parse_as_value(typ: TypeForm[Union, *Ts], user_input: str) -> Union[*Ts]:
    ...
@overload
def try_parse_as_value(typ: type[T], user_input: str) -> T:
    ...

adriangb commented 6 months ago

Would you envision TypeForm[Union | Annotated, *Ts] work? That would allow doing some sort of match Union: ...

adriangb commented 6 months ago

@sg495 and @mdrissi are you still working on this proposal? Is there any way to follow the progress or assist with it? Thanks!

mikeshardmind commented 6 months ago

Would you envision TypeForm[Union | Annotated, *Ts] work?

No, I'd expect this to need to be expressed with overloads as I don't think most functions operating on these operate only on type forms, but on types as well, and even varying type forms have differing ways you would handle them.

There's also a bit more special-ness in having to define a Union between unparameterized type forms and give it meaning to allow this, and given the actual way in which functions that use this likely need differing handling for differing forms (even if inlined with match-case), I think I'd expect a "function shape" like the below.


@overload
def try_value_as_type(typ: TypeForm[Annotated, T, *Ts], value: Any) -> T:
    ...

@overload
def try_value_as_type(typ: TypeForm[Union, *Ts], value: Any) -> Union[*Ts]:
    ...

@overload
def try_value_as_type(typ: type[T], value: Any) -> T:
    ...

def try_value_as_type(typ: Any, value: Any) -> Any:
    ...
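
The overloads above only describe the static signatures. A possible runtime body is sketched below, assuming dispatch on typing.get_origin; the name try_value_as_type_sketch is hypothetical, every parameter is typed `Any` because TypeForm does not exist yet, and raising TypeError on a mismatch is an arbitrary choice for the sketch.

```python
from typing import Annotated, Any, Union, get_args, get_origin


def try_value_as_type_sketch(typ: Any, value: Any) -> Any:
    origin = get_origin(typ)
    if origin is Annotated:
        # get_args() yields (underlying type, *metadata); recurse on the type.
        inner, *_metadata = get_args(typ)
        return try_value_as_type_sketch(inner, value)
    if origin is Union:
        # Try each member in order and return on the first match.
        for member in get_args(typ):
            try:
                return try_value_as_type_sketch(member, value)
            except TypeError:
                continue
        raise TypeError(f"{value!r} matches no member of {typ!r}")
    if isinstance(typ, type):
        if isinstance(value, typ):
            return value
        raise TypeError(f"{value!r} is not an instance of {typ!r}")
    raise TypeError(f"form not handled by this sketch: {typ!r}")
```

Each branch corresponds to one overload, which is why differing forms end up needing differing handling even inside a single implementation.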

adriangb commented 6 months ago

Thanks for clarifying. I like the TypeForm[Annotated, T, *Ts] part because the special handling of T is very helpful, especially if it's possible to do something like:

def only_accept_ints[T: int](typ: TypeForm[Annotated, T, *Ts]) -> T:
    ...

I assume in these cases Ts could be implicit, that is, there's no need to declare it and type checkers would default to Any for Annotated or a more specialized value depending on the parameters the type form accepts (e.g. type[Any] for Union?).

mikeshardmind commented 6 months ago

I assume in these cases Ts could be implicit, that is, there's no need to declare it and type checkers would default to Any for Annotated or a more specialized value depending on the parameters the type form accepts (e.g. type[Any] for Union?).

Probably not, and parametrized functions haven't been accepted (yet? there's an open discussion and PEP draft), but if we assume that they either will be, or that if they aren't, people will still write the necessary equivalent code, the overall idea is solid, and allows expression of more specific validations (using msgspec.Meta as an example, as I'm familiar with it):

def validate_int_annotations[T: int, *Ts](typ: TypeForm[Annotated, T, *Ts], value: T) -> T:
    # get_args() on an Annotated form yields (underlying type, *metadata)
    _type, *annotations = get_args(typ)
    # ignoring _type, we know it statically to be int or a subtype of int, and matching the value
    for annotation in annotations:
        if isinstance(annotation, msgspec.Meta):
            # validate the specific int is conformant to the constraints
            # expressed inside, such as Meta.gt (greater than)
            # raise if validation fails
            ...
    return value
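
A runnable variation of this idea, without the proposed syntax, might look like the following. It is a sketch under stated assumptions: the Gt class is a hypothetical stand-in for a constraint object such as msgspec.Meta(gt=...), the function name is made up, and `typ` is annotated as `Any` because TypeForm is not available.

```python
from dataclasses import dataclass
from typing import Annotated, Any, get_args


@dataclass(frozen=True)
class Gt:
    """Hypothetical 'greater than' constraint carried as Annotated metadata."""
    bound: int


def validate_int_annotations_sketch(typ: Any, value: int) -> int:
    # For Annotated[int, Gt(0)], get_args() returns (int, Gt(0)).
    _type, *annotations = get_args(typ)
    for annotation in annotations:
        if isinstance(annotation, Gt) and not value > annotation.bound:
            raise ValueError(f"{value!r} is not greater than {annotation.bound!r}")
    return value


PositiveInt = Annotated[int, Gt(0)]
```

Calling `validate_int_annotations_sketch(PositiveInt, 5)` returns the value unchanged, while a value at or below the bound raises ValueError.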

Tinche commented 6 months ago

So is the current plan for TypeForm to work only for Annotated and Union? What about protocols, newtypes, literals etc?

Looking at @mikeshardmind 's overload snippet, I'd also expect

def try_value_as_type(typ: TypeForm[T], value: Any) -> T:
    ...

to, you know, cover everything.

mikeshardmind commented 6 months ago

I expect it to work for All type forms. The above were examples of how I would expect that specific formulation to work with overloads for some common cases, but not exhaustive. I would not want to try and define it exhaustively except to say that it should work for all type forms, and that the first parameter to TypeForm should indicate the unparameterized type form being handled, remaining parameters being the inner parameters (if any) to that type form. (avoiding needing to list every form individually in specification)

Edit: This has been edited, the sentiment is the same, but I added more detail as replies were coming in

Tinche commented 6 months ago

Sweet. This sounds pretty great.

asford commented 6 months ago

Bit of a drive-by, but it would be very useful for runtime type checking to make sure this PEP clarifies the interaction of TypeForm with generic custom TypeGuards.

As discussed in https://github.com/beartype/beartype/issues/255, with the typing clarifications in https://github.com/python/typing-council/issues/18 it's now (probably?) impossible to describe runtime TypeGuards that can accommodate Annotated and other TypeForm like types.

I believe this proposal should/would allow spelling generic, tolerant typeguards as:

def generic_typeguard(value: Any, hint: TypeForm[T]) -> TypeGuard[T]:
    ...

with the declaration that these implementations provide standard type narrowing semantics in the positive case, but permit no narrowing in the negative case.

That is, a check may perform runtime introspection of Annotated metadata and use that metadata, as long as the check only returns True if there is valid narrowing to T. It is permissible to perform stricter value-based checking using Annotated metadata and return False even if it is possible to narrow to T. In that case, as per the standard TypeGuard semantics, the static checker will not narrow.
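
As a rough sketch of these semantics, the check below returns True only when the value is an instance of the underlying type and every callable found in the Annotated metadata accepts it. The names strict_check, is_positive and PositiveInt are hypothetical, treating callables in the metadata as predicates is an assumption of this sketch, and the return type is written as bool because the TypeGuard[T] form needs TypeForm[T] to bind T.

```python
from typing import Annotated, Any, get_args, get_origin


def is_positive(value: int) -> bool:
    return value > 0


PositiveInt = Annotated[int, is_positive]


def strict_check(value: Any, hint: Any) -> bool:
    # With the proposal this would read:
    #   (value: Any, hint: TypeForm[T]) -> TypeGuard[T]
    if get_origin(hint) is Annotated:
        base, *metadata = get_args(hint)
        if not isinstance(value, base):
            return False
        # Stricter than narrowing requires: metadata predicates must pass too.
        return all(check(value) for check in metadata if callable(check))
    return isinstance(value, hint)
```

Here `strict_check(-5, PositiveInt)` is False even though -5 is an int, which is the negative case where a static checker must not conclude anything about the value's type.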

EDIT

This is reasonably well spelled out in @davidfstr's linked draft: https://docs.google.com/document/d/18UF8V00EVU1-h-BtiVFhXoJkvfL4rHp4ORaenMQL-Zo/edit

davidfstr commented 5 months ago

I will pick up the TypeForm PEP again in earnest after releasing the next version of trycast, in 2-3 weeks. There appears to be a critical mass of interest.

The Motivation section of the PEP needs updating for 2024. Libraries I know about that would benefit from TypeForm include:

isinstance operations: (object, TypeForm[T]) -> TypeGuard[T] or TypeIs[T]
- beartype.is_bearable - Discussion link
  - Wants TypeForm to be able to match Annotated as well so that it can be conveyed to a TypeGuard[T] (or TypeIs[T])
- trycast.isassignable - Roadmap intends to use TypeForm when available
- typeguard.check_type
- ✚ xdsl.isa
cast & converter operations: (object, TypeForm[T]) -> T
- pydantic.TypeAdapter(T).validate_python
  - Wants the ability to instantiate a generic class (TypeAdapter[T]) where T comes from a TypeForm[T].
- trycast.trycast - Roadmap intends to use TypeForm when available
- trycast.checkcast - Roadmap intends to use TypeForm when available
- cattrs.BaseConverter.structure -- CC @Tinche, ✚ "needs this badly"
- typedload.load -- CC @ltworf
- ✚ svcs.svcs_from(...).get(...)
  - ✚ Wants to take *args of type Tuple[TypeForm[T], ...] and return a Tuple[T, ...], where the corresponding entries match up. Workaround possible with overloads for tuple lengths up to N (ex: N=7).
- ✚ mashumaro.JSONDecoder[T].{encode,decode}
  - Similar to pydantic, wants to instantiate a generic class with a TypeForm.
Introspection operations: (TypeForm[T]) -> ...
- typing_inspect.{is_generic_type, is_callable_type, ...} -- CC @ilevkivskyi
- ✚ typing.get_origin, typing.get_args
  - However may want to accept object instead of TypeForm, since it already accepts non-types at runtime
Other/unknown operations
- ✚ openapify
  - ✚ Wants to store a TypeForm in a field, for unknown use later
- ✚ dataclasses.make_dataclass, «the equivalent function» from attrs
  - ✚ Wants to store a TypeForm in a field, for documentation purposes only
    - With two exceptions described below, nothing in @dataclass examines the type specified in the variable annotation.
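
The overload workaround mentioned for the svcs case can be sketched as follows. This is not svcs's API: get_all and _REGISTRY are hypothetical, the overloads stop at three arguments for brevity, and Type[T] is used because that is what can be spelled today (so special forms are still not accepted).

```python
from typing import Any, Dict, Tuple, Type, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")
T3 = TypeVar("T3")

# Hypothetical registry mapping a class to a ready-made instance.
_REGISTRY: Dict[type, Any] = {int: 42, str: "hello", float: 1.5}


@overload
def get_all(t1: Type[T1], /) -> Tuple[T1]: ...
@overload
def get_all(t1: Type[T1], t2: Type[T2], /) -> Tuple[T1, T2]: ...
@overload
def get_all(t1: Type[T1], t2: Type[T2], t3: Type[T3], /) -> Tuple[T1, T2, T3]: ...
def get_all(*types: type) -> Tuple[Any, ...]:
    # One overload per arity keeps the result entries aligned with the inputs.
    return tuple(_REGISTRY[t] for t in types)
```

Each additional arity needs its own overload, which is why the workaround only scales to some fixed N.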

If you maintain/know a library that would benefit from TypeForm in some other way, please chime in.

Especially useful are links to threads where someone is trying to use a Type[T] to match something like Union[...] or Optional[...] which doesn't work for Type[T] but would work for TypeForm.

Edit: ✚ = Marks any feedback integrated from response-comments below, up to this comment

Tinche commented 5 months ago

@davidfstr cattrs needs this badly, yeah. @hynek's https://github.com/hynek/svcs library ran into this issue too.

Fatal1ty commented 5 months ago

@davidfstr my two cents:

mashumaro uses Any in many places because of this issue
openapify expects from the user type annotations which are currently defined as TypeAnnotation: TypeAlias = Any

ltworf commented 5 months ago

typedload has the issue as well that it is impossible to express what the load function will return (an instance of the passed type)

TeamSpen210 commented 5 months ago

From https://github.com/python/typeshed/issues/11653, dataclasses.make_dataclass() would need TypeForm, as well as the attrs equivalent. The same would apply to TypedDict and namedtuple. All these are special cased anyway so it probably wouldn't have a big impact, but it does show a use case in the standard library.

JelleZijlstra commented 5 months ago

Same goes for typing.get_origin, typing.get_args, etc.

superlopuh commented 5 months ago

@davidfstr I work on a compiler in Python that embeds constraints on the values in the IR into the Python type system. Some of these constraints are generic, representing nested constraints, for which we wrote a function that is similar to isinstance that verifies these: isa.

davidfstr commented 5 months ago

Next topic: Naming the concept of "the type of a type annotation object":

TypeForm has been the placeholder name since Dec 2020. (Before that there was briefly TypeAnnotation.)
AnnotationType however was proposed in later discussion and was generally liked as an alternative.

I also like AnnotationType and am leaning toward that as the name:

It contains the word "annotation".
It aligns with the names of a few other special types like EllipsisType, FunctionType, NoneType, etc.
It aligns with the new definition of "annotation expression" which I believe is the equivalent concept in the big typing spec.
It is less jargony than "TypeForm", which primarily takes its name from a "typing special form", which is not a concept most folks have their minds wrapped around.

Comments? Support? Objections?

adriangb commented 5 months ago

I like AnnotationType more 😀

mikeshardmind commented 5 months ago

I don't like AnnotationType as a name. There's a very important distinction between Annotations and Annotation Expressions that has led to confusions in the past, and even putting that aside for a moment, neither all Annotations nor all Annotation Expressions are valid values described by this special form.

This specifically describes typing special forms and not annotations as a whole. I don't think TypeForm being "more jargony" is a large enough detraction to have a name that is actively more misleading about what it describes instead.

erictraut commented 5 months ago

Could you please move this discussion to the typing forum? This isn't a mypy-specific feature. It's a proposed change to the typing spec, so it deserves the visibility and broader input from the community.

davidfstr commented 5 months ago

Could you please move this discussion to the typing forum? [...] it deserves the visibility and broader input from the community.

Sure. I'll make a new thread there in the next few days.

davidfstr commented 5 months ago

In preparation for moving this discussion to the typing forum, I'm currently drafting a new (2024) version of the TypeForm PEP, incorporating various feedback. Hoping to be done later this week.

davidfstr commented 5 months ago

The 2024 version of the TypeForm PEP is ready for review. Please see the thread in the Typing forum.

davidfstr commented 4 months ago

Draft 2 of the TypeForm PEP (2024 edition) is ready for review. Please leave your comments either in that thread or as inline comments in the linked Google Doc.

I especially solicit feedback from maintainers of runtime type checkers:

attrs: @hynek
beartype: @leycec
cattrs: @Tinche
pydantic: @samuelcolvin, @sydney-runkle
trycast: (me)
typeguard: @agronholm

Please see §"Abstract", §"Motivation", and §"Common kinds of functions that would benefit from TypeForm" in the PEP to see how the TypeForm feature relates to specific functions in the library you maintain.

hynek commented 4 months ago

attrs doesn't do anything with types except copying them around (type-checking logic is entirely via a Mypy plugin and/or dataclass transforms), so I don't have feedback. But as you mention in the PEP, my less-known child svcs would benefit! But it seems to be more of a trivial byproduct of a bigger thing that I have to admit I don't fully understand. :)

leycec commented 4 months ago

...heh. @beartype and typeguard are the ultimate consumers of this PEP. If anyone cares, we care. Oh, how we care! Actually, our users care even more than we do. Our users deeply care so much they repeatedly ~~prod us with pain sticks~~ inspire us with feature requests until we finally do something about this. Users that deeply cared include:

@alexander-c-b, @patrick-kidger, @rsokl, and @skeggse all ~~prodded me with pain sticks~~ inspired me with feature requests for literally years that felt like an eternity in purgatory. They begged me to augment the beartype.door.is_bearable() tester into a full-blown typing.TypeGuard[T]. I thought it was impossible. They asked too much! Their increasingly despondent pleas fell on deaf ears – until at long last the Hero of Light emerged from the cavernous darkness of GitHub prophecy. And his username was...
@asford, who authored a full-frontal funny essay on this exact topic a month ago. Over the course of many, many paragraphs at which I chuckled drolly, @asford discovered how to augment the beartype.door.is_bearable() tester into a full-blown typing.TypeGuard[T]. How? By profanely combining typing.TypeGuard[T] + @typing.overload. Is it an utmost evil? It works. Thus, it can only be good:

# Note that this PEP 484- and 647-compliant API is entirely the brain child of
# @asford (Alex Ford). If this breaks, redirect all ~~vengeance~~ enquiries to:
#     https://github.com/asford
@overload
def is_bearable(
    obj: object, hint: Type[T], *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeGuard[T]:
    '''
    :pep:`647`-compliant type guard conditionally narrowing the passed object to
    the passed type hint *only* when this hint is actually a valid **type**
    (i.e., subclass of the builtin :class:`type` superclass).
    '''

@overload
def is_bearable(
    obj: T, hint: Any, *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeGuard[T]:
    '''
    :pep:`647`-compliant fallback preserving (rather than narrowing) the type of
    the passed object when this hint is *not* a valid type (e.g., the
    :pep:`586`-compliant ``typing.Literal['totally', 'not', 'a', 'type']``,
    which is clearly *not* a type).
    '''

This behaves itself under all Python versions – even Python 3.8 and 3.9, which lack typing.TypeGuard. How? By abusing typing.TYPE_CHECKING, of course. Is there anything typing.TYPE_CHECKING can't solve? Behold:

# Portably import the PEP 647-compliant "typing.TypeGuard" type hint factory
# first introduced by Python >= 3.10, regardless of the current version of
# Python and regardless of whether this submodule is currently being subject to
# static type-checking or not. Praise be to MIT ML guru and stunning Hypothesis
# maintainer @rsokl (Ryan Soklaski) for this brilliant circumvention. \o/
#
# Usage of this factory is a high priority. Hinting the return of the
# is_bearable() tester with a type guard created by this factory effectively
# coerces that tester into an arbitrarily complete type narrower and thus type
# parser at static analysis time, substantially reducing complaints from static
# type-checkers in end user code deferring to that tester.
#
# If this submodule is currently being statically type-checked (e.g., mypy),
# intentionally import from the third-party "typing_extensions" module rather
# than the standard "typing" module. Why? Because doing so eliminates Python
# version complaints from static type-checkers (e.g., mypy, pyright). Static
# type-checkers could care less whether "typing_extensions" is actually
# installed or not; they only care that "typing_extensions" unconditionally
# defines this type factory across all Python versions, whereas "typing" only
# conditionally defines this type factory under Python >= 3.10. *facepalm*
if TYPE_CHECKING:
    from typing_extensions import TypeGuard as TypeGuard
# Else, this submodule is currently being imported at runtime by Python. In this
# case, dynamically import this factory from whichever of the standard "typing"
# module *OR* the third-party "typing_extensions" module declares this factory,
# falling back to the builtin "bool" type if none do.
else:
    TypeGuard = import_typing_attr_or_fallback(
        'TypeGuard', TypeHintTypeFactory(bool))

I only vaguely understand what's happening there. If I understand correctly, acceptance of this PEP would enable @beartype to (A) dramatically simplify the above logic (e.g., by eliminating the need for @typing.overload entirely) and (B) dramatically enhance the utility of the is_bearable() tester by generalizing that tester to narrow arbitrary type hints.

After the release of Python 3.13.0, @beartype and all things like @beartype should now (A) globally replace all references to typing.TypeGuard with typing.TypeIs, which is strictly superior for all practical intents and purposes (praise Jelle Zijlstra) and (B) refactor the signatures of things like is_bearable() to now resemble:

def is_bearable(
    obj: object, hint: TypeForm[T], *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeIs[T]:

David Foster be praised! I rejoice at this. The @beartype codebase will once again become readable. Well... more readable. Also, users are now weeping tears of joy at this. Type narrowing will start doing something useful for once. Yet questions remain.

The Demon Is In the Nomenclature: Name Haters Gonna Hate

The 100-pound emaciated gorilla in the room is actually your own open issue, @davidfstr:

I also added one Open Issue, whether the name “TypeForm” is best one to use

...heh. My answer is: "It's really not." I have no capitalist skin in this game. I barely know what a Typeform is. Yet, googling "python typeform" trivially yields nearly half-a-million hits. Googling "typeform" itself yields an astonishing 24 million hits – none of which have anything to do with typing systems and everything to do with TypeForm, the wildly successful tech startup I only marginally understand. Their search engine optimization (SEO) would probably frown and get a crinkled forehead if we trampled all over their heavily monetized brand space.

Out of sheer courtesy to Typeform, Typeform clients, and my rapidly shrinking 401k plan (...heh), it's probably best that the CPython standard library not trample American capitalism. Leave that to the evening news.

Even if Typeform wasn't a thing, TypeForm still wouldn't necessarily be the best name. None of us associate type hints or annotations with "forms." Yeah, sure; it's an internal private implementation detail of the standard typing module that various public type hint factories leverage a private typing._SpecialForm superclass. Nobody's supposed to know about that, though. More importantly, everybody already cognitively associates "forms" with HTML- and JavaScript-driven web forms. When some dude wearing a pinstriped suit forces me to "...just fill out that friggin' TPES form, already!", I don't tend to think about type hints or annotations.

Maybe I should. Now I will. Great. Thanks a lot, @davidfstr. My cluttered mind now has even more material baggage to lug.

Oh, I Know. I Know! I've Got It. You're Just Gonna Love It. It's...

typing.TypeHint. 🥳

...heh. Who didn't see that one coming, huh? Seriously. typing.TypeHint. You know this is the name. You knew five paragraphs ago when I started rambling incoherently about American capitalism that it was all ramping up to this big climactic finale.

typing.TypeHint. 🥳 🥳

Likewise, let's consider globally replacing all usages of the corresponding term "form" throughout the PEP with "hint": e.g.,

# Instead of this, which makes my cross eyes squint even more than normal...
def isassignable[T](value: object, form: TypeForm[T]) -> TypeIs[T]: ...

# Let's do this! My wan and pale facial cheeks are now smiling.
def isassignable[T](value: object, hint: TypeHint[T]) -> TypeIs[T]: ...

Assignability Raisers and Sorters: So Those Are Things Too Now, Huh?

So. It comes to this. In the parlance of this PEP, the aforementioned is_bearable() tester is an "assignability tester." Cool. That's cool. But the beartype.door subpackage does a lot more than just testing assignability. beartype.door offers a joyous medley of general-purpose functions that operate on arbitrary type hints – including:

die_if_unbearable(). It's a lot like is_bearable(). Whereas is_bearable() returns a bool describing whether the passed value satisfies the passed type hint, die_if_unbearable() either:
- If that value satisfies that type hint, does nothing (i.e., returns None, silently reduces to a noop).
- If that value violates that type hint, raises a human-readable exception describing why.
is_subhint(). Now this is a cool one that has nothing to do with either die_if_unbearable() or is_bearable(). Yet, it'd be wonderful if static type-checkers and competing runtime type-checkers alike supported something similar. Basically, is_subhint() defines a partial ordering over the set of all type hints. Specifically, is_subhint() returns a bool describing whether the first passed type hint is a subhint of the second passed type hint – where "subhint" is defined as:
- These two hints are commensurable (i.e., convey broadly similar semantics enabling these two hints to be reasonably compared). For example:
  - collections.abc.Iterable[str] and collections.abc.Sequence[int] are commensurable. These two hints both convey container semantics. Despite their differing child hints, these two hints are broadly similar enough to be reasonably comparable.
  - collections.abc.Iterable[str] and collections.abc.Callable[[], int] are incommensurable. Whereas the first hint conveys a container semantic, the second hint conveys a callable semantic. Since these two semantics are unrelated, these two hints are dissimilar enough to not be reasonably comparable.
- The first hint is semantically equivalent to or narrower than the second hint. Formally:
  - The total number of all possible objects matched by the first hint is less than or equal to that matched by the second hint.
  - The size of the countably infinite set of all possible objects matched by the first hint is less than or equal to that of those matched by the second hint.

In the same way that is_bearable() can be broadly thought of as a generalization of the isinstance() builtin, is_subhint() can be broadly thought of as a generalization of the issubclass() builtin. Examples or it only happened in the DMT hyperspace:

>>> from beartype.door import is_subhint

# Test simple subclass relations.
>>> is_subhint(bool, int)
True
>>> is_subhint(int, int)
True
>>> is_subhint(str, int)
False

# Test less simple type hint relations.
>>> from typing import Any
>>> is_subhint(list, Any)
True

# Test brutally hurtful type hint relations that make me squint. My eyes!
>>> from collections.abc import Callable, Sequence
>>> is_subhint(Callable[[], list], Callable[..., Sequence[Any]])
True
>>> is_subhint(Callable[[], list], Callable[..., Sequence[int]])
False

Is a partial ordering over the set of all types actually useful, though? I mean, sure. It's cool. We get that. Everything's cool if you squint enough at it. But does anyone care?

...heh. Yeah. It turns out a partial ordering over the set of all types unlocks the keys to the Kingdom of QA – including efficient runtime multiple dispatch in O(1) time. Think @typing.overload that actually does something useful. Does anyone want Julia without actually having to use Julia? Praise be to @wesselb.
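To make the dispatch claim concrete, here is a minimal sketch under loud assumptions: the Dispatcher class below is hypothetical, it orders signatures with the issubclass() builtin as a stand-in for is_subhint(), and it is not the approach of any particular library. It picks the registered signature that is narrower than every other matching one, then caches the result per tuple of argument types so repeat calls are a single dict lookup.

```python
from typing import Any, Callable


class Dispatcher:
    """Toy multiple dispatcher (hypothetical; issubclass() stands in for is_subhint())."""

    def __init__(self) -> None:
        self._impls: list[tuple[tuple[type, ...], Callable[..., Any]]] = []
        self._cache: dict[tuple[type, ...], Callable[..., Any]] = {}

    def register(self, *sig: type) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self._impls.append((sig, func))
            self._cache.clear()  # Earlier resolutions may no longer be the most specific.
            return func
        return decorator

    def __call__(self, *args: Any) -> Any:
        key = tuple(type(arg) for arg in args)
        impl = self._cache.get(key)
        if impl is None:
            impl = self._cache[key] = self._resolve(key)
        return impl(*args)

    def _resolve(self, key: tuple[type, ...]) -> Callable[..., Any]:
        def narrower(a: tuple[type, ...], b: tuple[type, ...]) -> bool:
            return len(a) == len(b) and all(issubclass(x, y) for x, y in zip(a, b))

        candidates = [(sig, func) for sig, func in self._impls if narrower(key, sig)]
        for sig, func in candidates:
            # The winner is narrower than (or equal to) every other candidate.
            if all(narrower(sig, other) for other, _ in candidates):
                return func
        raise TypeError(f"no unambiguous implementation for {key!r}")


describe = Dispatcher()


@describe.register(object)
def _describe_object(value: object) -> str:
    return "object"


@describe.register(int)
def _describe_int(value: int) -> str:
    return "int"
```

Calling describe(True) resolves to the int implementation because bool is narrower than int, which is narrower than object. A real is_subhint()-based dispatcher would compare full type hints rather than plain classes.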

In the case of die_if_unbearable(), integration between @beartype and static type-checkers via TypeHint would inform static type-checkers that the passed value is now guaranteed to satisfy the passed type hint. No intervening if conditionals are required: e.g.,

from beartype.door import die_if_unbearable

# Define something heinous dynamically. Static type-checkers no longer have any idea what's happening.
exec('muh_list = ["kk", "cray-cray", "hey, hey", "wut is going on with this list!?"]')

# Beartype informs static type-checkers that the type of "muh_list" is "list[str]".
die_if_unbearable(muh_list, list[str])

# Static type-checkers be like: "Uhh... I... I guess, bro. I guess. Seems wack. But you do you."
print(''.join(muh_item for muh_item in muh_list))  # <-- totally fine! accept this madness, static type-checker

"TypeForm" Values Section: Not Sure What's Going On Here, But Now Squinting

The "TypeForm" Values section makes me squint. From @beartype's general-purpose broad-minded laissez faire "anything goes" perspective, anything that is a type hint should ideally be a TypeHint.

This includes type hints that are only contextually valid in various syntactic and/or semantic contexts – like Final[...], InitVar[...], Never, NoReturn, Self, TypeIs[...] and so on. Ultimately, the line between whether a type hint is globally valid or only contextually valid is incredibly thin. From @beartype's permissive perspective, for example, Final[...], InitVar[...], and Self type hints are all syntactically and semantically valid anywhere within the body of a class – which covers most real-world code, because most real-world code is object-oriented and thus resides within the body of a class. You're probably thinking: "Wait. What? How is InitVar[...] valid as the parameter of a method?" Look. It's complicated. Just know that subtle runtime interactions between @beartype and @dataclasses.dataclass require @beartype to look the other way while @dataclasses.dataclass rummages around in dunder methods like __init__() behind everyone's backs.

The PEP currently rejects these sorts of type hints as "annotated expressions" – which is itself really weird, because we already have annotated type hints that are technically Python expressions and thus "annotated expressions": typing.Annotated[...]. When you say "annotated," I think: "Surely you speak of typing.Annotated[...], good Sir!" I do not think: "Surely you speak of arbitrary type hints that are only contextually valid in various syntactic and/or semantic contexts, less good Sir!"

The problem with rejecting some but not all type hints is:

Why? Why bother? No justification is given. If @beartype is fine with literally all type hints, everybody is fine. Unless everybody hates @beartype. Then the @leycec emoji sobs. :sob:
The PEP doesn't even enumerate the set of all rejected type hints. It's way more than just Self, TypeGuard[...], and TypeIs[...].
Even enumerating the set of all rejected type hints is pointless, because the set of all rejected type hints explodes exponentially with each subsequent Python release. Today, it's ten type hint factories or whatevah. In ten years, it's ten thousand type hint factories after the inevitable release of Python 3.923480713084723407.0 introduces an earth-shattering gamut of new type hint factories that are only contextually valid in PEP 526-compliant annotated variable assignments. What about those, huh!? Those poor guys. More sobbing can be heard. :sob:

Stringified TypeForms Section: NO GODS WHY NOOOOOOOOOOOOOOOOOOOOO

A type-form value may itself be a string literal that spells a forward reference:

...heh. So. It comes to this. You're trying to commit @leycec to a sanitarium. The truth is now revealed. Please. Let's all be clear on this:

Any attempt to declare strings as type hints should be soundly rejected as insane.

Static type-checkers don't care about stringified type hints, of course. But static type-checkers also hallucinate. By definition, their opinions are already insane.

Runtime type-checkers, however, basically cannot cope with stringified type hints – like, any stringified type hints. In the general case, doing so requires non-portable call stack inspection. It's slow. It's fragile. It's non-portable. It basically never works right. Even when it works "right," it never works the way users expect.
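A minimal sketch of the fragility described above, assuming CPython (sys._getframe() is an implementation detail) and a hypothetical helper named resolve_stringified_hint(). This is an illustration only, not how @beartype or any other runtime type-checker actually resolves stringified hints.

```python
import sys
from typing import Any


def resolve_stringified_hint(hint: str, depth: int = 1) -> Any:
    """Evaluate a stringified hint in the scope of the caller `depth` frames up.

    Hypothetical helper: relies on sys._getframe(), which is CPython-specific.
    """
    frame = sys._getframe(depth)
    try:
        return eval(hint, frame.f_globals, frame.f_locals)
    finally:
        del frame  # Avoid keeping the caller's frame alive through a cycle.


def direct_caller() -> Any:
    class LocalOnly:  # Visible only in this function's local scope.
        pass

    # Works: the helper is called from the very scope that defines LocalOnly.
    return resolve_stringified_hint("list[LocalOnly]")


def indirect_caller() -> Any:
    class LocalOnly:
        pass

    def wrapper() -> Any:
        # Breaks: one frame up is now wrapper(), where LocalOnly is not a local.
        return resolve_stringified_hint("list[LocalOnly]")

    return wrapper()
```

direct_caller() returns a list[...] alias, while indirect_caller() raises NameError although the hint string is identical. Adding a single wrapper function between the hint and the resolver is enough to change the outcome.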

Sure. @beartype copes with stringified type hints – mostly. But @beartype is also insane. @beartype has already squandered years of sweaty blood, smelly sweat, unpaid man hours, and precious life force attempting to support insane shenanigans like PEP 563 (i.e., from __future__ import annotations) and PEP 695 (e.g., type YouTooShallKnowOurPain = 'wut_u_say' | 'die_beartype_die'). Everybody else in the runtime type-checking space just gave up and didn't even bother trying.

In 2024, with the benefit of hindsight and safety goggles, let us all quietly admit that stringified type hints were a shambolic zombie plague that should have never happened. We certainly shouldn't be expanding the size and scope of stringified type hints. We should be deprecating, obsoleting, and slowly backing away from stringified type hints with our arms placatingly raised up in the air as we repeatedly say: "We're sorry! We're sorry for what we did to you, Pydantic and @beartype and typeguard! We didn't know... Gods. We didn't know. The horrors your codebase must have seen. Please accept this hush money as compensation."

There's no demonstrable reason whatsoever to permit useless insanity like IntTreeRef: TypeForm = 'IntTree' # OK. NO, NOT OK. Absolutely not OK. If this comment has one and only one takeaway, let it be this:

Pythonistas don't let Pythonistas stringify type hints. Not even once. — thus spake @leycec

@leycec: He Is Now Tired and Must Now Collapse onto a Bed Full of Cats

Tinche commented 4 months ago

LGTM. Excited about this!

agronholm commented 4 months ago

Yeah, just today I ran into a problem that would probably be solved with this. And it wasn't even related to typeguard. The PEP looks good at a glance, but I'll give it a proper read-through later.

patrick-kidger commented 4 months ago

A broad +1 to all of @leycec's points.

I don't love how this is proposing to create a three-tier hierarchy of 'types' and 'types forms' and 'type hints'. The existing split is complicated enough! Better to call this TypeHint and accept anything. Even Self and Final etc. have to be handled by runtime type checkers.
Definitely let's not accept stringified annotations. Every runtime type-checker I write either loudly fails or silently turns those into Any, because they're basically impossible to handle well at runtime. (And once PEP649 is implemented then hopefully this issue can go away.)

davidfstr commented 4 months ago

Name

Googling "typeform" itself yields [...] TypeForm, the wildly successful tech startup I only marginally understand.

Heh. True. "Typeform" is much better known as a service for online surveys. :)

So, definitely a +1 that there's almost certainly a better name than "TypeForm". However, choosing such a name depends a lot on your next point:

Values

From @beartype's general-purpose broad-minded laissez faire "anything goes" perspective, anything that is a type hint should ideally be a TypeHint.

This includes type hints that are only contextually valid in various syntactic and/or semantic contexts – like Final[...], InitVar[...], Never, NoReturn, Self, TypeIs[...] and so on.

I've debated whether the TypeForm concept should cover all runtime type annotation objects (i.e. what the typing specification calls "annotation expressions") or only those objects which spell a "type" (i.e. what the typing specification calls "type expressions").

Allowing TypeForm[] to match non-types (like InitVar[], Final[], Self, etc) doesn't make sense when trying to combine TypeForm[] with TypeIs[] or TypeGuard[] in a function definition, one of the key capabilities I want to enable. I discuss this further in §"Rejected Ideas > Accept arbitrary annotation expressions". Consider the following code:

# AKA: is_bearable
def isassignable[T](value: object, form: TypeForm[T]) -> TypeIs[T]: ...

request_json = ...
if isassignable(request_json, Final[int]):
    assert_type(request_json, ???)  # Never? int? Certainly not Final[int] because not valid for a variable type.

What should a static type checker infer for the ??? position above? (Pause to consider your own answer here...)

Right now the PEP takes the stance that passing a non-type like Final[] where a TypeForm[] is expected is an error. So ??? would be Any (i.e. the error type).

Surprisingly, I see that is_bearable (an implementation of isassignable from beartype) can return True in the above scenario...

>>> from beartype.door import is_bearable
>>> from typing import *
>>> is_bearable(5, Final[int])
True  # 😳

Appendix: More adventures in beartype

Happily I see Self is rejected outright (👍 ):

>>> is_bearable(5, Self)
beartype.roar.BeartypeDecorHintPep673Exception: Is_bearable() PEP 673 type hint "typing.Self" invalid outside @beartype-decorated class. PEP 673 type hints are valid only inside classes decorated by @beartype.

And ClassVar is unsupported (👌 ):

>>> is_bearable(5, ClassVar[int])
beartype.roar.BeartypeDecorHintPepUnsupportedException: Is_bearable() type hint typing.ClassVar[int] currently unsupported by @beartype.

But InitVar is accepted, surprisingly:

>>> from dataclasses import InitVar
>>> is_bearable(5, InitVar[int])
True  # 😳

Appendix: Similar adventures in trycast

In the trycast library, Final, Self, ClassVar, and InitVar are all unsupported, since they don't make sense when looking at a value in isolation:

>>> from trycast import isassignable
>>> isassignable(5, Final[int])
trycast.TypeNotSupportedError: isassignable does not know how to recognize generic type typing.Final.
>>> isassignable(5, Self)
TypeError: typing.Self cannot be used with isinstance()
>>> isassignable(5, ClassVar[int])
trycast.TypeNotSupportedError: isassignable does not know how to recognize generic type typing.ClassVar.
>>> isassignable(5, InitVar[int])
TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union

(Heh. I especially need to fix that last error message to be something sensible.)

Stringified TypeForms

Runtime type-checkers, however, basically cannot cope with stringified type hints – like, any stringified type hints. In the general case, doing so requires non-portable call stack inspection. It's slow. It's fragile. It's non-portable. It basically never works right. Even when it works "right," it never works the way users expect.

let us all quietly admit that stringified type hints were a shambolic zombie plague that should have never happened. We certainly shouldn't be expanding the size and scope of stringified type hints. We should be deprecating, obsoleting, and slowly backing away from stringified type hints

Agreed that stringified TypeForms - where the entire type is a string, not just some interior forward references - are very difficult to work with at runtime. I allude to this in §"How to Teach This", but I thought I had used stronger language than what I now see: 😉

Stringified type annotations (like 'list[str]') must be parsed (to something like typing.List[str]) to be introspected.

Resolving string-based forward references inside type expressions to actual values must typically be done using eval(), which is difficult/impossible to use in a safe way.

(Note to self: Increase emphasis in the PEP RE how difficult it is to work with stringified type annotations at runtime.)

The current PEP draft defaults to allowing stringified TypeForms since static type checkers already expect & handle them robustly in locations where a type expression can appear. But - upon further thought - anything that can fit into a TypeForm must be capable of being well-supported both by static and runtime type checkers in order to spell an implementable function definition.

So I'm inclined to agree that TypeForms probably shouldn't allow stringified annotations since they're basically impossible to work with robustly at runtime.

Edit: I changed my mind RE not allowing stringified annotations to be matched, to prioritize aligning with matching all "type expressions" (which include them).

patrick-kidger commented 4 months ago

What should a static type checker infer for the ??? position above? (Pause to consider your own answer here...)

I have a possibly-controversial suggestion (that I don't feel too strongly about right now), which is that this isn't defined behaviour. For example, if I fill in the type hint explicitly with pyright (which is what I have installed at the moment), then we get the perfectly meaningless:

from typing import Final, TypeGuard

def foo(x) -> TypeGuard[Final[int]]:
    pass

x = 1
if foo(x):
    reveal_type(x)  # Type of `x` is `Final`.

To expand on this, the proposed TypeForm-that-isn't-TypeHint feels to me a bit like "the type of all positive integers". At some point we make the jump from properties we care to express in the type system to properties we don't. I feel like "the set of all valid parameters T for TypeIs[T]" is probably already substantially more niche than "the type of all positive integers" -- in fact I suspect the latter would see quite a lot more use-cases! -- but we don't implement that.

JelleZijlstra commented 4 months ago

I would argue that the types allowed by TypeForm should exactly match the definition of either "type expression" or "annotation expression" in the spec (https://typing.readthedocs.io/en/latest/spec/annotations.html#type-and-annotation-expressions). This reduces the number of concepts and makes the overall system simpler. Possibly we should add both, which suggests obvious names for the new special forms: AnnotationExpression[T] and TypeExpression[T].

I don't think it is practical to disallow stringified annotations. Consider a type alias from a third party library that is defined as Alias = list[int]. You can use Alias as a TypeForm. If the library now changes to Alias = list["int"], does that mean Alias is no longer valid as a TypeForm? Similarly, if we disallow Self, should we also disallow list[Self]?
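To illustrate the Alias example at runtime, here is a small sketch. The function name contains_stringified_part() is hypothetical; the sketch relies only on typing.get_args() and typing.get_origin(). It shows that a string can hide arbitrarily deep inside an otherwise ordinary alias, so a rule that rejects stringified annotations would have to walk the whole hint.

```python
from typing import Any, ForwardRef, Literal, get_args, get_origin


def contains_stringified_part(hint: Any) -> bool:
    """Return True if `hint` is, or transitively contains, a string forward reference."""
    if isinstance(hint, (str, ForwardRef)):
        return True
    if get_origin(hint) is Literal:
        # Literal["foo"] carries a string value, not a forward reference.
        return False
    if isinstance(hint, (list, tuple)):
        # e.g. the parameter list inside Callable[[...], ...].
        return any(contains_stringified_part(item) for item in hint)
    return any(contains_stringified_part(arg) for arg in get_args(hint))


Alias = list[int]
AliasWithForwardRef = list["int"]
NestedForwardRef = dict[str, list["int"]]
```

Here contains_stringified_part(Alias) is False, while both AliasWithForwardRef and NestedForwardRef yield True, even though all three are spelled as plain subscripted builtins.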

There will always be some types that are hard for a runtime type checker to check. For example, is_assignable(some_generator(), Generator[int, str, float]) would be impossible to fully check without analyzing the bytecode of the generator.
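A short sketch of the generator case, with hypothetical names (mixed_yields(), is_generator_shallow()): a runtime check can cheaply confirm that a value is a generator, but the yield, send, and return types stay unverified until the generator is actually consumed.

```python
from collections.abc import Generator, Iterator
from typing import Any


def mixed_yields() -> Iterator[Any]:
    yield 1
    yield "not an int"


def is_generator_shallow(value: object) -> bool:
    """Check only that `value` is a generator; its type arguments stay unverified."""
    return isinstance(value, Generator)


gen = mixed_yields()
# Passes, although this generator would violate Generator[int, str, float]:
# the second yielded value is a str, which is unknowable without running it.
shallow_ok = is_generator_shallow(gen)
```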

To @patrick-kidger's example, pyright correctly shows an error on the TypeGuard[Final[int]] line, because TypeGuard[...] requires a type expression and Final[int] isn't one. I don't think you can draw a conclusion from pyright's behavior on the rest of an invalid program.

TeamSpen210 commented 4 months ago

To me, it feels like we should tend towards being as loose as possible with what is permitted as a TypeForm. Anything using it is going to have restrictions at runtime, in ways that couldn't possibly be easily expressed. Users are going to have to check documentation/rely on runtime exceptions to know what is allowed, so restricting a few specific cases doesn't help too much?

For Final, ClassVar and InitVar, the rule a static type checker could use is to simply strip them off before evaluating the guard. These in particular I can see uses for, to do things like match specific configurations of, say, a dataclass field object. Maybe for things like Self that don't meaningfully interact with TypeGuard/TypeIs, there should just be a type error at the point where you call such a function with such a variable, and no narrowing occurs.
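A runtime analogue of the "strip them off" rule could look like the sketch below. The helper name strip_qualifiers() is hypothetical, and this is one possible reading of the suggestion rather than anything specified by the PEP or implemented by a type checker.

```python
from dataclasses import InitVar
from typing import Any, ClassVar, Final, get_args, get_origin


def strip_qualifiers(hint: Any) -> Any:
    """Peel Final[...], ClassVar[...], and InitVar[...] wrappers off `hint`."""
    while True:
        if isinstance(hint, InitVar):
            # InitVar[int] is an InitVar instance; the wrapped type lives in `.type`.
            hint = hint.type
        elif get_origin(hint) in (Final, ClassVar):
            hint = get_args(hint)[0]
        else:
            return hint
```

With this, strip_qualifiers(Final[int]) and strip_qualifiers(InitVar[int]) both return int, so a tester could fall back to checking the value against the underlying type expression.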

davidfstr commented 4 months ago

Values

Several folks have recommended not bifurcating the existing concepts of "annotation expressions" and "type expressions" to a further third subset, and to instead just pick one of the first two.

Since the main utility of TypeForm[] is using it in combination with TypeIs[] + TypeGuard[], and because a "type expression" is what those forms accept, I'm inclined to round the concept of TypeForm to exactly match a "type expression".

Name

With the above meaning defined for the concept, I'm looking at renaming TypeForm[] to TypeExpression[].

There may be a desire to define a separate concept that aligns with "annotation expressions", perhaps called AnnotationExpression[], but I don't think it's valuable to define in this PEP. I don't see any benefits in being able to spell AnnotationExpression[] vs just spelling object, as you currently must do.

Stringified TypeForms

By rounding the concept of TypeForm[] to exactly match a "type expression", that would imply that stringified annotations like 'list[int]' would be allowed. Despite being allowed, runtime type checkers cannot handle them reliably at runtime. However, this is not a unique problem: there are a number of type expressions - Generator..., Callable..., stringified annotations - that are particularly hard to work with at runtime already.

Perhaps it would be sufficient in §"How to Teach This":

to mention that it's not expected that runtime type checkers necessarily handle every possible kind of TypeExpression[] input, which is already the status quo, and
to acknowledge certain specific kinds of type expressions which have been difficult to work with.

New idea: Matching TypeExpression[]s with an ABC?

@erictraut has expressed concern that it would be difficult for a static type checker like pyright to match TypeExpression[]s in locations that would normally accept only a regular value expression.

Static type checkers already have to deal with recognizing ABCs, so I wonder if defining a TypeExpression[] as an ABC would make it easier for a static type checker to recognize...

A quick proof of concept:

>>> from abc import ABC
>>> from typing import *
>>> import typing
>>> 
>>> class TypeExpression(ABC):
...     pass
>>> 
>>> type(str)
<class 'type'>
>>> TypeExpression.register(type)
>>> 
>>> type(Union[int, str])
<class 'typing._UnionGenericAlias'>
>>> TypeExpression.register(typing._UnionGenericAlias)
>>> 
>>> isinstance(str, TypeExpression)
True
>>> isinstance(Union[int, str], TypeExpression)
True
