python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

Type mappings for generics #1273

Open jonathanslenders opened 2 years ago

jonathanslenders commented 2 years ago

(Originally proposed at https://github.com/python/mypy/issues/13870, but it turns out this is a better place to discuss it.)

Feature

Assume we have a mapping between types, possibly because there's a method with many overloads that takes an object from the first collection of types, and maps it into an object from the second collection of types.

Then assume we have a generic class with a TypeVar that corresponds to that first type, and a method on this generic class that performs the transformation and produces the corresponding type.

Right now, the outcome can only be typed as a union; it would be much better to statically infer the output type.

TypeLookupTable = TypeMapping('TypeLookupTable', {
    FromType1: ToType1,
    FromType2: ToType2,
    FromType3: ToType3,
    ...
})

T = TypeVar('T', bound=any_type_present_as_a_key_in_that_lookup_table)

@dataclass
class SomeGeneric(Generic[T]):
    data: T

    def func(self) -> TypeLookupTable[T]:
        # This method should produce an instance of the type that corresponds
        # to `T` in the lookup table.
        ...

    def func2(self) -> ReturnTypeForFunctionCall[some_function, [T]]:
        # This method returns an object whose type is inferred by looking at
        # the overload of `some_function` selected when called with input `T`.
        return some_function(self.data)
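
Under this proposal, a type checker would infer, for example (hypothetical semantics for the sketch above):

reveal_type(SomeGeneric(FromType1()).func())  # ToType1
reveal_type(SomeGeneric(FromType2()).func())  # ToType2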

Pitch

I'm building a synchronous abstraction layer on top of an async library. This means every type from the async implementation will correspond to a type from the synchronous abstraction. There is one function, wrap, with tons of overloads that takes an async type and returns the corresponding sync type. (It also takes an anyio.BlockingPortal, but that's not relevant to this issue.) There are many approaches to the implementation, the dictionary mapping being the easiest, because it makes runtime introspection of the mapped types straightforward.

Right now, I can't come up with other examples of situations where this is useful, but I'm sure there are other cases where a generic class calls one of a collection of function overloads and we want to infer the corresponding return type.

Of the two demonstrated approaches (ReturnTypeForFunctionCall and TypeLookupTable), I'm mostly in favor of TypeLookupTable, because a collection of overloads can always be expressed using such a lookup table. Something like:

def function_with_many_overloads[T](data: T) -> TypeLookupTable[T]: ...

(That would expand to as many overloads as there are mapped types in that table.)
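
For the three-entry table above, that expansion would be equivalent to writing (a sketch of the intended semantics, not additional proposed syntax):

@overload
def function_with_many_overloads(data: FromType1) -> ToType1: ...
@overload
def function_with_many_overloads(data: FromType2) -> ToType2: ...
@overload
def function_with_many_overloads(data: FromType3) -> ToType3: ...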

From the mypy issue, I understand that this would be very useful for Numpy too. Possibly related thread: https://mail.python.org/archives/list/typing-sig@python.org/thread/VGBBY63CUV7LTBDIIXDPYK3OWTQTUN3Y/#KL3VLJDJM5WGLBXWUZHOZG5PMGY2MFWQ

erictraut commented 2 years ago

I don't think the suggested approach (using a lookup table) would compose well with existing type features. Keep in mind that static type checkers don't actually execute code. They are designed to evaluate the type of an expression, not the value of an expression. A lookup table is an expression whose value must be understood for it to be of any utility.

So I don't think this proposal would work as suggested. There are potentially other options we could explore if this problem is sufficiently important to solve. My sense is that this is a pretty atypical pattern though, so you would need to make a strong case to convince the typing community that an extension to the Python type system is merited to handle this case.

There is a way to do this with no new type features using constrained TypeVars and overloads. This approach is admittedly a bit cumbersome, but it gets the job done.

from dataclasses import dataclass
from typing import Any, Generic, TypeVar, overload

class FromType1: ...
class FromType2: ...
class FromType3: ...
class ToType1: ...
class ToType2: ...
class ToType3: ...

T = TypeVar("T", FromType1, FromType2, FromType3)

@dataclass
class SomeGeneric(Generic[T]):
    data: T

    @overload
    def func(self: "SomeGeneric[FromType1]") -> ToType1: ...
    @overload
    def func(self: "SomeGeneric[FromType2]") -> ToType2: ...
    @overload
    def func(self: "SomeGeneric[FromType3]") -> ToType3: ...
    def func(self) -> Any: ...

reveal_type(SomeGeneric(FromType1()).func())  # ToType1
reveal_type(SomeGeneric(FromType2()).func())  # ToType2
reveal_type(SomeGeneric(FromType3()).func())  # ToType3

I confirmed this works with mypy and pyright.

betafcc commented 2 years ago

Stumbled on this; feel like I have something to add. Sorry if I misunderstood the use case.

From the feature part of the description, I got the feeling that the use case is well served by just having a protocol, if you have a common transaction type (whether generic or not):

from abc import abstractmethod
from typing import Protocol, Self, TypeVar

A = TypeVar("A")

class Iso(Protocol[A]):
    @classmethod  # classmethod must be the outermost decorator
    @abstractmethod
    def decode(cls, value: A) -> Self: ...
    def encode(self) -> A: ...

class Type1(Iso[A]):
    @classmethod
    def decode(cls, value: A) -> "Type1": ...
    def encode(self) -> A: ...

Think of how you can abstract any sync collection type by using Iterable[A] as a channel.

But if not, the 'overload lookup' concept really looks like just a well-typed functools.singledispatch, which I would definitely love to have official support for as well. @erictraut's solution can be simplified to:

from functools import singledispatch

@singledispatch
def func(arg):
    raise NotImplementedError

@func.register
def _(data: Type1) -> ToType1: ...
@func.register
def _(data: Type2) -> ToType2: ...
...

jonathanslenders commented 2 years ago

The problem with the overloads (or singledispatch) is that you have to repeat them over and over again for every place where they are wrapped:

Imagine we have these overloads:

@overload
def func(a: Type1) -> ToType1: ...

@overload
def func(a: Type2) -> ToType2: ...

Imagine now that we have another function that wraps these overloads in some way (like a decorator). With unions, that becomes:

def func2(a: Type1 | Type2) -> ToType1 | ToType2:
    return func(a)

But if we want type inference to work, it becomes:

@overload
def func2(a: Type1) -> ToType1: ...
@overload
def func2(a: Type2) -> ToType2: ...

Of course, in practice the signature may not be exactly the same: func2 could take an additional input and do something beyond calling func. What matters is that func2 calls func internally, which does the actual mapping.
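
For example, if func2 takes an extra argument (say, the anyio.BlockingPortal from the pitch above), every overload has to repeat it. A sketch:

@overload
def func2(a: Type1, portal: BlockingPortal) -> ToType1: ...
@overload
def func2(a: Type2, portal: BlockingPortal) -> ToType2: ...
def func2(a, portal):
    # ...whatever extra work func2 does...
    return func(a)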

If the type mapping can be statically defined, then it's reusable in every place where it's needed. Maybe something like this is a better notation:

MyMapping = TypeMapping('MyMapping', {
    Type1: ToType1,
    Type2: ToType2,
})
T = TypeVar('T', bound=MyMapping)

def func(a: T) -> MyMapping[T]: ...

def func2(a: T) -> MyMapping[T]: ...

Imagine there are N places where such a generic function is defined, and M types that are mapped. Writing it by hand means typing N×M @overloads. That doesn't scale.

@erictraut: I know that type checkers don't execute code; that doesn't mean this can't be done.

@betafcc: I don't understand how a Protocol could help me in this case. (I'm familiar with the way protocols work, but I don't see it here.) Hopefully the above explanation helps.

@erictraut: It's a bit atypical indeed, but from the mypy issue I understood that Numpy data type conversion rules are one big use case for this. So I don't think it's that far-fetched.

erictraut commented 2 years ago

You've come to the right place to discuss this, as this forum is frequented by experts within the Python typing community — including the maintainers of the major static type checkers, runtime type checkers, and typeshed stubs. (BTW, I'm the main author of pyright, Microsoft's open-source static type checker.)

The proposed solution is problematic for the reasons I mentioned, so it's likely to meet resistance from the Python typing community in its current form. There are likely better solutions to the problem, but before we explore alternatives, I think we'd need to have a deeper understanding of the problem and the family of use cases that are affected by the problem. There would also need to be good evidence that this problem is sufficiently common and widespread to justify a generalized solution in the form of an extension to the Python type system.

Perhaps you can back up and try to describe the problem in a more general manner? You mentioned above that the overload approach is verbose. I agree with that. So is the main problem that it's too verbose and you're looking for an alternative — but more concise — way to express the same thing? Or is the problem that there is currently no way to express something in the type system, verbose or otherwise? Creating redundant ways to do something is usually a bad thing, but it can be justified if the new (more concise) form provides sufficient savings for a sufficient number of developers. It's a high bar to clear though, and it's best if you can back it up with real statistics from existing code bases.

betafcc commented 2 years ago

@jonathanslenders it was just a suspicion I had from the code alone; the from/to function pairs make it look like you have a common transaction type. If that were the case, defining the protocol methods on the types themselves would feel more standard.

For the rest, including your last comment: the 'overload lookup' concept really looks like the well-typed @singledispatch that I've been wanting. The only difference is that the mapping stays in a mappingproxy inside the function instead of a literal dictionary. I would +1 the feature in either syntax; I just want to overload in a terser way.

EDIT: just read your func2 example more carefully; indeed, it gets annoying with either overload or singledispatch, especially given that "ParamSpec cannot be used with overloaded function".


ps: prompt toolkit user myself, just noticed you're the author, awesome

ps2: Not related but was attracted by

I'm building a synchronous abstraction layer on top of an async library. This means every type from the async implementation will correspond to a type from the synchronous abstraction

Had to write some utils recently to abstract over Iterables/AsyncIterables/Awaitables/Observables, for data wrangling that needs to mix async/sync and parallel/linear pipelines while handling retries and exceptions in a sane manner. So I also wrote an overloaded wrap function, though probably with a different approach. But now I'm able to compose data-processing functions in a normal top-level non-async code context and use them in any of the above.

Don't know exactly what you're working on, but "takes an object from the first collection of types, and maps it into an object from the second collection of types" sounds to me like a natural-transformation-of-functors diagram put into words, so who knows, maybe there is something of use here:

Have you considered abstracting over the functor instead? My wrap function is just an fmap and wraps functions instead of types, so I can convert any normal A -> B function into a Collection[A] -> Collection[B] one. Plus, with applicative functors I can do operations like turning list[Awaitable[A]] into Awaitable[list[A]] while running everything in parallel, using the standard FP one-liners (with mixed type inference results, though).
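
A minimal sketch of the fmap-style wrap idea, for just two container types (names and details are illustrative, not the actual utils mentioned above):

from collections.abc import Awaitable, Callable, Iterable
from typing import Any, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")

@overload
def fmap(f: Callable[[A], B], xs: Awaitable[A]) -> Awaitable[B]: ...
@overload
def fmap(f: Callable[[A], B], xs: Iterable[A]) -> Iterable[B]: ...
def fmap(f: Callable[[Any], Any], xs: Any) -> Any:
    # Apply f inside the container, preserving the container's shape.
    if isinstance(xs, Awaitable):
        async def run() -> Any:
            return f(await xs)
        return run()
    return (f(x) for x in xs)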

In any case I feel that higher-kinded typevars are related, both to my approach and to yours. With those it's possible to create a Codec HKT dictionary and keep the type info on application, or to write higher-order FP compositions while keeping type annotations to a minimum.

hmc-cs-mdrissi commented 2 years ago

I'll add some details for numpy-like use cases with a full example. The numpy use case appears in any array library that has data type promotion rules.

There are many array functions that take 2 or more arrays as input, and the resulting array's data type often depends on the data types of the inputs. The easiest case is binary operations. Using a simplified version of numpy's data type rules that looks at only 3 types (float32, float64, int32), we get overloads like:

import numpy as np
from typing import Generic, TypeVar, overload

DType = TypeVar('DType', np.float32, np.float64, np.int32)

class Array(Generic[DType]):
  @overload
  def __add__(self:  Array[np.float32], other: Array[np.int32]) -> Array[np.float32]: ...
  @overload
  def __add__(self:  Array[np.float32], other: Array[np.float32]) -> Array[np.float32]: ...
  @overload
  def __add__(self:  Array[np.float32], other: Array[np.float64]) -> Array[np.float64]: ...
  @overload
  def __add__(self:  Array[np.float64], other: Array[np.int32]) -> Array[np.float64]: ...
  @overload
  def __add__(self:  Array[np.float64], other: Array[np.float32]) -> Array[np.float64]: ...
  @overload
  def __add__(self:  Array[np.float64], other: Array[np.float64]) -> Array[np.float64]: ...
  @overload
  def __add__(self:  Array[np.int32], other: Array[np.int32]) -> Array[np.int32]: ...
  @overload
  def __add__(self:  Array[np.int32], other: Array[np.float32]) -> Array[np.float32]: ...
  @overload
  def __add__(self:  Array[np.int32], other: Array[np.float64]) -> Array[np.float64]: ...

That's for one function. Now what about multiply? Almost exactly the same lines:

DType = TypeVar('DType', np.float32, np.float64, np.int32)

class Array(Generic[DType]):
  @overload
  def __mul__(self:  Array[np.float32], other: Array[np.int32]) -> Array[np.float32]: ...
  @overload
  def __mul__(self:  Array[np.float32], other: Array[np.float32]) -> Array[np.float32]: ...
  @overload
  def __mul__(self:  Array[np.float32], other: Array[np.float64]) -> Array[np.float64]: ...
  ...

Other functions like __radd__, __sub__, etc. have similar overloads. And there are also functions that take additional arguments, where you just need to repeat those arguments in every overload.

The type lookup table equivalent would be something like:

ResultDataType = {
  (np.int32, np.int32): np.int32, (np.int32, np.float32): np.float32, (np.int32, np.float64): np.float64, 
  (np.float32, np.int32): np.float32, (np.float32, np.float32): np.float32, (np.float32, np.float64): np.float64,
  (np.float64, np.int32): np.float64, (np.float64, np.float32): np.float64, (np.float64, np.float64): np.float64
}

DType = TypeVar('DType', np.float32, np.float64, np.int32)
DType2 = TypeVar('DType2', np.float32, np.float64, np.int32)

class Array(Generic[DType]):
  def __add__(self, other: Array[DType2]) ->  ResultDataType[DType, DType2]: ...
  def __radd__(self, other: Array[DType2]) ->  ResultDataType[DType, DType2]: ...
  def __mul__(self, other: Array[DType2]) ->  ResultDataType[DType, DType2]: ...
  def __rmul__(self, other: Array[DType2]) ->  ResultDataType[DType, DType2]: ...
  ...

This would be defined to be equivalent to writing the overloads manually, and is intended only as a way of making families of related overloads easier to read and maintain. It should not introduce any new power to the type system, and replacing the function with the earlier overloads should be considered a valid way to check it.

As a comparison, here is a link to the real overloads that numpy uses. __sub__ and __rsub__ both have 11 overloads that look the same and need to be duplicated/kept in sync manually. __and__, __rand__, __or__, __xor__, etc. each have 5 overloads that appear to be from the same family each time. My guess is that in the numpy/scipy codebases you could collect statistics on the number of overloads and how many follow the family structure. I'm unsure whether pytorch/cupy/etc. handle dtype overloads, but they should have a similar potential use case. Tensorflow is an exception in that it has very simple data type rules (data types usually must match; no conversion allowed).

I don't have a strong preference on the exact syntax. It'd be a nice convenience if overload families could be more readable, but at least this is expressible in the type system today. One solution that doesn't touch the type system is to have a stub template file and a tool that generates the stubs from it.
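
Such a generator could be quite small. A hypothetical sketch (not an existing tool) that expands a promotion table into overload text:

# Hypothetical: expand a dtype promotion table into @overload stub text.
PROMOTE = {
    ("int32", "int32"): "int32",
    ("int32", "float32"): "float32",
    ("float32", "float32"): "float32",
    ("float32", "float64"): "float64",
    # ...remaining pairs of the promotion table
}

def emit_overloads(method: str) -> str:
    lines = []
    for (a, b), result in PROMOTE.items():
        lines.append("    @overload")
        lines.append(
            f"    def {method}(self: Array[np.{a}], "
            f"other: Array[np.{b}]) -> Array[np.{result}]: ..."
        )
    return "\n".join(lines)

print(emit_overloads("__add__"))
print(emit_overloads("__mul__"))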

jonathanslenders commented 1 year ago

@betafcc: Thanks for your reply! I missed it when you posted it. These are really useful insights. I'm not yet that familiar with higher-kinded typevars and functors. It could very well be that this is expressible with HKT; I'll look into that. So maybe that is what we'd need instead.

@hmc-cs-mdrissi : Thanks for chiming in and providing the Numpy examples!

randolf-scholz commented 1 year ago

TypeScript seems to have Mapped Types specifically for this purpose.

matangover commented 1 year ago

TypeScript's Mapped Types don't seem to solve the same problem (they create a new type based on the keys of a given type). But Conditional Types do solve this.

SimpleArt commented 1 year ago

It looks to me like conditional types would not add much here other than more verbose syntax, plus potentially a few cases where you could sneak in an if sys.version_info ... or similar specialized conditions, so I would prefer to have some sort of mapping type.

For the naming, I think it should be typing.TypeMapping instead of typing.MappingType since TypeMapping clearly sounds like it maps types whereas MappingType sounds like type(mapping).

For the syntax, I believe an explicit typing.TypeMapping needs to be used; a literal dictionary would probably lead to confusion. The corresponding type variables should not be constrained to the TypeMapping in any way; instead they should be completely independent.

It is also apparent that this often happens on more than one argument simultaneously, so the TypeMapping should support Dict[type, type] or Dict[tuple[type, ...], type].

Example:

Classes = TypeMapping("Classes", {Literal["A"]: A, Literal["B"]: B, Literal["C"]: C})

# Type variables may be comprised of narrower types.
AB = TypeVar("AB", bound=Literal["A", "B"])

def foo(name: AB) -> Classes[AB]:
    ...

I don't think that allowing different numbers of arguments in a single type mapping makes any sense, i.e. only one of M[T], M[T1, T2], M[T1, T2, T3], etc. should be possible.

Type variables do not necessarily need to be used for indexing mapping types. For example, one might wish to partially apply a type mapping, as in M[T, int].

Along those lines, allowing parameterized type mappings similar to generics would be a nice-to-have:

Overall = TypeMapping("Overall", {(A, B): ...})
Partial = Overall[T, int]  # TypeMapping("Partial", {(A, int): ...})
Partial[bool]  # Overall[bool, int]

Having said all of that, there is another less general syntax that I think some may like:

M = TypeMapping(
    "M",
    name=(Literal["A"], Literal["B"], Literal["C"]),
    cls=(A, B, C),
)

def foo(name: M.name) -> M.cls:
    ...

The advantage is that there's no need to create separate type variables and some level of semantic meaning can be provided (I think it makes sense to say the type variables in a type mapping are often related somehow). The disadvantage is that you lose the generality of the other approach.

SimpleArt commented 1 year ago

One use-case I have had where it was not possible to cleanly emulate the desired behavior was with generic instance variables:

from dataclasses import dataclass
from typing import Generic, TypeMapping, TypeVar  # TypeMapping as proposed above

class A1: ...
class A2: ...
class B1: ...
class B2: ...

DT = TypeMapping("DT", {A1: A2, B1: B2})
T = TypeVar("T", A1, B1)

@dataclass
class Foo(Generic[T]):
    x: DT[T]

The only workarounds that I see are to:

jonathanslenders commented 1 year ago

Adding another, somewhat different use case. Imagine a function for filtering a list of data in a database abstraction layer:

@dataclass
class Person:
    name: str
    age: int

people: list[Person] = [Person("john", 21), ...]

# Query
matching_people = filter(people, key=by_field("name", "john"))

The type of by_field should be something that checks the type of the filter value against the type of the filtered field:

person_field_types = TypeMapping("mapping", {
    "name": str,
    "age": int,
})

def by_field(field: T, value: person_field_types[T]) -> Callable[[Person], bool]:
    ...

Even better in this case would be if person_field_types could be derived from the class itself.
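
For reference, what is expressible today for this example is one hand-written overload per field (a sketch reusing the Person class above; the runtime fallback shown is illustrative):

from typing import Callable, Literal, overload

@overload
def by_field(field: Literal["name"], value: str) -> Callable[[Person], bool]: ...
@overload
def by_field(field: Literal["age"], value: int) -> Callable[[Person], bool]: ...
def by_field(field: str, value: object) -> Callable[[Person], bool]:
    return lambda person: getattr(person, field) == value

by_field("age", 21)      # accepted
by_field("age", "john")  # rejected by the type checker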

akpircher commented 1 year ago

Since someone else brought up the literal-string-to-class-type scenario: I have a situation where I need to map literal strings to a class type, and the only really generic solution I have is to create a function with several overloads that performs the mapping for me. This is, broadly, what it looks like:

from __future__ import annotations

from typing import Mapping, overload

from typing_extensions import Literal

class A: pass
class B: pass
class C: pass

TypeMap: Mapping[Literal["a", "b", "c"], type[A | B | C]] = {"a": A, "b": B, "c": C}

@overload
def map_types(*__types: Literal["a"]) -> tuple[type[A], ...]:
    ...

@overload
def map_types(*__types: Literal["c"]) -> tuple[type[C], ...]:
    ...

@overload
def map_types(*__types: Literal["b"]) -> tuple[type[B], ...]:
    ...

@overload
def map_types(*__types: Literal["b", "c"]) -> tuple[type[B | C], ...]:
    ...

@overload
def map_types(*__types: Literal["a", "c"]) -> tuple[type[A | C], ...]:
    ...

@overload
def map_types(*__types: Literal["a", "b"]) -> tuple[type[A | B], ...]:
    ...

@overload
def map_types(*__types: Literal["a", "b", "c"]) -> tuple[type[A | B | C], ...]:
    ...

def map_types(*__types: Literal["a", "b", "c"]) -> tuple[type[A | B | C], ...]:
    return tuple(TypeMap[type_t] for type_t in sorted(set(__types)))

However, if I want it to work with mypy without ignores (while solving the overload overlap error), or if I want to be precise, it gets significantly longer:

Collapsed, because it's long.

from __future__ import annotations

from typing import Mapping, overload

from typing_extensions import Literal

class A: pass
class B: pass
class C: pass

TypeMap: Mapping[Literal["a", "b", "c"], type[A | B | C]] = {"a": A, "b": B, "c": C}

@overload
def map_types(*__types: Literal["a"]) -> tuple[type[A]]: ...
@overload
def map_types(*__types: Literal["b"]) -> tuple[type[B]]: ...
@overload
def map_types(*__types: Literal["c"]) -> tuple[type[C]]: ...
@overload
def map_types(
    __types1: Literal["a"], __types2: Literal["b"], *__types: Literal["a", "b"]
) -> tuple[type[A], type[B]]: ...
@overload
def map_types(
    __types1: Literal["b"], __types2: Literal["a"], *__types: Literal["a", "b"]
) -> tuple[type[A], type[B]]: ...
@overload
def map_types(
    __types1: Literal["a"], __types2: Literal["c"], *__types: Literal["a", "c"]
) -> tuple[type[A], type[C]]: ...
@overload
def map_types(
    __types1: Literal["c"], __types2: Literal["a"], *__types: Literal["a", "c"]
) -> tuple[type[A], type[C]]: ...
@overload
def map_types(
    __types1: Literal["b"], __types2: Literal["c"], *__types: Literal["b", "c"]
) -> tuple[type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["c"], __types2: Literal["b"], *__types: Literal["b", "c"]
) -> tuple[type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["a"], __types2: Literal["b"], __types3: Literal["c"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["a"], __types2: Literal["c"], __types3: Literal["b"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["b"], __types2: Literal["a"], __types3: Literal["c"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["b"], __types2: Literal["c"], __types3: Literal["a"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["c"], __types2: Literal["a"], __types3: Literal["b"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
@overload
def map_types(
    __types1: Literal["c"], __types2: Literal["b"], __types3: Literal["a"],
    *__types: Literal["a", "b", "c"],
) -> tuple[type[A], type[B], type[C]]: ...
def map_types(*__types: Literal["a", "b", "c"]) -> tuple[type[A | B | C], ...]:
    return tuple(TypeMap[type_t] for type_t in sorted(set(__types)))

Which is about 119 lines longer than it needs to be (this was the final nail in the coffin that got me to switch from mypy to pyright). It's a bit annoying knowing that the same thing could be done in TypeScript in 17 or fewer lines:

class A {}
class B {}
class C {}

// lookup way
interface MapType {
  a: typeof A;
  b: typeof B;
  c: typeof C;
}
type Names = keyof MapType;
const Lookup: MapType = {
  a: A,
  b: B,
  c: C,
};
function mapTypes<T extends Names>(...names: T[]): MapType[T][] {
  return [...new Set(names)].map((shortName) => Lookup[shortName]);
}

// conditional way
type NameToType<T extends 'a' | 'b' | 'c'> = T extends 'a'
  ? A
  : T extends 'b'
  ? B
  : T extends 'c'
  ? C
  : never;
function someDependentFunctionThatUsesGeneric<R>(parameter: R): void {}
function aSmartGenericFunction<T extends 'a' | 'b' | 'c', R = NameToType<T>>(
  ...names: T[]
): R[] {
  return names.map((shortName) => {
    const whatever = new Lookup[shortName]() as R;
    someDependentFunctionThatUsesGeneric(whatever);
    return whatever;
  });
}

In general, I think it would be useful to have some form of conditional types beyond what we get via overloads and single dispatch. The rigidity of this, in its current form, is the single most painful part of adding types to Python code.

akpircher commented 1 year ago

Would something along the lines of this be completely out of the question?

from __future__ import annotations

from typing import TypeVar

from typing_extensions import Literal, TypedDict

class A: pass
class B: pass
class C: pass

_T = TypeVar('_T', bound=Literal['a', 'b', 'c'])

class TypeMap(TypedDict):
    a: type[A]
    b: type[B]
    c: type[C]

TYPE_MAPPING: TypeMap = {'a': A, 'b': B, 'c': C}

def map_type(name: _T) -> TypeMap[_T]:
    return TYPE_MAPPING[name]

reveal_type(map_type('a'))  # could reveal Type[A]
reveal_type(TYPE_MAPPING['a'])  # Type of "TYPE_MAPPING['a']" is "Type[A]"

def map_type2(*names: _T) -> tuple[TypeMap[_T], ...]:
    return tuple(TYPE_MAPPING[name] for name in sorted(set(names)))

reveal_type((TYPE_MAPPING['a'], TYPE_MAPPING['b']))  # Tuple[Type[A], Type[B]]
reveal_type(map_type2('a', 'b'))  # should be the same as above

bwo commented 1 year ago

I don't think the suggested approach (using a lookup table) would compose well with existing type features. Keep in mind that static type checkers don't actually execute code. They are designed to evaluate the type of an expression, not the value of an expression. A lookup table is an expression whose value must be understood for it to be of any utility.

The lookup table doesn't need to be evaluated (or its value understood) to be of utility, if its uses are suitably constrained.

Imagine you had a haskell type like this:

data Linter a c = forall b. Linter { context :: a -> Maybe b, test :: b -> Maybe c }

The b is an existential type; for any given Linter a c value, the b might be different. This means that when you use a value of such a type, you aren't allowed to let the b escape. You use it like this (assume values x :: a and linter :: Linter a c are in scope):

case linter of
    Linter context test -> context x >>= test

The typechecker knows that, whatever the b in question is, the overall type of the expression is Maybe c.

Similarly, if you wanted a mapping of types to handlers, you would have to ensure that the handler can't escape a local context:

# hypothetical future python
T = TypeVar('T')
dispatch: dict[Type[T], Callable[[T], int]] = {
    str: len,
    int: lambda x: x,
    # more cases
}

def handle(v: T) -> int:
    handler = dispatch.get(type(v))
    if handler is not None:
        return handler(v)
    return 0

# error
def bad(v: T) -> Optional[???]:
    return dispatch.get(type(v))

To type bad, you need to know what the type of v is. But to type handle, it seems that all you need to know is that, whatever the type of v is, the type of dispatch.get(type(v)) is Optional[Callable[[T], int]]. As long as the result of looking something up in the map is used only by calling it with the value whose type was the lookup key, you should be able to type it confidently (even if you can't type the retrieved value itself).

In theory you could even limit something like this to being used inside a match/case statement to ensure it's precisely scoped, I dunno.
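
For comparison, the closest thing expressible today loses the per-key link: every handler must accept Any, so a mismatched entry (say, registering an int-only handler under the str key) would not be rejected. A sketch:

from typing import Any, Callable

# Today the key type and the handler's parameter type are unrelated in the
# eyes of the checker; the dict below type-checks, but so would a wrong one.
dispatch: dict[type, Callable[[Any], int]] = {
    str: len,
    int: lambda x: x,
}

def handle(v: object) -> int:
    handler = dispatch.get(type(v))
    return handler(v) if handler is not None else 0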

v-- commented 2 months ago

Based on this comment on another issue, I found that using __getitem__ on the metaclass (rather than __class_getitem__ on the class) can be of some assistance.

The following example, which tries to scratch the same itch as the numpy example above, works with mypy 1.11.1 (the implementation of __getitem__ is only there for runtime behavior; it is mostly irrelevant):

from typing import overload, assert_type

class ArithmeticPromotionMeta(type):
    @overload
    def __getitem__(cls, types: tuple[type[int], type[int]]) -> type[int]: ...
    @overload
    def __getitem__(cls, types: tuple[type[float], type[float]]) -> type[float]: ...
    @overload
    def __getitem__(cls, types: tuple[type[complex], type[complex]]) -> type[complex]: ...
    def __getitem__(cls, types: tuple[type, type]) -> type:
        a, b = types

        if a is complex or b is complex:
            return complex

        if a is float or b is float:
            return float

        return int

class ArithmeticPromotion(metaclass=ArithmeticPromotionMeta):
    pass

assert_type(ArithmeticPromotion[int, int], type[int])
assert_type(ArithmeticPromotion[float, int], type[float])
assert_type(ArithmeticPromotion[complex, float], type[complex])

This code cannot be used with type variables, however, which makes it more of a curiosity than a solution (for me at least).