python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

Create PEP for Map with bind: match Tuple of parameterized types with parameters from TypeVarTuple #1383

Open jmsmdy opened 1 year ago

jmsmdy commented 1 year ago

The Map feature was dropped from PEP-646 to simplify that PEP, with a promise that it would appear in a future PEP. Here is a reference to the dropped Map feature: https://github.com/python/typing/issues/193#issuecomment-782689251

Here is a discussion of a potential usage of this Map feature for matching argument types for a length-n Tuple of unary Callables with a length-n Tuple parameterized by a TypeVarTuple: https://groups.google.com/g/dev-python/c/SbPOxIEvI60?pli=1

Here is a draft spec of PEP-646 that included discussion of Map: https://github.com/python/peps/blob/bf897f8c839d1b4d4534ab1fa223a210e2cacf06/pep-0646.rst

I would like to propose an extension of Map with a "bind" argument to allow disambiguation in case the parameterized type has multiple generic parameters. To give an example:

###########
## Converting from multiple domains to a common codomain
############

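import json
from typing import Any, Callable, Generic, Optional, Tuple, TypeVar, TypeVarTuple, Union

# Note: Map is the proposed operator under discussion; it does not exist in typing.

Domain = TypeVar('Domain')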
Codomain = TypeVar('Codomain')

class Detector(Generic[Domain]):
    def __init__(self, detector_fn: Callable[[Any], Optional[Domain]]):
        self.detector_fn = detector_fn

    def detect(self, arg: Any) -> Optional[Domain]:
        # returns `arg` if `arg` belongs to Domain, otherwise returns None
        return self.detector_fn(arg)

class Converter(Generic[Domain, Codomain]):
    def __init__(self, converter_fn: Callable[[Domain], Codomain]):
        self.converter_fn = converter_fn

    def convert(self, arg: Domain) -> Codomain:
        return self.converter_fn(arg)

Domains = TypeVarTuple('Domains')

class ConverterCollection(Generic[*Domains, Codomain]):
    def __init__(self, detectors: Tuple[*Map[Detector, Domains]], converters: Tuple[*Map[Converter, Domains, bind=Domain]]):
        self.detectors = detectors
        self.converters = converters

    def convert(self, obj_to_convert: Union[*Domains]) -> Codomain:
        for detector, converter in zip(self.detectors, self.converters):
            detected_object = detector.detect(obj_to_convert)
            if detected_object is not None:
                return converter.convert(obj_to_convert)
        raise ValueError('No converter for object.')

### Usage

int_detector = Detector[int](lambda x: x if isinstance(x, int) else None)
int_converter = Converter[int, str](lambda x: json.dumps(x))

float_detector = Detector[float](lambda x: x if isinstance(x, float) else None)
float_converter = Converter[float, str](lambda x: json.dumps(x))

cc = ConverterCollection[int, float, str](
    (int_detector, float_detector),
    (int_converter, float_converter)
)

cc.convert(2)    # works
cc.convert(2.0) # also works
cc.convert('hi')  # type error

The bind=Domain argument is needed here because the Converter class is generic in both Domain and Codomain, so it is ambiguous which generic parameter Map should map over. It is not necessary for Detector, which is generic only over Domain, so there is no ambiguity about what to map over.
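Since Map does not exist in typing today, the example cannot run as written. The following is only a minimal sketch of the same runtime behavior with current typing, assuming the per-position link between detectors and converters is erased to Any and replaced by a runtime length check. The name LooseConverterCollection is hypothetical and not part of the proposal.

```python
import json
from typing import Any, Callable, Generic, Optional, Tuple, TypeVar

Domain = TypeVar("Domain")
Codomain = TypeVar("Codomain")


class Detector(Generic[Domain]):
    def __init__(self, detector_fn: Callable[[Any], Optional[Domain]]) -> None:
        self.detector_fn = detector_fn

    def detect(self, arg: Any) -> Optional[Domain]:
        # Returns `arg` if it belongs to Domain, otherwise None.
        return self.detector_fn(arg)


class Converter(Generic[Domain, Codomain]):
    def __init__(self, converter_fn: Callable[[Domain], Codomain]) -> None:
        self.converter_fn = converter_fn

    def convert(self, arg: Domain) -> Codomain:
        return self.converter_fn(arg)


class LooseConverterCollection(Generic[Codomain]):
    """Hypothetical stand-in: the Domain of each position is erased to Any."""

    def __init__(
        self,
        detectors: Tuple[Detector[Any], ...],
        converters: Tuple[Converter[Any, Codomain], ...],
    ) -> None:
        # Without Map, equal length can only be enforced at runtime.
        if len(detectors) != len(converters):
            raise ValueError("Need exactly one converter per detector.")
        self.detectors = detectors
        self.converters = converters

    def convert(self, obj_to_convert: Any) -> Codomain:
        for detector, converter in zip(self.detectors, self.converters):
            if detector.detect(obj_to_convert) is not None:
                return converter.convert(obj_to_convert)
        raise ValueError("No converter for object.")


cc = LooseConverterCollection[str](
    (
        Detector[int](lambda x: x if isinstance(x, int) else None),
        Detector[float](lambda x: x if isinstance(x, float) else None),
    ),
    (Converter[int, str](json.dumps), Converter[float, str](json.dumps)),
)

print(cc.convert(2))    # handled by the int pair
print(cc.convert(2.0))  # handled by the float pair
```

In this sketch, cc.convert('hi') is accepted by a type checker and only fails at runtime with ValueError, which is exactly the gap that Map is meant to close.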

erictraut commented 1 year ago

There are a couple of problems with your proposed solution.

First, keyword arguments are not allowed syntactically within an index expression. Support for this was proposed but unfortunately rejected in PEP 637.
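The syntactic restriction can be checked with the standard-library ast module. This is a small sketch; parses_as_expression is a hypothetical helper name, and the identifiers in the strings are never evaluated, only parsed.

```python
import ast


def parses_as_expression(source: str) -> bool:
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


# Positional arguments inside a subscript are fine.
print(parses_as_expression("Map[Converter, Domains]"))               # True
# A keyword argument inside a subscript is rejected by the parser.
print(parses_as_expression("Map[Converter, Domains, bind=Domain]"))  # False
```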

Second, you've proposed associating the "bound" parameter with a reference to a TypeVarTuple that has no meaning in the context in which it's used. TypeVarTuple (like TypeVar and ParamSpec) must be used only within a context in which they are bound to a valid scope (a generic class, function or type alias). You are using the TypeVarTuple called Domain in a context where it has no valid meaning. Does it refer to the Domain scoped to the Detector class? Some other scope?

Scoping of TypeVars is admittedly confusing and poorly understood by many Python users. This is partly why we are pursuing PEP 695.

jmsmdy commented 1 year ago

> First, keyword arguments are not allowed syntactically within an index expression. Support for this was proposed but unfortunately rejected in PEP 637.

I am not committed to that syntax. Mostly I would just like some kind of operation like Map (or similar) to allow expressing element-wise relationships between types in two related variadic tuples. The "bind=" here is because in this case Converter class has two parameters, and it is ambiguous which should be mapped over.

> Second, you've proposed associating the "bound" parameter with a reference to a TypeVarTuple that has no meaning in the context in which it's used. TypeVarTuple (like TypeVar and ParamSpec) must be used only within a context in which they are bound to a valid scope (a generic class, function or type alias). You are using the TypeVarTuple called Domain in a context where it has no valid meaning. Does it refer to the Domain scoped to the Detector class? Some other scope?

It would refer unambiguously to the Domain scoped in the Converter class (the first argument to Map). The previously proposed Map[MyGeneric, MyTypeVarTuple] from the draft of PEP-646 would already implicitly pick out the unique free parameter in MyGeneric, assuming there is exactly one free parameter. I just want to extend this by allowing MyGeneric to be doubly-generic (two parameters) or triple-generic (three parameters) etc. and specify which free parameter in MyGeneric is to be used to map over the types in MyTypeVarTuple.

For example, the draft of PEP-646 would allow Map[List, MyTypeVarTuple], which, when MyTypeVarTuple is instantiated with Tuple[str, int, int, float], would give Tuple[list[str], list[int], list[int], list[float]].

But the draft of PEP-646 would not allow Map[Mapping, MyTypeVarTuple], because the generic Mapping type has two free parameters. It is ambiguous whether Map[Mapping[T,S], MyTypeVarTuple] would resolve to Tuple[Mapping[str, S], Mapping[int, S], Mapping[int, S], Mapping[float, S]] or Tuple[Mapping[T, str], Mapping[T, int], Mapping[T, int], Mapping[T, float]] (here Mapping is just an example of a generic type with more than one parameter).
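To make the two readings concrete, here is a runtime-only sketch of what a Map with an explicit bind parameter might compute. It is not a type-checker feature and not the API from the PEP-646 draft; map_over and its signature are hypothetical. It relies only on the fact that typing generic aliases expose their free type variables through `__parameters__` and can be subscripted again to substitute them.

```python
from typing import Any, List, Mapping, Tuple, TypeVar

T = TypeVar("T")
S = TypeVar("S")


def map_over(generic_alias: Any, bind: Any, element_types: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Substitute each element type for the free parameter `bind` (hypothetical helper).

    All other free parameters of `generic_alias` are left untouched.
    """
    params = generic_alias.__parameters__
    if bind not in params:
        raise TypeError(f"{bind!r} is not a free parameter of {generic_alias!r}")
    mapped = []
    for element in element_types:
        # Replace only the bound parameter; keep the others as they are.
        args = tuple(element if p is bind else p for p in params)
        mapped.append(generic_alias[args])
    return tuple(mapped)


# One free parameter: there is only one possible choice for `bind`.
print(map_over(List[T], T, (str, int)))

# Two free parameters: `bind` selects which of the two readings is meant.
print(map_over(Mapping[T, S], T, (str, int)))  # keys vary, S stays free
print(map_over(Mapping[T, S], S, (str, int)))  # values vary, T stays free
```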

> Scoping of TypeVars is admittedly confusing and poorly understood by many Python users. This is partly why we are pursuing PEP 695.

This looks interesting. I also understand the desire to avoid creating a Turing-complete / unsolvable type system, which limits what programmatic typing features can be safely added. Map would obviously add to the complexity somewhat, but I don't think it poses too much of a risk.

To give more philosophical motivation: it is already possible to type the example I gave above by something like:

class ConverterCollection:
    def __init__(self, detectors: list[Detector], converters: list[Converter]):
        self.detectors = detectors
        self.converters = converters

But this no longer constrains the number of detectors and converters to be equal, nor does it constrain the type of each detector to match the type of its corresponding converter. I am really not dogmatic about the exact solution, but the general idea is that you would want to be able to impose constraints on variadic generic types in some kind of simple language (maybe a relational language). The role of "Map" is just to impose the above constraints on the types of detectors and converters. That role could be filled by any other way of expressing that these two variadic generic types must have the same length and that such-and-such parameters must be related in such-and-such a way. Another way of doing this might be to have an explicit language for type constraints, for example a syntax like this:

Ts = TypeVarTuple('Ts')
Detectors = TypeVarTuple('Detectors', constraint='Detectors[i] = Detector[Ts[i]]')
Converters = TypeVarTuple('Converters', constraint='Converters[i] = Converter[Ts[i], S]')

A generic type like SelfMapping = Mapping[T, T] already implicitly has a constraint from coreference, and could be expressed in such a language like this:

T = TypeVar('T')
S = TypeVar('S', constraint='S=T')
SelfMapping = Mapping[T, S]

So the idea is just to allow more sophisticated constraints than is allowed by coreference to the same TypeVar. I'm not proposing this (it is very ugly and has problems), just trying to illustrate that the typing feature I am interested in follows a kind of "multiply, then constrain" paradigm, where the two main operations for constructing new generic types are to make multiple new generic types (which are by default totally independent and unrelated), and then constrain those generic types by imposing relations on them. I believe this kind of paradigm is unlikely to create an unsolvable type system, as long as the constraints can be computed. I view Map as a more functional way of expressing the relation that "this parameter of this generic type should be equal to that parameter of this other generic type".
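The coreference case can be observed at runtime with the existing typing module. This short sketch only illustrates that reusing one TypeVar collapses two positions into a single free parameter, while two distinct TypeVars stay independent. The alias name IndependentMapping is introduced here for contrast and is not from the discussion.

```python
from typing import Mapping, TypeVar

T = TypeVar("T")
S = TypeVar("S")

# Coreference: both positions share T, so the alias has one free parameter.
SelfMapping = Mapping[T, T]
print(len(SelfMapping.__parameters__))  # 1
print(SelfMapping[int])                 # typing.Mapping[int, int]

# Two distinct TypeVars: the positions are unrelated by default.
IndependentMapping = Mapping[T, S]
print(len(IndependentMapping.__parameters__))  # 2
print(IndependentMapping[int, str])            # typing.Mapping[int, str]
```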

To focus the discussion (since I do not want to try to argue for a syntax change that has already been rejected):

  1. Is Map still planned for a future PEP (as it was circa 2021), or has it been officially reviewed and rejected?
  2. Are there existing proposals without syntax/binding hurdles which would be able to do something equivalent to the "multiply, then constrain" approach to defining generics, where the constraints are more complicated than simple type equality?
    • Would any be able to handle matching corresponding types across two variadic tuples via a more complicated condition than equality?
    • Would any be able to handle constraints like Converters[i] = Converter[Ts[i], S] where some but not all free parameters are subject to constraints (relational or otherwise)?

For the record, I don't have any great knowledge of type systems, nor do I have strong opinions on the right solution here. I opened this issue mostly to follow up with the seemingly forgotten promise of a future PEP for Map.

erictraut commented 1 year ago

Map was removed from PEP 646 because the spec was already very complex, and there were concerns that it would take type checkers a long time to implement it. Indeed, mypy still has not implemented full support for PEP 646. Until this happens, we will likely see little or no adoption of PEP 646 in any major library. Once mypy releases full support for PEP 646, we should start to see more adoption of variadic type vars, and the typing community will then be in a better position to assess the need (and prioritization) for further extensions like Map.

If individuals like you feel like there's a compelling need for Map, it might make sense to start discussing and formalizing the spec for such a facility in the form of a proto-PEP.

mikeshardmind commented 1 year ago

A motivating use case, without full detail here, if it helps to have simple motivating examples supporting this down the line.

class SerializationWrapper[*Ts]:
    def __init__(
        self, types: Map[Type, *Ts],
        additional_encoding_hooks: Callable | None = None,
        additional_decoding_hooks: Callable | None = None,
    ):
        self._types: Map[Type, *Ts] = types
        # create stateful en/decoders, register these types, and raise if the types aren't understood or
        # if the types have ambiguous overlap from the perspective of encoding or decoding
        # after the registration of optional hooks

    def encode(self, obj: Union[*Ts]) -> bytes:
        ...

    def decode(self, data: bytes) -> Union[*Ts]:
        ...

Be3y4uu-K0T commented 1 year ago

I would suggest using an unpacked TypeVarTuple and then unpacking the resulting GenericAlias (or a special class for this, such as PackedGenericAlias):

class SerializationWrapper[*Ts]:
    def __init__(
        self, types: *type[Ts],
        additional_encoding_hooks: Callable | None = None,
        additional_decoding_hooks: Callable | None = None,
    ):
        self._types: *type[Ts] = types
        # create stateful en/decoders, register these types, and raise if the types aren't understood or
        # if the types have ambiguous overlap from the perspective of encoding or decoding
        # after the registration of optional hooks

    def encode(self, obj: *type[Ts]) -> bytes:
        ...

    def decode(self, data: bytes) -> Union[*Ts]:
        ...

insilications commented 1 year ago

> If individuals like you feel like there's a compelling need for Map, it might make sense to start discussing and formalizing the spec for such a facility in the form of a proto-PEP.

It was sad (but understandable) to see the Map operation removed from PEP 646. Python is not my "main language", but when I do use it I prefer to type it, and I've seen the need for such an operation.