python / typing

Python static typing home. Hosts the documentation and a user help forum.
https://typing.readthedocs.io/

ParamSpec: use P.args and P.kwargs in other scopes, such as return types #1252

Open chadrik opened 2 years ago

chadrik commented 2 years ago

I have a number of cases where it would be very useful to use P.args and P.kwargs of a ParamSpec to annotate tuple and dict objects, for example, when extracting arguments to pass to a function in another context.

Here's an example:

from typing import Callable, TypeVar, Tuple, Any
from typing_extensions import ParamSpec

P = ParamSpec('P')
T = TypeVar('T')

def complex(value: str, reverse: bool = False, capitalize: bool = False) -> str:
    if reverse:
        # str(reversed(value)) would give the repr of the iterator, not the reversed text
        value = value[::-1]
    if capitalize:
        value = value.capitalize()
    return value

def call_it(func: Callable[P, T], args: P.args, kwargs: P.kwargs) -> T:
    print("calling", func)
    return func(*args, **kwargs)

def get_callable_and_args() -> Tuple[Callable[P, Any], P.args, P.kwargs]:
    return complex, ('foo',), {'reverse': True}

call_it(*get_callable_and_args())

In this scenario P.args and P.kwargs represent a kind of ad-hoc NamedTuple and TypedDict, respectively.

Does this seem like a reasonable extension for ParamSpecs?

sobolevn commented 2 years ago

I remember seeing patterns like that in typeshed/stdlib. Finding these examples might be helpful to prove your point 👍

grievejia commented 2 years ago

In this scenario P.args and P.kwargs represent a kind of ad-hoc NamedTuple and TypedDict, respectively.

This option has already been discussed and rejected in PEP 612. To quote from the discussion:

The core problem here is that, by default, parameters in Python can either be called positionally or as a keyword argument. This means we really have three categories (positional-only, positional-or-keyword, keyword-only) we’re trying to jam into two categories...

As an example, for the following code:

def foo(x: int) -> None: ...
def get_callable_and_args(x: Callable[P, Any]) -> Tuple[P.args, P.kwargs]: ...

reveal_type(get_callable_and_args(foo))  # <- What should this reveal?

Should we reveal a tuple of int + empty dict? Or should we reveal an empty tuple and a singleton dict that maps x to int? Even for this extremely simplified example, there's no definite answer here: the existence of "positional-or-keyword" arguments in Python makes things highly ambiguous. And to have the type checker second-guessing the intention of the developer here would be rather expensive (think about the combinatorial blowup if you have a long list of arguments).

To fully resolve this kind of issue, there needs to be a way in the type system to represent positional-only, positional-or-keyword, and keyword-only parameters separately. But at that point, it's going to be a different language feature, not ParamSpec anymore.
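The ambiguity can be observed at runtime. The sketch below is illustrative only: capture is a hypothetical helper that records how a caller split the arguments, and inspect.signature is used to show that both splits bind to the same parameter of foo.

```python
import inspect


def foo(x: int) -> None: ...


def capture(*args, **kwargs):
    # Hypothetical helper: return the positional/keyword split as the caller wrote it.
    return args, kwargs


positional_split = capture(1)    # x passed positionally
keyword_split = capture(x=1)     # x passed by keyword

sig = inspect.signature(foo)
# Both splits are valid calls to foo and bind to the same parameter.
bound_a = sig.bind(*positional_split[0], **positional_split[1])
bound_b = sig.bind(*keyword_split[0], **keyword_split[1])
same_call = bound_a.arguments == bound_b.arguments
```

Both pairs describe the same call, so neither can claim to be "the" type of P.args or P.kwargs on its own.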

Kenny2github commented 2 years ago

This came up in the context of ORMs - e.g. something vaguely like

T = TypeVar('T', bound=type)
P = ParamSpec('P')
# Attribute[T] = table field of type T
def find(table: T, selection: Any, *projection: Map[Attribute, P.args]) -> Map[tuple, P.args]:
    # SELECT projection FROM table WHERE selection
    return # values from the selection, with same types as the attributes
# e.g.
from attrs import define
@define
class Strings:
    __collection__ = 'strings'
    id: int
    string: str
# Strings.id is an Attribute[int], etc
result: tuple[int, str] = find(Strings, ..., Strings.id, Strings.string) # should typecheck

grievejia commented 2 years ago

@Kenny2github That does not work because it's invalid to use P.args in isolation. To quote from the PEP:

Furthermore, because the default kind of parameter in Python ((x: int)) may be addressed both positionally and through its name, two valid invocations of a (*args: P.args, **kwargs: P.kwargs) function may give different partitions of the same set of parameters. Therefore, we need to make sure that these special types are only brought into the world together, and are used together, so that our usage is valid for all possible partitions.

So again the issue is that there's no unambiguous way to split a parameter list cleanly into a P.args list and P.kwargs dict. Therefore we enforce the restriction that those two guys must be used together so the type system does not need to deal with such ambiguity.

The use case you mentioned is better handled by the (yet-to-be-pepified) "map" operator extension for list variadics.

chadrik commented 2 years ago

So again the issue is that there's no unambiguous way to split a parameter list cleanly into a P.args list and P.kwargs dict.

So then is it solvable if we instead return an object that represents the entire set of arguments, such as a tuple that holds both the args and the kwargs? That object could then be considered P itself when used outside of a Callable.

For example:

def call_it(func: Callable[P, T], args: P) -> T:
    print("calling", func)
    return func(*args[0], **args[1])

def get_callable_and_args() -> Tuple[Callable[P, Any], P]:
    return complex, (('foo',), {'reverse': True})

grievejia commented 2 years ago

@chadrik That won't work either: P does not work like regular typevars where it represents a single type. It instead represents a "parameter group" that can be passed into other callables, and there's no way in Python to construct the notion it represents at runtime. Therefore, it does not make sense to say something like "the return type of my function is P". Even if it did, there's no way you could write it, as there could potentially be many, many ways P gets split into pos+keyword arglist, and you can't just return one particular split and pretend that it will work for any split downstream.

chadrik commented 2 years ago

What if we create a new type of object which represents "parameters that are compatible with ParamSpec P"?

For example:

P = ParamSpec('P')
ParamsP = Parameters(P)

def call_it(func: Callable[P, T], args: ParamsP) -> T:
    print("calling", func)
    return func(*args.args, **args.kwargs)

def get_callable_and_args() -> Tuple[Callable[P, Any], ParamsP]:
    return complex, ParamsP(('foo',), {'reverse': True})

func, args = get_callable_and_args()
reveal_type(args.args)  # prints: Tuple[str, ...]
reveal_type(args.kwargs) # prints: Dict[str, Any]
call_it(func, args)

there could potentially be many, many ways P gets split into pos+keyword arglist

In the above example, it doesn't matter that we don't know all of the ways to split apart P, as long as we can vet that ParamsP is one of those ways when it is instantiated, which seems doable to me.

ParamsP is essentially just a NamedTuple that amounts to:

class ParamsP(NamedTuple):
    args: Tuple[str, ...]
    kwargs: Dict[str, Any]

There is no inference or guarantees about the structure of ParamsP, and from a static analysis POV, it would only be valid to use star and double-star expansion with ParamsP.args and ParamsP.kwargs if both are used together.
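The "vet it when it is instantiated" idea can be approximated at runtime today. The sketch below is not the proposed Parameters(P) and gives a type checker no extra information; BoundCall and greet are hypothetical names, and the check relies on inspect.signature(...).bind raising TypeError for a split that does not fit.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class BoundCall:
    # Hypothetical container: a callable plus one concrete args/kwargs split.
    func: Callable[..., Any]
    args: Tuple[Any, ...] = ()
    kwargs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Raises TypeError if this particular split does not fit func's signature.
        inspect.signature(self.func).bind(*self.args, **self.kwargs)

    def invoke(self) -> Any:
        # args and kwargs are only ever expanded together.
        return self.func(*self.args, **self.kwargs)


def greet(name: str, punctuation: str = "!") -> str:
    return f"hello {name}{punctuation}"


ok = BoundCall(greet, ("world",), {"punctuation": "?"})
greeting = ok.invoke()

try:
    BoundCall(greet, (), {"nme": "world"})  # misspelled keyword
    rejected = False
except TypeError:
    rejected = True
```

The validation happens only when the object is built, which matches the consumer-side check described above; statically, args and kwargs remain Tuple[Any, ...] and Dict[str, Any].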

grievejia commented 2 years ago

it doesn't matter that we don't know all of the ways to split apart P, as long as we can vet that ParamsP is one of those ways when it is instantiated

That only works if your function is a "consumer" of ParamsP, i.e. it takes ParamsP as argument. But if you are a "producer" of ParamsP, i.e. you are returning a ParamsP object from your function, you'd need to make sure that what you return works for any positional/keyword split of the parameter list, not just one. I am not aware of how you could construct such an object at runtime. What you proposed in your earlier example definitely does not satisfy this criterion.

chadrik commented 1 year ago

I've been thinking lately that perhaps the best way to pass around a function and its arguments prior to invoking it is a partial. Unfortunately, mypy's partial support is extremely lacking: https://github.com/python/mypy/issues/1484

The returns package provides a mypy plugin that adds full partial validation, but unfortunately I tested it with mypy 0.982 and it is broken.
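A minimal sketch of the partial approach, using only functools.partial from the standard library; transform, get_call, and call_it are hypothetical names chosen to mirror the earlier example. How much of this a given type checker validates is a separate question, as noted above.

```python
from functools import partial


def transform(value: str, reverse: bool = False) -> str:
    # Hypothetical function whose arguments are bound ahead of time.
    return value[::-1] if reverse else value


def get_call() -> "partial[str]":
    # Bundle the callable with its arguments instead of returning them separately.
    return partial(transform, "foo", reverse=True)


def call_it(call: "partial[str]") -> str:
    # The consumer only needs to invoke the bundle.
    return call()


pending = get_call()
result = call_it(pending)
```

The bound pieces stay inspectable through pending.func, pending.args, and pending.keywords.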

sobolevn commented 1 year ago

Yes, we don't have mypy>0.950 support yet, but it is planned.

chadrik commented 1 year ago

Here's the ticket for the returns mypy plugin error for anyone coming across this conversation in the future: https://github.com/dry-python/returns/issues/1433

rmorshea commented 1 year ago

I would find something like this to be very useful. I am creating a React-like framework in Python and one thing I want is to be able to create a decorator that marks functions as HTML-like element constructors. The naive approach is to use *args to describe element children and **kwargs as the attributes. Doing so has unfortunate syntactic consequences though:

div(
    child,
    div(
        child,
        child,
        child,
        child,
    ),
    child,
    child,
    child,
    child,
    child,
    child,
    ...,
    # props for the outer-most element live all the way down here
    id="something",
)

Instead, it would be better if the following could be achieved:

@component
def div(child1: str, child2: str, *, attr1: int, attr2: int) -> str:
    ...

div({"attr1": 1, "attr2": 2}, "hello", "world")

If this feature were available, it seems like I could write component as:

P = ParamSpec("P")
R = TypeVar("R")

def component(func: Callable[P, R]) -> ElementConstructor[P, R]:
    ...

R_co = TypeVar("R_co", covariant=True)

class ElementConstructor(Protocol[P, R_co]):
    def __call__(self, attributes: P.kwargs, *children: P.args) -> R_co:
        ...

henribru commented 1 year ago

This pattern often comes up in task queue libraries. Huey, Celery, Dramatiq and Rq all have a way of turning a function into a task where you call the task using args passed as a tuple and kwargs passed as a dict, instead of them being unpacked, similar to the call_it definition in the original comment.

Edit: threading.Thread from the stdlib is another example of this pattern actually.
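For reference, this is what the pattern looks like with threading.Thread, whose constructor takes args as a tuple and kwargs as a dict. The work function and results list are hypothetical; the point is that nothing ties the tuple and dict to the target's signature until the thread runs.

```python
import threading

results = []


def work(value: str, repeat: int = 1) -> None:
    # Hypothetical target; records its output so the caller can inspect it.
    results.append(value * repeat)


# args and kwargs are passed packed, not unpacked, just like call_it above.
worker = threading.Thread(target=work, args=("ab",), kwargs={"repeat": 2})
worker.start()
worker.join()
```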

rmorshea commented 1 year ago

I'm realizing that positional-or-keyword parameters would likely make implementing this rather complicated. The fact that a parameter can be passed either positionally or by keyword, but not both, means that ParamSpec.args and ParamSpec.kwargs are linked: if a parameter is specified in one, it must be disallowed in the other. This must happen magically at a distance. For example:

P = ParamSpec("P")
R = TypeVar("R")

def split_args_kwargs(func: Callable[P, R]) -> Callable[[P.args], Callable[[P.kwargs], R]]:
    ...

def f(a: int, b: int, c: int) -> int:
    ...

g = split_args_kwargs(f)

h = g(1)
h({"a": 1, "b": 2, "c": 3})  # error

It's possible that this is not a significant technical challenge, but I don't know enough about how mypy works to say.
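The runtime behaviour behind that link is easy to reproduce with an ordinary call. A small sketch, with f given a hypothetical body so it can be executed:

```python
def f(a: int, b: int, c: int) -> int:
    return a + b + c


# a is passed positionally, so it may not also appear as a keyword.
total = f(1, b=2, c=3)

try:
    f(1, a=1, b=2, c=3)  # a supplied twice: once positionally, once by keyword
    conflict = False
except TypeError:
    conflict = True
```

A checker that typed the positional and keyword halves separately would have to carry this "already consumed" information from one half to the other.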