dbrattli / Expression

Functional programming for Python
https://expression.readthedocs.io
MIT License

Add curry decorator that doesn't lose the typing #49

Closed dbrattli closed 2 years ago

dbrattli commented 2 years ago

This may fix the problem mentioned in #48

codecov[bot] commented 2 years ago

Codecov Report

Merging #49 (eefdd8f) into main (2725ebf) will increase coverage by 0.25%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #49      +/-   ##
==========================================
+ Coverage   80.00%   80.25%   +0.25%     
==========================================
  Files          38       38              
  Lines        2255     2284      +29     
==========================================
+ Hits         1804     1833      +29     
  Misses        451      451              
Impacted Files Coverage Δ
expression/__init__.py 100.00% <ø> (ø)
expression/core/__init__.py 100.00% <100.00%> (ø)
expression/core/curry.py 100.00% <100.00%> (ø)
expression/collections/frozenlist.py 77.34% <0.00%> (+0.08%) :arrow_up:
expression/collections/seq.py 81.64% <0.00%> (+0.27%) :arrow_up:
expression/collections/asyncseq.py 70.96% <0.00%> (+0.47%) :arrow_up:
jim108dev commented 2 years ago

Hi! I have experienced problems with literals:

@curry1of3
def f1(a: int,b,c):
    return c

ColumnType = Literal["order_id"]
@curry1of3
def f2(a: ColumnType,b,c):
    return c

f1 works, but f2 produces this error at @curry1of3:

Argument of type "(a: ColumnType, b: Unknown, c: Unknown) -> Unknown" cannot be assigned to parameter "fn" of type "(A@curry1of3, B@curry1of3, C@curry1of3) -> D@curry1of3" in function "curry1of3"
  Type "(a: ColumnType, b: Unknown, c: Unknown) -> Unknown" cannot be assigned to type "(A@curry1of3, B@curry1of3, C@curry1of3) -> D@curry1of3"
    Parameter 1: type "A@curry1of3" cannot be assigned to type "ColumnType"
      "ColumnType" cannot be assigned to type "ColumnType"
dbrattli commented 2 years ago

@jim108dev The joy of Python type checking: "ColumnType" cannot be assigned to type "ColumnType". We should probably check with Pyright team if this is a bug or by design.

jim108dev commented 2 years ago

ok, I guess it is the same issue as

https://github.com/microsoft/pyright/issues/1703

dbrattli commented 2 years ago

@jim108dev Interesting, thanks for sharing!

Hugovdberg commented 2 years ago

It is actually possible to improve the type hints for the basic curried function using PEP 612. I have a very basic implementation that could probably be merged with the curried function by using the recursive trick. By using Concatenate[_a, _P] you specify a function that takes an argument of type _a and zero or more remaining arguments using the ParamSpec("_P"). The nice thing is the wrapper can refer to the remaining arguments by (*args: _P.args, **kwargs: _P.kwargs), so the type checker knows the parameters of the decorated function.

import functools  # needed for functools.wraps below
from typing import Callable, TypeVar
from typing_extensions import Concatenate, ParamSpec # This is only in typing from Python 3.10

_P = ParamSpec("_P")

_a = TypeVar("_a")
_b = TypeVar("_b")

def curry(fun: Callable[Concatenate[_a, _P], _b]) -> Callable[[_a], Callable[_P, _b]]:
    @functools.wraps(fun)
    def wrapper(x: _a) -> Callable[_P, _b]:
        @functools.wraps(fun)
        def _wrapper(*args: _P.args, **kwargs: _P.kwargs) -> _b:
            return fun(x, *args, **kwargs)

        return _wrapper

    return wrapper

I haven't tried the recursive thing yet, but I expect we would need the same kind of signature overloads as for, for example, pipe and compose, to explicitly tell the type checker it's a callable of type

Callable[
    [Callable[[_A, _B, _C, _D, _E, _F, _G, _H, _T], _J]],
    Callable[
        [_A],
        Callable[
            [_B],
            Callable[
                [_C],
                Callable[
                    [_D],
                    Callable[
                        [_E],
                        Callable[
                            [_F], Callable[[_G], Callable[[_H], Callable[[_T], _J]]]
                        ],
                    ],
                ],
            ],
        ],
    ],
]
dbrattli commented 2 years ago

@Hugovdberg This looks very interesting. Thanks for posting. I'll have a closer look this weekend.

Hugovdberg commented 2 years ago

@dbrattli There are some drawbacks to this approach, though. First, ParamSpec and Concatenate are only available in typing from Python 3.10. They are backported in the typing_extensions package, but using it would introduce a new dependency for this package. That said, it is a very lightweight dependency that most users probably already have installed as a dependency of other packages.

Also, according to that type specification, if you have a function def foo(a,b,c,d,e,f,g)->h, you MUST call it as foo(a)(b)(c)(d)(e)(f)(g), and it forces all optional parameters to be passed explicitly. So perhaps it is better suited to more restricted implementations such as the curry#of# functions you already have. Perhaps it is best to provide two ways of currying: 1) curry: flexible, but with loss of static type checking; 2) curry1, curry2, etc., which let you curry the first N parameters.

Another trick you can do with the ParamSpec by the way is a sort of reverse curry (for lack of a better name):

def curry2(fun: Callable[[_A, _B], _C]) -> Callable[[_A], Callable[[_B], _C]]:
    return curried(fun)

def rev_curry(
    fun: Callable[Concatenate[_A, _P], _B]
) -> Callable[_P, Callable[[_A], _B]]:
    def _wrap_args(*args: _P.args, **kwargs: _P.kwargs) -> Callable[[_A], _B]:
        def _wrap_a(x: _A) -> _B:
            return fun(x, *args, **kwargs)

        return _wrap_a

    return _wrap_args

def rev_curry2(
    fun: Callable[Concatenate[_A, _B, _P], _C]
) -> Callable[_P, Callable[[_A], Callable[[_B], _C]]]:
    def _wrap_args(
        *args: _P.args, **kwargs: _P.kwargs
    ) -> Callable[[_A], Callable[[_B], _C]]:
        @curry2
        def _wrap_a(x: _A, y: _B) -> _C:
            return fun(x, y, *args, **kwargs)

        return _wrap_a

    return _wrap_args

This allows you to "bake-in" the optional arguments to a function, leaving you with a curried function that only needs the first N arguments, preserving the optional status of those arguments. A nice application in which I use this is for joining DataFrames in a pipeline:

merge = rev_curry2(pd.DataFrame.merge)

inner_join_on_name = merge(how="inner", on="name")
# inner_join_on_name: (DataFrame) -> ((DataFrame | Series[S1@merge]) -> DataFrame)
dbrattli commented 2 years ago

Thanks @Hugovdberg , yes I just discovered that it wasn't that easy. Ended up trying to make different ParamSpec overloads, but could not make it work so you are right that the curried function is too generic. A more restricted version could perhaps work. I need to look at your last comment and try some more ...

@overload
def curried(
    fn: Callable[Concatenate[_A, _B, _C, _P], _D]
) -> Callable[[_A, _B, _C], Callable[_P, _D]]:
    ...

@overload
def curried(
    fn: Callable[Concatenate[_A, _B, _P], _C]
) -> Callable[[_A, _B], Callable[_P, _C]]:
    ...

@overload
def curried(fn: Callable[Concatenate[_A, _P], _B]) -> Callable[[_A], Callable[_P, _B]]:
    ...

@overload
def curried(fn: Callable[Concatenate[_P], _A]) -> Callable[[], Callable[_P, _A]]:
    ...
dbrattli commented 2 years ago

Another idea is to make the arity explicit so we can build overloads targeting each of them, e.g.:

@overload
def curry(arity: Literal[1]) -> Callable[[Callable[_P, _B]], Callable[_P, _B]]:
    ...

@overload
def curry(
    arity: Literal[2],
) -> Callable[[Callable[Concatenate[_A, _P], _B]], Callable[[_A], Callable[_P, _B]]]:
    ...

@overload
def curry(
    arity: Literal[3],
) -> Callable[
    [Callable[Concatenate[_A, _B, _P], _C]],
    Callable[[_A], Callable[[_B], Callable[_P, _C]]],
]:
    ...

@overload
def curry(
    arity: Literal[4],
) -> Callable[
    [Callable[Concatenate[_A, _B, _C, _P], _D]],
    Callable[[_A], Callable[[_B], Callable[[_C], Callable[_P, _D]]]],
]:
    ...

def curry(arity: int) -> Callable[..., Any]:
    def _curry(
        args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]
    ) -> Callable[..., Any]:
        def wrapper(*arg: Any, **kw: Any) -> Any:
            if arity == 1:
                # Keyword arguments are only forwarded on the final call.
                return fn(*args, *arg, **kw)
            return _curry(args + arg, arity - 1, fn)

        return wrapper

    def wrapper(fn: Callable[..., Any]) -> Callable[..., Any]:
        return _curry((), arity, fn)

    return wrapper

This one is recursive, so you can curry as many args as you want, but you only get static type checking for the number of overloads we write:

def test_curry3of3():
    @curry(3)
    def add(a: int, b: int, c: int) -> int:
        """Add a + b + c"""
        return a + b + c

    assert add(3)(4)(2) == 9

... and it works with optional arguments, e.g.:

def test_curry2of3_with_optional():
    @curry(2)
    def add(a: int, b: int, c: int = 10) -> int:
        """Add a + b + c"""
        return a + b + c

    assert add(3)(4) == 17

def test_curry2of3_with_optional2():
    @curry(2)
    def add(a: int, b: int, c: int = 10) -> int:
        """Add a + b + c"""
        return a + b + c

    assert add(3)(4, c=9) == 16
Hugovdberg commented 2 years ago

I like the idea of explicit arity, although I first expected curry(n) to extract the first n parameters as single parameters and then return a function that needs to be called once more with the remaining arguments. But given the definition of arity this makes no sense, and the current implementation seems the best option. Essentially, curry(1) is a convoluted way of describing the identity function, since by default Python functions are unary, taking a single tuple of arguments. It might be best to document this explicitly with some explanation.
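A small runtime sketch of the arity semantics discussed here, without the typing overloads. The helper names are illustrative and this is not necessarily the code that ended up in the PR. With arity 1 the decorated function behaves exactly like the original, and with arity 2 it takes one argument per call:

```python
from typing import Any, Callable, Tuple


def _curry(args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]) -> Callable[..., Any]:
    def wrapper(*arg: Any, **kw: Any) -> Any:
        if arity == 1:
            # Last call: invoke the function with everything collected so far.
            return fn(*args, *arg, **kw)
        return _curry(args + arg, arity - 1, fn)

    return wrapper


def curry(arity: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        return _curry((), arity, fn)

    return decorator


def add(a: int, b: int) -> int:
    return a + b


same_as_add = curry(1)(add)  # arity 1: a single call with all arguments
curried_add = curry(2)(add)  # arity 2: two calls

identity_result = same_as_add(3, 4)
curried_result = curried_add(3)(4)
```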

This parametric currying is also quite nice to use in rev_curry, as it can simply call _curry(n) to curry the first n parameters (the _curry function doesn't use any variables from the closure of curry, so it could just as well be defined globally). Note that I changed the type hint of the arity parameter to _Arity, to make sure it is only called with one of the overloaded arities when using static typing.

_Arity = Literal[2, 3, 4]

def _curry(
    args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]
) -> Callable[[Any], Any]:
    def wrapper(*arg: Any):
        if arity == 1:
            return fn(*args, *arg)
        return _curry(args + arg, arity - 1, fn)

    return wrapper

def curry(arity: _Arity) -> Callable[..., Any]:
    def wrapper(fn: Callable[..., Any]) -> Callable[..., Any]:
        return _curry((), arity, fn)

    return wrapper

def rev_curry(arity: _Arity) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def _wrap_fun(fun: Callable[..., Any]) -> Callable[..., Any]:
        def _wrap_args(*args, **kwargs) -> Callable[..., Any]:
            def _wrap_curried(*curry_args) -> Any:
                return fun(*curry_args, *args, **kwargs)

            return _curry((), arity - 1, _wrap_curried)

        return _wrap_args

    return _wrap_fun
dbrattli commented 2 years ago

This looks really nice!! Btw, in the PR I changed the outer curry to take the number of args to curry instead of the arity. I should perhaps rename args to num_args to make it very explicit. This means that curried(0) is now the identity function, which makes more sense. I'll take a closer look at your new rev_curry ... 👀

@overload
def curried(args: Literal[0]) -> Callable[[Callable[_P, _B]], Callable[_P, _B]]:
    ...

@overload
def curried(
    args: Literal[1],
) -> Callable[[Callable[Concatenate[_A, _P], _B]], Callable[[_A], Callable[_P, _B]]]:
    ...

@overload
def curried(
    args: Literal[2],
) -> Callable[
    [Callable[Concatenate[_A, _B, _P], _C]],
    Callable[[_A], Callable[[_B], Callable[_P, _C]]],
]:
    ...

@overload
def curried(
    args: Literal[3],
) -> Callable[
    [Callable[Concatenate[_A, _B, _C, _P], _D]],
    Callable[[_A], Callable[[_B], Callable[[_C], Callable[_P, _D]]]],
]:
    ...

@overload
def curried(
    args: Literal[4],
) -> Callable[
    [Callable[Concatenate[_A, _B, _C, _D, _P], _E]],
    Callable[[_A], Callable[[_B], Callable[[_C], Callable[[_D], Callable[_P, _E]]]]],
]:
    ...

def curried(args: int) -> Callable[..., Any]:
    """A curry decorator.

    Makes a function curried.

    Args:
        args: The number of args to curry from the start of the function

    Example:
        @curried(1)
        def add(a: int, b: int) -> int:
            return a + b

        assert add(3)(4) == 7
    """

    def _curry(
        args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]
    ) -> Callable[..., Any]:
        def wrapper(*arg: Any, **kw: Any) -> Any:
            if arity == 1:
                return fn(*args, *arg, **kw)
            return _curry(args + arg, arity - 1, fn)

        return wrapper

    def wrapper(fn: Callable[..., Any]) -> Callable[..., Any]:
        return _curry((), args + 1, fn)

    return wrapper
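The change in meaning from arity to number of curried args can be checked with a stripped-down runtime version of the decorator above. The overloads are omitted, the parameter is named num_args as suggested, and add is just an illustrative function:

```python
from typing import Any, Callable, Tuple


def curried(num_args: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def _curry(args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*arg: Any, **kw: Any) -> Any:
            if arity == 1:
                return fn(*args, *arg, **kw)
            return _curry(args + arg, arity - 1, fn)

        return wrapper

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        # num_args curried calls plus one final call for the remaining arguments.
        return _curry((), num_args + 1, fn)

    return decorator


def add(a: int, b: int, c: int = 10) -> int:
    return a + b + c


# curried(0) leaves the call shape unchanged.
uncurried_result = curried(0)(add)(1, 2)

# curried(1) peels off the first argument; the rest, including the
# optional c, is passed in the final call.
with_default = curried(1)(add)(1)(2)
with_keyword = curried(1)(add)(1)(2, c=5)
```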
Hugovdberg commented 2 years ago

A nice side effect of the explicit type hints is that the following raises a type error, as the decorated function has too few arguments:

@curried(3)
def add(x, y):
    return x + y

This does work, which might be a little unexpected, but it just returns a nullary function after the second argument:

@curried(2)
def add(x, y):
    return x + y
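Both cases can be observed at runtime with a reduced, untyped version of the decorator. This is a sketch that assumes the "number of args to curry" semantics; the static type error itself only shows up in a type checker, so the snippet demonstrates the corresponding runtime behavior instead:

```python
from typing import Any, Callable, Tuple


def curried(num_args: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def _curry(args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*arg: Any, **kw: Any) -> Any:
            if arity == 1:
                return fn(*args, *arg, **kw)
            return _curry(args + arg, arity - 1, fn)

        return wrapper

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        return _curry((), num_args + 1, fn)

    return decorator


def add(x: int, y: int) -> int:
    return x + y


# Currying both parameters leaves a final call that takes no arguments.
add_two = curried(2)(add)
nullary = add_two(1)(2)
nullary_result = nullary()

# Currying more parameters than the function has only fails once the
# function is finally invoked with too many arguments.
add_three = curried(3)(add)
try:
    add_three(1)(2)(3)()
    too_many_failed = False
except TypeError:
    too_many_failed = True
```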

<Edited, nevermind, was using the old definition of _curry, I see you already fixed that...>

Hugovdberg commented 2 years ago

I was just thinking that rev_curry has some similarity with flip, but then on groups of parameters. It moves the first n parameters to the end and then curries those n parameters. Perhaps that helps to discover a better name 😉

dbrattli commented 2 years ago

PS: The inspiration btw came from Fable Python (F# transpiler) combined with your suggestions for type annotations. I'll rewrite that code similarly.

dbrattli commented 2 years ago

So with rev_curry I can finally write functions like this as a non-nested function!? Just have to put source first and then use rev_curry to move the first arg last (as a curried arg). That would be amazing!!:

dbrattli commented 2 years ago

PS: Updated PR. Added rev_curry as curry_flipped. I like that name!

Hugovdberg commented 2 years ago

Yes, you wouldn't need to write it as a nested function. Although just using curry would work just fine in this case. rev_curry is especially useful for functions with many optional arguments.

dbrattli commented 2 years ago

Yes, you are right. But often there can be several optional arguments that may block the "source" from being the last arg, e.g. https://github.com/ReactiveX/RxPY/blob/modern-typehints/rx/core/operators/firstordefault.py#L12
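As a rough illustration of that situation, the sketch below defines an operator with source first and optional arguments after it, and then flips it so that it can be used in a pipeline. Everything here is hypothetical: curry_flipped is an untyped stand-in based on the rev_curry shown earlier in the thread (assuming num_args >= 1), first_or_default is a simplified operator over plain iterables, and pipe is a small local helper rather than the library function.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def curry_flipped(num_args: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Take the trailing arguments first, then curry the first num_args (>= 1)."""

    def _curry(args: Tuple[Any, ...], arity: int, fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*arg: Any, **kw: Any) -> Any:
            if arity == 1:
                return fn(*args, *arg, **kw)
            return _curry(args + arg, arity - 1, fn)

        return wrapper

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrap_args(*args: Any, **kwargs: Any) -> Callable[..., Any]:
            def wrap_curried(*curry_args: Any) -> Any:
                # Curried leading arguments go first, baked-in ones after.
                return fn(*curry_args, *args, **kwargs)

            return _curry((), num_args, wrap_curried)

        return wrap_args

    return decorator


@curry_flipped(1)
def first_or_default(
    source: Iterable[Any],
    predicate: Optional[Callable[[Any], bool]] = None,
    default: Any = None,
) -> Any:
    """Return the first matching item of source, or default if there is none."""
    for item in source:
        if predicate is None or predicate(item):
            return item
    return default


def pipe(value: Any, *fns: Callable[[Any], Any]) -> Any:
    """Local stand-in for a pipe helper: apply fns left to right."""
    for fn in fns:
        value = fn(value)
    return value


def is_even(x: int) -> bool:
    return x % 2 == 0


first_even = pipe([1, 3, 4, 5], first_or_default(is_even))
fallback = pipe([1, 3, 5], first_or_default(is_even, default=-1))
```

The operator is written as a flat function with source as its first parameter, and the optional arguments no longer get in the way of supplying source last.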

dbrattli commented 2 years ago

FYI: @jim108dev https://github.com/microsoft/pyright/issues/3086#issuecomment-1046261020

Hugovdberg commented 2 years ago

I have used both the curry and curry_flipped decorators in some minor projects, and they work beautifully to create some concise and expressive code! Combined with pipe it's sometimes almost poetic 😀

dbrattli commented 2 years ago

Thanks for the feedback @Hugovdberg . Then I think I will merge this, and we can fix issues in separate PRs if any.