pytoolz / toolz

A functional standard library for Python.
http://toolz.readthedocs.org/

Idea: compose as an operator #523

Open samfrances opened 3 years ago

samfrances commented 3 years ago

I wanted to put forward an idea - could functoolz benefit from a compose "operator"? The API I have in mind is the following:

@composable
def foo(x):
    ...  # whatever

def bar(z):
    ...  # does something

baz = foo | bar   # equivalent to baz = compose(foo, bar)

One benefit to this would be that I think this would be easier to add types for. (See: https://github.com/python/mypy/issues/8449)

If this meets with approval, I would be happy to attempt a PR.

samfrances commented 3 years ago

I have a working typescript version of this (with left-to-right composition). Something analogous would be possible in typed Python, but with the added bonus of decorators and operator overloading to clean up the syntax.

Here's the typescript version:

class Pipe<In extends readonly any[], Out> {
    private constructor(public readonly fn: (...args: In) => Out) {}

    to<Result>(fn: (value: Out) => Result): Pipe<In, Result> {
        return new Pipe<In, Result>((...args: In) => fn(this.fn(...args)));
    }

    static from<In extends readonly any[], Out>(fn: (...args: In) => Out) {
        return new Pipe((...args: In) => fn(...args));
    }
}

const foo =
    Pipe.from((x: number) => x * 2)
        .to(x => x.toString())
        .to(s => s + s)
        .fn


cardoso-neto commented 3 years ago

I tried, but I couldn't get the type inference to work at all. Even limiting my functions to always receive a single parameter... Either I'm too stupid or the type hinting system is just not there yet.

from typing import Callable, TypeVar

X = TypeVar('X')
Y = TypeVar('Y')
Z = TypeVar('Z')

class composable:
    """
    Decorator to compose functions with the | operator.
    """
    def __init__(self, _func: Callable[[X], Y]):
        self._func: Callable[[X], Y] = _func

    def __call__(self, arg: X) -> Y:
        return self._func(arg)

    def __or__(
        self: Callable[[X], Y], other: Callable[[Y], Z]
    ) -> Callable[[X], Z]:
        def composed(arg: X) -> Z:
            return other(self._func(arg))
        return composable(composed)

@composable
def to_int(string_: str) -> int:
    return int(string_)

@composable
def to_str(int_: int) -> str:
    return str(int_)

new_func = to_int | to_str
# expected (string_: str) -> str
# result (_p0: X@__call__) -> Y@__call__

lol

ruancomelli commented 3 years ago

@cardoso-neto you can make this work by turning composable into a Generic type:

from __future__ import annotations # I added this line
# this import allows us to write composable[X, Z] (without quotes)
# instead of "composable[X, Z]" (with quotes)
# as the return type from composable.__or__

from typing import TypeVar, Callable, Generic

X = TypeVar('X')
Y = TypeVar('Y')
Z = TypeVar('Z')

class composable(Generic[X, Y]):
    """
    Decorator to compose functions with the | operator.
    """
    def __init__(self, _func: Callable[[X], Y]):
        self._func: Callable[[X], Y] = _func

    def __call__(self, arg: X) -> Y:
        return self._func(arg)

    def __or__(self, other: Callable[[Y], Z]) -> composable[X, Z]: # I changed this line
        def composed(arg: X) -> Z:
            return other(self._func(arg))
        return composable(composed)

@composable
def float_to_int(float_: float) -> int:
    return int(float_)

@composable
def int_to_str(int_: int) -> str:
    return str(int_)

float_to_str = float_to_int | int_to_str
# float_to_int: composable[float, int]
# int_to_str: composable[int, str]
# float_to_str: composable[float, str]
cardoso-neto commented 3 years ago

Epic. You fixed it. Ok, now type inference works through the whole chain. But do you know of a way to keep the ability to use N-parameter functions, i.e., variadic generic types? I'm imagining we'd need Python to support something like Callable[[*X], Y] so we could then unpack X in the return type hint of __or__. Another thing we lost was the documentation (__doc__, __annotations__, etc.) of all decorated functions as well as of the composed functions, but I assume a workaround for that is readily available in the stdlib, something like functools.wraps for callable classes.
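On the documentation point, here is a minimal sketch, assuming functools.update_wrapper is an acceptable fit for callable class instances (it copies __name__, __doc__ and similar attributes onto the wrapper and sets __wrapped__). The class name documented_composable is hypothetical, and type hints are left out for brevity:

```python
import functools


class documented_composable:
    """Hypothetical variant of composable that keeps the wrapped function's metadata."""

    def __init__(self, func):
        self._func = func
        # Copy __name__, __doc__ and similar attributes from func onto this
        # instance; update_wrapper also sets self.__wrapped__ = func.
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __or__(self, other):
        def composed(*args, **kwargs):
            return other(self(*args, **kwargs))
        return documented_composable(composed)


@documented_composable
def to_int(string_: str) -> int:
    """Parse a string as an int."""
    return int(string_)


print(to_int.__name__)  # to_int
print(to_int.__doc__)   # Parse a string as an int.
```

This only covers the decorated functions themselves; what the docstring of a composition should say is a separate question.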

ruancomelli commented 3 years ago

@cardoso-neto I think the only way to preserve the entire function type structure is to use typing.ParamSpec, like this:

from __future__ import annotations

from typing import Callable, Generic, TypeVar
from typing_extensions import ParamSpec

Y = TypeVar('Y')
Z = TypeVar('Z')
P = ParamSpec('P')

class composable(Generic[P, Y]):
    """
    Decorator to compose functions with the | operator.
    """
    def __init__(self, _func: Callable[P, Y]):
        self._func: Callable[P, Y] = _func

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> Y:
        return self._func(*args, **kwargs)

    def __or__(self, other: Callable[[Y], Z]) -> composable[P, Z]:
        def composed(*args: P.args, **kwargs: P.kwargs) -> Z:
            return other(self(*args, **kwargs))
        return composable(composed)

@composable
def float_and_str_to_int(float_: float, str_: str) -> int:
    return int(float_ + float(str_))

@composable
def int_to_str(int_: int) -> str:
    return str(int_)

weird_func = float_and_str_to_int | int_to_str
# float_and_str_to_int: composable[(float, str), int]
# int_to_str: composable[(int), str]
# weird_func: composable[(float, str), str]

Note that ParamSpec will be available from the typing module in Python 3.10. Also, one limitation here is that only the left-most function can have more than one required parameter. The second function (int_to_str in this case) must be callable with a single argument - the return value from the first one (float_and_str_to_int).

As to the function documentation, I'm sure there are many solutions out there that will allow composable to capture the decorated function info when used as a decorator, though I have never done this myself. However, what are your ideas regarding the docstring for the composed functions? For instance:

@composable
def f(int_: int) -> str:
    '''Converts ints to strs'''
    return str(int_)

@composable
def g(str_: str) -> None:
    '''Prints strings'''
    print(str_)

# the composable object copies the decorated function docstring:
f.__doc__ # "Converts ints to strs"
g.__doc__ # "Prints strings"

# however, what should we do with the composition?
(f | g).__doc__ # ??? what should we do here?
# __doc__ should *not* be "Converts ints to strs", because this is not what
# it does; nor should it be "Prints strings"

Perhaps a combination of both, like what is already done here in toolz? I would suggest using toolz.functoolz.compose here, because that is where the core functionality is:

from toolz.functoolz import compose

class composable(Generic[P, Y]):
    ...
    def __or__(self, other: Callable[[Y], Z]) -> composable[P, Z]:
        return compose(other, self)
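
For the docstring question, one possible shape is sketched below. It is only an illustration, not what toolz does: the steps argument, the _steps attribute, and the _describe helper are all hypothetical, and the numbered-list format is just one choice.

```python
def _describe(func):
    """Return 'name: docstring' for one step, or just the name if undocumented."""
    name = getattr(func, "__name__", type(func).__name__)
    doc = (getattr(func, "__doc__", None) or "").strip()
    return f"{name}: {doc}" if doc else name


class composable:
    def __init__(self, func, steps=None):
        self._func = func
        if steps is None:
            # Used as a plain decorator: copy the function's own docstring.
            self._steps = (func,)
            self.__doc__ = getattr(func, "__doc__", None)
        else:
            # Built by __or__: describe every step of the pipeline in order.
            self._steps = steps
            self.__doc__ = "\n".join(
                f"{i}. {_describe(step)}" for i, step in enumerate(steps, start=1)
            )

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __or__(self, other):
        def composed(*args, **kwargs):
            return other(self(*args, **kwargs))
        other_steps = other._steps if isinstance(other, composable) else (other,)
        return composable(composed, steps=self._steps + other_steps)


@composable
def f(int_: int) -> str:
    '''Converts ints to strs'''
    return str(int_)


@composable
def g(str_: str) -> str:
    '''Doubles strings'''
    return str_ + str_


print((f | g).__doc__)
# 1. f: Converts ints to strs
# 2. g: Doubles strings
```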
cardoso-neto commented 3 years ago

one limitation here is that only the left-most function can have more than one required parameter. The second function must be callable with a single argument - the return value from the first one.

I've been thinking about this and, though it is a bit niche, if the left function returned a tuple we could have a separate operator that also unpacked its returns so they'd work on a right function that had more than a single required parameter. I don't feel like abusing the operator overloading more than we already do, so we could follow the stdlib and call it starcompose after their itertools.starmap (that's my excuse for not being able to figure out an intuitive operator for it).

Rough sketch:

...
def starcompose(self, other: Callable[[Y], Z]) -> composable[P, Z]:
    def composed(*args: P.args, **kwargs: P.kwargs) -> Z:
        return other(*self(*args, **kwargs))  # unpacking, splatting, starring, asterisking, discombobulating
    return composable(composed)
...

Or, more in line with toolz/funcy:

import functools

# (snippet stolen from https://github.com/Suor/funcy/pull/62)
def unpack(func):
    @functools.wraps(func)
    def wrapper(arguments):
        return func(*arguments)
    return wrapper

And then

def starcompose(self, other: Callable[[Y], Z]) -> composable[P, Z]:
    return compose(unpack(other), self)

Though we'd have to figure out types for that, I'm sure it wouldn't be a problem for you. In similar fashion, we could have a doublestarcompose for kwargs that would do the same for functions that return a Mapping.
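
To make the two ideas concrete, here is an untyped sketch written as standalone functions rather than methods on composable. The names starcompose and doublestarcompose come from the comment above; the helpers parse_point and manhattan are made up for the example.

```python
import operator


def starcompose(first, second):
    """Call first, then unpack its iterable result into second's positional arguments."""
    def composed(*args, **kwargs):
        return second(*first(*args, **kwargs))
    return composed


def doublestarcompose(first, second):
    """Call first, then unpack its mapping result into second's keyword arguments."""
    def composed(*args, **kwargs):
        return second(**first(*args, **kwargs))
    return composed


def parse_point(text):
    """Hypothetical helper: turn 'x,y' into a dict of ints."""
    x, y = text.split(",")
    return {"x": int(x), "y": int(y)}


def manhattan(x, y):
    """Hypothetical helper: Manhattan distance from the origin."""
    return abs(x) + abs(y)


# divmod(7, 2) returns (3, 1), which is unpacked into operator.add(3, 1).
print(starcompose(divmod, operator.add)(7, 2))            # 4
# parse_point("3,-4") returns {"x": 3, "y": -4}, unpacked into manhattan(x=3, y=-4).
print(doublestarcompose(parse_point, manhattan)("3,-4"))  # 7
```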

ruancomelli commented 2 years ago

@eriknw is there any interest in having this feature (I mean the @composable decorator) supported by toolz? If so, I can write a PR for that in the next few days.

eriknw commented 2 years ago

Hey, thanks for the ping and offer @ruancomelli! Sorry for my delay in seeing this (new job and email workflow; my filters have been fixed).

I think composability is theoretically interesting. And fun. I'm curious, though: how useful will it be in practice? When would you have used it?

What API do you suggest?

Yeah, I think there's enough interest in this to get it into toolz, so I say go for it!

Regarding typing: I think we'll want to add typing to toolz eventually, so it's worth considering typing-friendliness. The trend, however, is that the longer we wait to add typing to toolz (e.g., the minimum Python version we choose to support), the better the typing situation will be. We'll get there someday, I promise!

ruancomelli commented 2 years ago

Hi, @eriknw! I'm so sorry for the long delay. I guess I didn't have time to reply when I first saw it, and then I ended up completely forgetting about it :grimacing:

I think composability is theoretically interesting. And fun. I'm curious, though: how useful will it be in practice? When would you have used it?

I don't think I have ever needed this myself; it's easy enough to write compose(f, g). Maybe it's just a nice shortcut?

The big benefit I see for using the pipe operator is that it is possible to get full type-correctness given the current Python typing environment (we still need ParamSpec though), something I don't believe we can have with the variadic compose. To elaborate a bit, we can imagine (and implement) composable.__or__ as compose2, where

def compose2(f: Callable[P, R1], g: Callable[[R1], R2]) -> Callable[P, R2]:
    def _composed(*args: P.args, **kwargs: P.kwargs) -> R2:
        return g(f(*args, **kwargs))
    return _composed

This function has all the type information we need. This also means that chaining multiple calls to compose2 (or the pipe operator) will also have very nice typing:

f: Callable[P, R1]
g: Callable[[R1], R2]
h: Callable[[R2], R3]

f | g # Callable[P, R2]
g | h # Callable[R1, R3]
f | g | h # Callable[P, R3]
g | f # Ooops, incompatible types!

In contrast, there is currently no way for us to annotate compose(*fs) in order for it to be this correct.

This leaves us with three ways for composing functions:

k = compose(f, g, h) # not type-correct in the general case
k = compose2(f, compose2(g, h)) # type-correct, but not ergonomic - imagine 3 or 5 functions?
k = f | g | h # type-correct and clean
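
For reference, a runnable version of compose2 with a small check, assuming Python 3.10+ (or typing_extensions on older versions) for ParamSpec. The add function and the pipeline are made up for the example. Since compose2 as defined above applies f first, the nesting that mirrors f | g | h is compose2(compose2(f, g), h). Folding with functools.reduce gives the same function at runtime, but a type checker probably cannot follow the types through the fold, which is the limitation being described.

```python
from functools import reduce
from typing import Callable, TypeVar

try:
    from typing import ParamSpec  # Python 3.10+
except ImportError:  # older Pythons: assumes typing_extensions is installed
    from typing_extensions import ParamSpec

P = ParamSpec("P")
R1 = TypeVar("R1")
R2 = TypeVar("R2")


def compose2(f: Callable[P, R1], g: Callable[[R1], R2]) -> Callable[P, R2]:
    """Left-to-right composition of exactly two callables: g(f(...))."""
    def _composed(*args: P.args, **kwargs: P.kwargs) -> R2:
        return g(f(*args, **kwargs))
    return _composed


def add(a: float, b: float) -> float:
    return a + b


# Nested calls keep the full parameter list of the first function.
nested = compose2(compose2(add, round), str)

# Same behaviour at runtime, but the static types are lost in the fold.
folded = reduce(compose2, (add, round, str))

print(nested(1.25, 2.5))  # 4
print(folded(1.25, 2.5))  # 4
```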

What API do you suggest?

I propose adding a composable class to toolz with the pipe operator implemented as shown in previous comments, and making Compose inherit from it. This way, composable can be used as a decorator, and functions composed via compose benefit from it.

Might there be confusion around left-to-right or right-to-left processing of functions?

I don't think so - (f | g)(x) should be equivalent to g(f(x)). This is how piping works in general I guess.

What about compose_left? Should we upgrade that to a class and add __or__?

If we implement this the way I suggested, no further changes are required here since compose_left calls compose, which in turn returns an instance of Compose.

Is this syntax special enough and nice enough to add to our other classes in functoolz: curry, juxt, and excepts?

I have no preference here - I would personally default to not doing it. But the change would be as simple as making all of those classes derive from composable instead of from object.

Which implementation choices are best for typing for e.g. mypy?

I would guess that any alternative implementations would have the same type-safety/limitations as this one. The implementation from my previous comment (https://github.com/pytoolz/toolz/issues/523#issuecomment-938319699) is enough to make sure we keep all type information that we need. In particular, composable[P, R] is just like a Callable[P, R] with support for the pipe operator.

Yeah, I think there's enough interest in this to get it into toolz, so I say go for it!

Awesome! I can write a PR for it this weekend. Let me know if you disagree with any of the points above.

Regarding typing: I think we'll want to add typing to toolz eventually, so it's worth considering typing-friendliness. The trend, however, is that the longer we wait to add typing to toolz (e.g., the minimum Python version we choose to support), the better the typing situation will be. We'll get there someday, I promise!

You promised!! :laughing:

~Actually, inline type-annotations (like x: int) can be added as soon as toolz drops support for Python 3.5 :see_no_evil:~ Actually, PEP 484 got into Python 3.5, my bad! So we have all the syntax we need for this. But typing.ParamSpec is only available in Python 3.10. The alternative would be to add typing_extensions as a dependency to toolz. How do you feel about this?

If we don't have typing.ParamSpec or typing_extensions.ParamSpec, we can still implement composable, but we will lose type-information along the way. Probably we would have to rewrite

class composable(Generic[P, Y]):
    ...
    def __or__(self, other: Callable[[Y], Z]) -> composable[P, Z]:
        return compose(other, self)

as

class composable(Generic[Y]):
    ...
    def __or__(self, other: Callable[[Y], Z]) -> composable[Z]:
        return compose(other, self)

The huge downside of doing this is that we no longer know what parameters composable.__call__ accepts.

ruancomelli commented 2 years ago

I just wrote a PR for adding this feature to toolz :grin:

mentalisttraceur commented 2 years ago

A thought: using | for composition has a funny interplay with the type union | added in Python 3.10.

This is probably a non-issue, but I wanted to mention it so others can give it a think-over too....

For example, let's say I use the composable decorator in #531 on a class I'm defining:

@composable
class Foo:
    ...

Now Foo construction can be easily composed, but it lost the type union shorthand.

Also, the overload of | on the composable wrapper class instance takes precedence over type.__or__, so even one such class will "contaminate" any type union: str | int | Foo | Bar | Qux becomes compose(Qux, Bar, Foo, typing.Union[str, int]), which doesn't work either as a callable composition or as a type union.

Normally we can just say "then don't do that", but

  1. someone else may have defined Foo, and we might be just using their module without expecting to worry about "did the module I imported Foo from wrap it in a decorator that breaks | type union syntax?"

  2. that "someone else" could think "oh it could be nice to make all my classes composable" and not realize it creates a land mine for type hint breakage for users.

So I see approximately three reasonable options:

  1. just stay on course, trust the community of users to figure it out, and maybe add a warning in the docs about using @composable on classes,
  2. use a different operator than |, one that seems safer from future native-to-Python overload on any callables (I had a user suggest g+f for compose(g, f)), or
  3. have composable(a_callable) raise a TypeError if isinstance(a_callable, type) is true, and maybe hint in the docs that if you want a composable class constructor you can always wrap the class in a function (like how functoolz.compose wraps functoolz.Compose).

    (By the way, until thinking about this exact dilemma, I dismissed the Compose/compose split as just a way to satisfy rigidly literal style rules, but now I'm thinking this | overload type union dilemma might be evidence of a deeper "true" reason to maintain a clean split between types (that happen to incidentally also be callables which construct things of that type) and function-like callables (which are instances of some type but not types in themselves).)
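
Option 3 could look roughly like the sketch below. The class is a stripped-down stand-in for the one in #531, and Foo and make_foo are made up for the example.

```python
class composable:
    """Stripped-down stand-in for the proposed wrapper, with option 3 applied."""

    def __init__(self, func):
        if isinstance(func, type):
            # Refuse classes so that | on a class keeps meaning "type union".
            raise TypeError(
                "composable() does not accept classes; "
                "wrap the constructor in a function instead"
            )
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __or__(self, other):
        def composed(*args, **kwargs):
            return other(self(*args, **kwargs))
        return composable(composed)


class Foo:
    def __init__(self, value):
        self.value = value


def make_foo(value):
    """Function wrapper around the Foo constructor, as option 3 suggests."""
    return Foo(value)


try:
    composable(Foo)
except TypeError as error:
    print(error)

pipeline = composable(make_foo) | (lambda foo: foo.value * 2)
print(pipeline(21))  # 42
```

One consequence worth noting: with this check, composable(str) or composable(int) is rejected as well, because builtins like str and int are classes. A class can still appear on the right-hand side of |, since only the wrapped callable is checked.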

mentalisttraceur commented 2 years ago

Continuing the above and replying to https://github.com/pytoolz/toolz/pull/531#issuecomment-1173684618 :

For what it's worth, in my library where I sketched out a solution for this, here's how I solved it:

  1. I had a composable decorator+wrapper whose | overload

    • takes precedence over type union |, so for example SomeClass | composable(str) would compose as str(SomeClass(...)) as expected;
    • has a general if not callable(other): return NotImplemented, and since type unions aren't callable, that forced cases like str | int | composable(float) to at least fail clearly and explicitly: TypeError: unsupported operand type(s) for |: 'types.UnionType' and 'composable'.
  2. I had a separate @composable_constructor decorator+wrapper, which would defer to type union |. This was for the people who want to decorate a class definition so that the class is always composable.

    • I was going to say something in my docs like "if you want to decorate a class so that calling the class as a constructor is composable like a function, use @composable_constructor instead of @composable - that way, normal class functionality such as | for type unions still works as developers would expect".

    • I also made it so that if some code gave you a class wrongly/unsuitably decorated with @composable, you could just wrap it with composable_constructor, and the outermost of those two wrappers would win, even if other wrappers were layered on as well - so if you ever had a class you wanted to be absolutely sure properly participated in a type union, you could hit it with composable_constructor and not even have to worry about whether or not it was wrapped with composable. [I eventually ended up removing this feature, because it was getting too complicated to handle this correctly in the edge-cases. But most cases are covered by accessing .__wrapped__ or using inspect.unwrap.]

    • (I also had a composable_instances decorator+wrapper, for the probably-more-common case where each instance of a class was wanted to be composable, rather than the constructor call - it basically just took the result of calling the class and wrapped it with composable.)
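
The "return NotImplemented for non-callables" rule from point 1 can be sketched as follows. This is a simplified illustration, not the code from my library: there is no wrapt, no async support, and no composable_constructor.

```python
class composable:
    """Simplified illustration of the NotImplemented rule."""

    def __init__(self, func):
        if not callable(func):
            raise TypeError("composable() argument must be callable")
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __or__(self, other):
        # self | other: run self first, then other.
        if not callable(other):
            return NotImplemented
        return composable(lambda *args, **kwargs: other(self._func(*args, **kwargs)))

    def __ror__(self, other):
        # other | self: run other first, then self. Python falls back to this
        # when the left operand's own | does not handle a composable.
        if not callable(other):
            return NotImplemented
        return composable(lambda *args, **kwargs: self._func(other(*args, **kwargs)))


# A plain callable on the left is handled through __ror__.
strip_then_int = str.strip | composable(int)
print(strip_then_int(" 42 "))  # 42

# A non-callable operand makes both sides give up, so Python raises TypeError.
try:
    composable(int) | 3
except TypeError as error:
    print(type(error).__name__)  # TypeError
```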

But... in my library this kind of split made more sense because I was trying to thoroughly foresee and handle the decorator and wrapper use-cases as well as possible. That's probably beyond the scope that toolz wants to cover.

For example, my library used wrapt to provide much more thorough transparency - you could slap @composable on a function and all introspection and attribute accesses would still work through the composable wrapper, and if it was a callable object with other features being wrapped then any other operator overloads that object had would still work through it as well, and so on. I doubt toolz wants to pull wrapt in as a dependency (then again, toolz has several wrapper classes which could probably become better if they were built on top of wrapt).

Vaguely relevant aside: I also sketched out how I would do curry as a "proper" (debatable) wrapt-style wrapper combined with my `composable`. (Note that this variant doesn't support as many Python versions as toolz does, since it just assumes inspect.signature support and positional-only argument syntax, though that's trivial to replace):

```python
from functools import partial as _partial
from inspect import signature as _signature

@composable_constructor
@composable_instances
class curry(_CallableObjectProxy):
    def __init__(self, function):
        super().__init__(function)

    def __call__(self, /, *args, **kwargs):
        applied = _partial(self.__wrapped__, *args, **kwargs)
        signature = _signature(applied.func)
        try:
            # Succeeds only if every required argument has been supplied.
            signature.bind(*applied.args, **applied.keywords)
        except TypeError:
            pass
        else:
            return applied()
        # Raises TypeError if the arguments could never fit the signature.
        signature.bind_partial(*applied.args, **applied.keywords)
        return composable(type(self)(applied))

    __repr__ = composable.__repr__
    __reduce_ex__ = composable.__reduce_ex__
    __copy__ = composable.__copy__
    __deepcopy__ = composable.__deepcopy__
```

Anyway, personally, my journey when it comes to compose as an operator went like this:

  1. "nah, I don't think it's really that worth doing, and I don't want to be the one doing it",
  2. "wow, there are actually a lot of tricky pitfalls to this design space which I'd want to carefully get right, like composing async functions, error-handling, preserving introspectability, composing with other wrappers, Python suddenly recently using | for type unions... I'd hate to have myself or others dealing with problems caused by implementations that don't get those right, and users keep asking for this feature, so I guess I should go ahead and figure out a solution that takes care of all of that",
  3. "having done it, and sat on it for a little while, it turns out I just don't have the passion to provide it - I'm not the guy to carry this torch for others".

So that's all I have to contribute for now on the matter.

If I was going to do it, that's how I'd do it, and there's some very long-winded design ramblings in https://github.com/mentalisttraceur/python-compose-operator/issues/1 and https://github.com/mentalisttraceur/python-compose/pull/1 covering almost the full thought process leading up to why I did it that way, including several false-starts and dead-ends along the way.

But how I did it might not be the best fit for toolz, and I don't know what the best solution within toolz would be, if any.

mentalisttraceur commented 1 year ago

compose-operator is back! So you can get compose as | from there, and I'm hoping this enables experimentation and community experience gain which helps toolz design/implement its own version if it still wants to.

One neat thing that caused me to revive it is that just one tiny change made it magically elegantly combine with stuff like toolz.curry:

import operator

import toolz
from compose_operator import composable

curry = composable(toolz.curry)

add = curry(operator.add)

(add(1) | float | str)(1)  # returns "2.0"

It doesn't have type hints yet - it would probably be trivial to add type hints for the non-async case, but I'm hoping we can figure out the full sync-or-async type hints. See https://github.com/mentalisttraceur/python-compose-operator/issues/3 if you want to help or follow that.

(If type-checking matters more than the operator syntactic sugar for you, my compose implementation has type hints.)

I'll promote compose-operator to stable v1.0.0 within a day or so unless someone notices a serious issue (but there shouldn't be any - I gave it a lot of careful design thought last year, and the tests are fairly thorough).