dry-python / returns

Make your functions return something meaningful, typed, and safe!
https://returns.rtfd.io
BSD 2-Clause "Simplified" License

Add transducers support #610

Open sobolevn opened 4 years ago

sobolevn commented 4 years ago

https://medium.com/javascript-scene/transducers-efficient-data-processing-pipelines-in-javascript-7985330fe73d https://dev.to/romanliutikov/understanding-transducers-in-javascript-4pdg

Currently we use very straightforward, eager helpers. Sometimes we might need more efficient ones. The links above explain why we need transducers.
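
To make the efficiency point concrete, here is a rough sketch (not library code): chaining eager helpers materializes an intermediate collection for every step, while a transducer-style fold walks the data only once.

from functools import reduce

data = range(10)

# Eager pipeline: ``mapped`` is a throwaway intermediate list.
mapped = [number * 2 for number in data]
result_eager = [number for number in mapped if number > 5]

# Transducer-style: a single reduce pass, no intermediate collections.
def step(acc, number):
    doubled = number * 2
    if doubled > 5:
        acc.append(doubled)
    return acc

result_single_pass = reduce(step, data, [])
assert result_eager == result_single_pass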

thepabloaguilar commented 3 years ago

https://ramdajs.com/docs/#expand

thepabloaguilar commented 3 years ago

https://github.com/cognitect-labs/transducers-python

thepabloaguilar commented 3 years ago

We can implement this based on the link above; it was inspired by the Clojure implementation, which has some cool features like early termination using a Reduced sentinel!
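
For reference, a minimal sketch (assumed names, not the transducers-python API) of how a Reduced sentinel enables early termination: the reducing loop stops as soon as a step wraps its accumulator in Reduced.

class Reduced:
    """Signals that the reduction should stop with this value."""
    def __init__(self, value):
        self.value = value

def reduce_with_early_exit(step, initial, iterable):
    acc = initial
    for item in iterable:
        acc = step(acc, item)
        if isinstance(acc, Reduced):  # early termination
            return acc.value
    return acc

def take(count):
    """Transducer keeping only the first ``count`` items."""
    def transducer(reducing):
        seen = 0
        def step(acc, item):
            nonlocal seen
            seen += 1
            acc = reducing(acc, item)
            return Reduced(acc) if seen >= count else acc
        return step
    return transducer

append = lambda acc, item: acc + [item]
print(reduce_with_early_exit(take(2)(append), [], range(10**9)))  # [0, 1]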

In fact, almost everything there can be translated to returns with typing, especially the transducer parts! I've written the transducers for map and filter:

from typing import Callable, List, TypeVar

_T = TypeVar('_T')

def mapping(
    function: Callable[[_T], _T],
) -> Callable[
    [Callable[[List[_T], _T], List[_T]]],
    Callable[[List[_T], _T], List[_T]],
]:
    """Transducer: applies ``function`` to each item before it is reduced."""
    def reducer(
        reducing: Callable[[List[_T], _T], List[_T]],
    ) -> Callable[[List[_T], _T], List[_T]]:
        def map_(
            collection: List[_T],
            item: _T,
        ) -> List[_T]:
            return reducing(collection, function(item))
        return map_
    return reducer

def filtering(
    predicate: Callable[[_T], bool],
) -> Callable[
    [Callable[[List[_T], _T], List[_T]]],
    Callable[[List[_T], _T], List[_T]],
]:
    """Transducer: only passes items for which ``predicate`` is true."""
    def reducer(
        reducing: Callable[[List[_T], _T], List[_T]],
    ) -> Callable[[List[_T], _T], List[_T]]:
        def filter_(
            collection: List[_T],
            item: _T
        ) -> List[_T]:
            if predicate(item):
                return reducing(collection, item)
            return collection
        return filter_
    return reducer
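
A possible usage sketch for the two helpers above, folding with functools.reduce (``append`` here is an ad-hoc reducing function, not part of the proposal):

from functools import reduce

def append(collection, item):
    collection.append(item)
    return collection

is_even = lambda number: number % 2 == 0
double = lambda number: number * 2

# Compose by hand: keep the even numbers, double them, append to the accumulator.
step = filtering(is_even)(mapping(double)(append))
print(reduce(step, [1, 2, 3, 4], []))  # [4, 8]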

__P.S.: The types in my code are still wrong, because transducers should work with any kind of input and here I'm fixing them to List[_T].__

sobolevn commented 3 years ago

Great stuff!

thepabloaguilar commented 3 years ago

Maybe it's because I'm so tired after many hours in front of a computer, but I can't figure out how to solve the problem I'm getting.

Transducer map implementation:

from typing import Callable, TypeVar

_ValueType = TypeVar('_ValueType')
_NewValueType = TypeVar('_NewValueType')

_AccType = TypeVar('_AccType')
_NewAccType = TypeVar('_NewAccType')

def tmap(
    function: Callable[[_ValueType], _NewValueType],
) -> Callable[
    [Callable[[_AccType, _NewValueType], _AccType]],
    Callable[[_AccType, _ValueType], _AccType],
]:
    def reducer(
        reducing: Callable[[_AccType, _NewValueType], _AccType],
    ) -> Callable[[_AccType, _ValueType], _AccType]:
        def map_(acc: _AccType, value: _ValueType) -> _AccType:
            return reducing(acc, function(value))
        return map_
    return reducer
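
For reference, the intended single-step usage would look roughly like this (``append`` is an ad-hoc reducing function for illustration, not library code):

from functools import reduce
from typing import List

def to_str(number: int) -> str:
    return str(number)

def append(collection: List[str], item: str) -> List[str]:
    collection.append(item)
    return collection

step = tmap(to_str)(append)         # Callable[[List[str], int], List[str]]
print(reduce(step, [1, 2, 3], []))  # ['1', '2', '3']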

Every type seems correct. But when I try to make a composition, I get an error, even though it should work!

Test case:

from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

f_step = tmap(to_str)
second_step = f_step(tmap(to_str))

Test output:

E   Actual:
E     main:14: error: Argument 1 has incompatible type "Callable[[Callable[[_AccType, Any], _AccType]], Callable[[_AccType, Any], _AccType]]"; expected "Callable[[Any, str], Any]" (diff)
E     main:14: error: Cannot infer function type argument (diff)

Like, Any shouldn't be there.


Other test cases work fine:

from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

reveal_type(tmap(to_str))  # N: Revealed type is 'def [_AccType] (def (_AccType`-3, builtins.str*) -> _AccType`-3) -> def (_AccType`-3, builtins.int*) -> _AccType`-3'

from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

def append(collection: List[str], item: str) -> List[str]:
    ...

reveal_type(tmap(to_str)(append))  # N: Revealed type is 'def (builtins.list*[builtins.str], builtins.int) -> builtins.list*[builtins.str]'

from typing import List
from returns.transducers import tmap

def to_str(number: int) -> str:
    ...

def append(collection: List[str], item: str) -> List[str]:
    ...

my_list: List[str]
reveal_type(tmap(to_str)(append)(my_list, 2))  # N: Revealed type is 'builtins.list*[builtins.str]'

thepabloaguilar commented 3 years ago

I'd love to know if you have any tips @sobolevn!!

thepabloaguilar commented 3 years ago

Btw, the current implementation works fine at runtime:

>>> def append(a, v):
...     a.append(v)
...     return a

>>> tmap(str)(tmap(int)(append))([], 1)
[1]

thepabloaguilar commented 3 years ago

Ok, I think the composition I've made is wrong! I'll look at it tomorrow.

thepabloaguilar commented 3 years ago

@sobolevn I remember a discussion from #451. What would be a great name for this feature?

Options:

thepabloaguilar commented 3 years ago

> Ok, I think the composition I've made is wrong! I'll look at it tomorrow.

Well, my composition was wrong, and so was my thinking hahaha

The types are working correctly with the composition:

>>> from returns.transducers import tmap
>>> append = lambda a, v: a.append(v) or a
>>> tmap(int)(tmap(str)(append))([], '1')
['1']

thepabloaguilar commented 3 years ago

@sobolevn You can see the progress in this draft PR (still a draft) or in a very early preview of the documentation here.

thepabloaguilar commented 3 years ago

I have a suggestion: with the current implementation, the user needs to build "everything" by hand, like:

xform = compose(
    tmap(func),
    tfilter(func),
)

result = transduce(
    xform,
    reducing_func,
    initial,
    data_to_process,
)
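
(For clarity, compose above is plain function composition; a variadic sketch, not necessarily the exact helper the PR ships, could be:)

from functools import reduce

def compose(*functions):
    """compose(f, g, h)(x) == f(g(h(x))), i.e. right-to-left composition."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)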

My idea is to encapsulate this in a class:

t = (
    Transformation
        .map(func)
        .filter(func)
)

result = t(
    reducing_func,
    initial,
    data_to_process,
)

Using a class makes it clear to users that they are receiving a transformation instead of a random function!
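
A rough, untyped sketch of what such a class could look like (hypothetical, not the final API; it assumes the tmap/tfilter transducers discussed above and simply applies them when called):

from functools import reduce

class Transformation:
    def __init__(self, transducers=()):
        self._transducers = tuple(transducers)

    def map(self, function):
        return Transformation(self._transducers + (tmap(function),))

    def filter(self, predicate):
        return Transformation(self._transducers + (tfilter(predicate),))

    def __call__(self, reducing_function, initial, iterable):
        step = reducing_function
        # Wrap right-to-left so transducers run in declaration order.
        for transducer in reversed(self._transducers):
            step = transducer(step)
        return reduce(step, iterable, initial)

# usage, mirroring the example above:
# t = Transformation().map(func).filter(func)
# result = t(reducing_func, initial, data_to_process)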

sobolevn commented 3 years ago

Yeah, why not!

thepabloaguilar commented 3 years ago

Well, after some tests I can't find a great design 😞 The user must provide a function, and that function can be different (map != filter). My idea now is to provide .from_* methods (from_map, from_filter) to initialize the instance, but I think it is not worth it! For each transducer function we will need to add an initialization method.

Do you have any suggestions, @sobolevn? If not, I'll continue with the normal solution.

sobolevn commented 3 years ago

> For each transducer function we will need to add an initialization method.

Sorry, I don't understand. Can you please show me the code?

thepabloaguilar commented 3 years ago

Sure, but think of our Result container, where we have .from_value(...) and .from_failure(...). It's the same concept, but we will have more than two from_* methods:

from typing import Callable, Generic, TypeVar

_ValueType = TypeVar('_ValueType')
_NewValueType = TypeVar('_NewValueType')
_AccValueType = TypeVar('_AccValueType')

class Transformation(Generic[_AccValueType, _ValueType]):
    def map(
        self,
        function: Callable[[_ValueType], _NewValueType],
    ) -> 'Transformation[_AccValueType, _NewValueType]':
        ...

    @classmethod
    def from_map(
        cls,
        function: Callable[[_ValueType], _NewValueType],
    ) -> 'Transformation[_AccValueType, _NewValueType]':
        ...

    def filter(
        self,
        function: Callable[[_ValueType], _ValueType],
    ) -> 'Transformation[_AccValueType, _ValueType]':
        ...

    @classmethod
    def from_filter(
        cls,
        function: Callable[[_ValueType], _ValueType],
    ) -> 'Transformation[_AccValueType, _ValueType]':
        ...

    # OTHER TRANSDUCERS THAT HAVE DIFFERENT FUNCTION SIGNATURES

thepabloaguilar commented 3 years ago

Oh, wait. I think I cannot do that: the idea behind transducers is to keep these rules out of the process. Transducers don't care about the user's input or output types. If users write Transformation[List[str], int], they will be putting that rule back in! 🤔

thepabloaguilar commented 3 years ago

This is what I will do: continue with the first approach, and then I can think about improvements!

thepabloaguilar commented 3 years ago

@sobolevn I need a bit of your help; I'm planning to finish the implementation this weekend!

We have the transduce function right here:

def transduce(
    xform: Callable[
        [Callable[[_AccValueType, _ValueType], _AccValueType]],
        Callable[[_AccValueType, _ValueType], _AccValueType],
    ],
    reducing_function: Callable[[_AccValueType, _ValueType], _AccValueType],
    initial: _AccValueType,
    iterable: Iterable[_ValueType],
) -> _AccValueType:
    ...
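
For context, the body can be essentially a thin wrapper over functools.reduce; a minimal untyped sketch (not necessarily the exact PR code):

from functools import reduce

def transduce(xform, reducing_function, initial, iterable):
    # ``xform`` turns the plain reducing function into the transformed one,
    # then a single fold does all the work.
    return reduce(xform(reducing_function), iterable, initial)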

Sometimes I just want to pass an empty list as the initial value, like:

transduce(
    xform=tfilter(is_even),
    reducing_function=append,
    initial=[],
    iterable=my_list,
)

And I receive this error from mypy:

error: Argument "reducing_function" to "transduce" has incompatible type "Callable[[List[int], int], List[int]]"; expected "Callable[[List[<nothing>], int], List[<nothing>]]"

To fix this issue, I need to annotate that list:

initial_list: List[int]

transduce(
    xform=tfilter(is_even),
    reducing_function=append,
    initial=initial_list,
    iterable=my_list,
)

Do you know a way to avoid that??

sobolevn commented 3 years ago

> Do you know a way to avoid that??

No, this is a pretty annoying bug in how mypy works. The thing is: [] is sometimes List[<nothing>] and sometimes List[Any]. Right now it does not work the way we need 😞
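
A tiny illustration of that quirk, as far as I can tell: a bare [] carries no element type of its own, so mypy has to infer one from somewhere, and in generic calls it may settle on List[<nothing>].

from typing import List

x = []             # mypy: error: Need type annotation for "x"
y: List[int] = []  # ok: the annotation pins the element type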

thepabloaguilar commented 3 years ago

Sad 😭 😭 So, I'll continue as is!

thepabloaguilar commented 2 years ago

I'm continuing this issue, and after a lot of thought I've decided that transducers are a cool thing, but we don't need them in Python in the same way they are implemented in Clojure! The thing is, Python has generators, and chaining them results in the same behavior as transducers: instead of creating a lot of intermediate iterables, they process each element from start to end!

>>> l = [1, 2, 3, 4, 5]
>>> def mapf(num: int) -> int:
...     print(num)
...     return num
...
>>> def filterf(num: int) -> bool:
...     print(num)
...     return True
...
>>> list(filter(filterf, map(mapf, l)))
1
1
2
2
3
3
4
4
5
5
[1, 2, 3, 4, 5]

But I do think it would be useful for us to provide better tooling for generators, like Clojure has for transducers! So my idea is to implement that tooling now; something roughly like the sketch below.
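
Purely as a sketch of what that tooling could mean (hypothetical pipe_iter helper, not an existing returns API):

from typing import Callable, Iterable, Iterator

def pipe_iter(*steps: Callable[[Iterable], Iterator]) -> Callable[[Iterable], Iterator]:
    """Compose iterator transformations left to right; evaluation stays lazy."""
    def run(iterable: Iterable) -> Iterator:
        for step in steps:
            iterable = step(iterable)
        return iter(iterable)
    return run

double_evens = pipe_iter(
    lambda numbers: (number for number in numbers if number % 2 == 0),  # like tfilter
    lambda numbers: (number * 2 for number in numbers),                 # like tmap
)
print(list(double_evens([1, 2, 3, 4])))  # [4, 8]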

WDYT @sobolevn??

sobolevn commented 2 years ago

The idea sounds interesting! Please, feel free to send a prototype.