JulienPalard / Pipe

A Python library to use infix notation in Python
MIT License

Adding batch pipe #85

YashAgarwal closed 1 year ago

JulienPalard commented 1 year ago

It's funny it's not implemented yet!

What about a shorter implementation that relies a bit more on the stdlib, like this?

    # needs: from itertools import islice (and Python 3.8+ for :=)
    iterator = iter(iterable)
    while chunk := tuple(islice(iterator, size)):
        yield chunk

I've not tested it though.
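
For anyone who wants to try it, a minimal check could look like the sketch below. It wraps the snippet in a plain generator function (called chunks here purely for illustration, without the Pipe decorator) and runs it on a few finite inputs. It assumes Python 3.8+ for the walrus operator.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive tuples of at most `size` items from any iterable."""
    iterator = iter(iterable)
    # islice pulls up to `size` items; an empty tuple is falsy and ends the loop.
    while chunk := tuple(islice(iterator, size)):
        yield chunk


# A list whose length is not a multiple of the chunk size:
print(list(chunks([1, 2, 3, 4, 5, 6, 7], 3)))
# [(1, 2, 3), (4, 5, 6), (7,)]

# A generator, which supports neither len() nor slicing:
print(list(chunks((n * n for n in range(5)), 2)))
# [(0, 1), (4, 9), (16,)]

# An empty input yields no chunks at all:
print(list(chunks([], 3)))
# []
```

The last chunk is simply shorter when the input does not divide evenly, and nothing is yielded for an empty input.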

eloonstra commented 1 year ago

I think the solution above is nice, but I'd personally do something like:

    for i in range(0, len(iterable), size):
        yield iterable[i:i + size]

JulienPalard commented 1 year ago

iterable[i:i + size] won't work on generators, because they don't support slicing, while islice does. len(iterable) can't work on generators either. See:

    from pipe import Pipe, take
    from itertools import count, islice

    # islice-based version: works on any iterable, including infinite ones.
    @Pipe
    def chunks(iterable, size):
        iterator = iter(iterable)
        while chunk := tuple(islice(iterator, size)):
            yield chunk

    for chunk in count() | chunks(5) | take(10):
        print(chunk)

    # prints:
    # (0, 1, 2, 3, 4)
    # (5, 6, 7, 8, 9)
    # (10, 11, 12, 13, 14)
    # (15, 16, 17, 18, 19)
    # (20, 21, 22, 23, 24)
    # (25, 26, 27, 28, 29)
    # (30, 31, 32, 33, 34)
    # (35, 36, 37, 38, 39)
    # (40, 41, 42, 43, 44)
    # (45, 46, 47, 48, 49)

    # Slicing-based version: needs len() and indexing, so it only works on sequences.
    @Pipe
    def batch(iterable, size):
        for i in range(0, len(iterable), size):
            yield iterable[i:i + size]

    for chunk in count() | batch(5) | take(10):
        print(chunk)

    # TypeError: object of type 'itertools.count' has no len()

JulienPalard commented 1 year ago

Implemented it in a8f0226c8beec6b4d9dc9714c338404da53d7006, thanks for the idea!