JulienPalard / Pipe

A Python library to use infix notation in Python
MIT License
1.96k stars 113 forks source link

add possibility to chain pipes before execution #88

Open hubald opened 1 year ago

hubald commented 1 year ago
JulienPalard commented 1 year ago

Interesting!

This probably completly changes the way a long Pipe is built by the interpreter, before it was "left to right", as there was only rors, now it builds right to left with ors, up to the data wich is eaten by the ror? I've not tried it yet.

Do you have any real use case for this?

hubald commented 1 year ago

No, it's evaluated like before - from left to right. I stepped through the code. OR-operator is used only if no iterable was passed. My real-life use case is that I pass different Pipe methods to an evaluation function as parameter - and here comes the problem that the Pipes have to be assigned to a variable and hence are "evaluated" without the iterable first. Something like the following:

class Database:
...
  def filter_lines_by(predicate):
    filtered_lines = list(self.lines | predicate) # __ror__ is used
    #further processing

db = Database()
errors = db.filter_lines_by(has_error_code | reported_by_end_customer | older_than_days(30)) # __or__ is used
JulienPalard commented 1 year ago

I understand the use case.

Can you please add some explanations in the README about it? Maybe with your use case as an example, or someting inspired from your use case?

I'll have to run it step by step too to correctly ingest how it works before merging, I'll have the time next week I hope.

I still think it changes the "building" order (as __or__ should be picked with a higher priority than __ror__), which is not an issue: I also think that it changes nothing at runtime.

I mean, in a|b|c I think it is currently built via c.__ror__(b).__ror__(a), while with your PR I think it's built via: b.__or__(c).__ror__(a). Anyway, I'll inspect this closely after holidays :)

jenisys commented 1 year ago

Interesting (now I understand why sspipe provided operator.ror and operator.or).

There is at least one other solution to provide pipeline recipes (sorry, I needed to come up with a name for it). A pipeline recipe is a prepared, but often complex pipeline that can be easily be applied to any iterable. But the solution here covers only part of what is suggested above.

def test_pipeline_recipe_with_many_pipes():
    @Pipe
    def select_even_numbers_and_multiply_by(iterable, factor):
        yield from (iterable | select_even_numbers | multiply_by(factor))

    numbers = range(8)
    results = list(numbers | select_even_numbers_and_multiply_by(3))
    expected = [0, 6, 12, 18]
    assert results == expected

@JulienPalard Maybe you could add a section in the README with this information (or I provide a pull-request).

JulienPalard commented 1 year ago

Took the time to review your code, looks like this PR needs love...

It feel strange that your __or__ returns a Pipe that accepts arguments. And it feels arbitrary to pass those arguments to the first Pipe of the named chain.

In other words, with your code, this works:

batch_any = batched | take(2)
batch_two = batch_any(2)

print(list("hello world" | batch_two))  # Gives [('h', 'e'), ('l', 'l')]

Here batched is a "partial Pipe awaiting arguments", so batch_any is still partial on its first part, and batch_two is created by fulfilling the missing arg.

Why not giving the arguments to the last one:

batch_any = batched(2) | take
batch_two = batch_any(2)

?

Or why not "dispatching the right named argument to the right pipe"?

In the face of ambiguity, refuse the temptation to guess.

so I propose simplifying a bit and refuse extra arguments:

    def __or__(self, other):
        return self.__class__(lambda iterable: other.function(self.function(iterable)))

how does it looks like to you?