EntilZha / PyFunctional

Python library for creating data pipelines with chain functional programming
http://pyfunctional.pedro.ai
MIT License
2.41k stars, 132 forks

Add the split transformation #175

Closed by AlexandreKempf 1 year ago

AlexandreKempf commented 1 year ago

Hello everyone,

This is my first PR ever, but I'm willing to learn, so don't hesitate to be picky.

I was doing Advent of Code 2022 with the constraint of using only PyFunctional, so that I could learn the library. I got stuck on the first problem because I felt a split function was missing.

The goal of such a function is to split a sequence into several lists using a predicate. The predicate is applied to each element: when it returns True, the current list ends and a new one starts (the matching element itself is dropped); when it returns False, the element is added to the current list. An example is worth a million words, so here it is:

seq([1, 2, 3, None, 4, 5, 6, 7, None, 8, 9]).split(lambda x: x is None)
# => seq([[1, 2, 3], [4, 5, 6, 7], [8, 9]])
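
For readers who want to try the idea outside the library, here is a minimal sketch of the behavior in plain Python. The name `split_when` is hypothetical and is not part of PyFunctional. The handling of edge cases is also an assumption: the example above only shows separators in the middle of the sequence, so this sketch keeps empty runs (as `str.split(sep)` does) rather than discarding them.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def split_when(items: Iterable[T], is_separator: Callable[[T], bool]) -> Iterator[List[T]]:
    """Yield the runs of items found between separators; separators are dropped.

    Hypothetical helper, not part of PyFunctional. Assumption: empty runs
    produced by leading, trailing, or adjacent separators are kept.
    """
    chunk: List[T] = []
    for item in items:
        if is_separator(item):
            # A separator closes the current run and starts a new one.
            yield chunk
            chunk = []
        else:
            chunk.append(item)
    # Emit whatever was collected after the last separator.
    yield chunk


result = list(split_when([1, 2, 3, None, 4, 5, 6, 7, None, 8, 9], lambda x: x is None))
print(result)  # [[1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Writing it as a generator keeps the function lazy, so it also works on iterators that can only be consumed once.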

I tried to follow the PR guidelines, but if I missed something, feel free to comment ;)

Have a good day!

EntilZha commented 1 year ago

Hi @AlexandreKempf, thanks for the PR! One thing you could look into is using functional.extend to register your own functions (see https://github.com/EntilZha/PyFunctional/issues/113).

For adding functions to the API, I like to at least know it has precedent elsewhere (e.g., in Apache Spark, Scala collections, Rust iterators, or other similar FP libraries). Do you know if there is a split function somewhere? Not to say it is required, but it would be great to know :).

Thanks!

AlexandreKempf commented 1 year ago

Hi EntilZha, and thank you for your reply!

I discovered the extend method earlier today and I was amazed :+1:! I'm not sure I would have opened the PR had I known about extend, but here we are :p

I couldn't find an exact equivalent, but there are similar functions in Elixir (chunk_by) and in toolz (partitionby). However, they split not on a boolean but on a change in the function's output.

If you're interested in a split function, what behavior would you like it to have?
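
To make the difference concrete, here is a small sketch using the standard library's `itertools.groupby`, which also starts a new group whenever the key function's output changes. It stands in for the Elixir and toolz functions mentioned above and is not taken from either library. The helper name `is_none` is only for illustration.

```python
from itertools import groupby

data = [1, 2, 3, None, 4, 5, 6, 7, None, 8, 9]


def is_none(x):
    return x is None


# Grouping by key change: a new group starts whenever the key's value changes,
# so the separators come back as groups of their own.
by_change = [list(group) for _, group in groupby(data, key=is_none)]
print(by_change)  # [[1, 2, 3], [None], [4, 5, 6, 7], [None], [8, 9]]

# Dropping the groups whose key is True recovers the result from the PR example.
split_like = [list(group) for key, group in groupby(data, key=is_none) if not key]
print(split_like)  # [[1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

One consequence of grouping by change: adjacent separators fall into a single group, so this approach never produces an empty list between them.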

Best,

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

EntilZha commented 1 year ago

Been busy, sorry for the late reply. I'm not aware of prior functions that do that, so I would lean towards using this via extend for now. If it's a pretty common use case, then I could reconsider.