markfairbanks / tidypolars

Tidy interface to polars
http://tidypolars.readthedocs.io
MIT License

Consider tidyverse piping interface #208

Open markfairbanks opened 2 years ago

markfairbanks commented 2 years ago

Proposed interface:

import polars as pl
from tidypolars import col, arrange, filter, mutate

df = pl.DataFrame(dict(x = range(3), y = range(3), z = range(3)))

(
    df >>
    arrange('x') >>
    filter(col('x') <= 2) >>
    mutate(double_x = col('x') * 2)
)

Positives:

Negatives:

alexkyllo commented 2 years ago

I like this and think there's good precedent for >> as an R-style pipe operator in Python, as both dplython and siuba use it.
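For anyone wondering how a `>>` pipe like this can work in Python at all, here is a minimal sketch. It relies on the reflected operator `__rrshift__`, which Python calls on the right-hand operand when the left-hand one does not handle `>>`. Everything here (`Pipeable`, `verb`, `keep_if`, the list-of-dicts "frame") is a hypothetical stand-in for illustration, not tidypolars, dplython, or siuba code.

```python
class Pipeable:
    """Holds a function plus its arguments until data arrives via `data >> Pipeable`."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Called for `data >> self` when `data` does not implement `>>` itself.
        return self.func(data, *self.args, **self.kwargs)


def verb(func):
    """Turn `func(data, ...)` into a factory that returns a Pipeable."""
    def make(*args, **kwargs):
        return Pipeable(func, *args, **kwargs)
    return make


@verb
def arrange(rows, key):
    return sorted(rows, key=lambda r: r[key])


@verb
def keep_if(rows, predicate):
    return [r for r in rows if predicate(r)]


@verb
def mutate(rows, **new_cols):
    # Each keyword maps a new column name to a function of the row.
    return [{**r, **{name: fn(r) for name, fn in new_cols.items()}} for r in rows]


rows = [{"x": 2}, {"x": 0}, {"x": 1}]

result = (
    rows
    >> arrange("x")
    >> keep_if(lambda r: r["x"] <= 1)
    >> mutate(double_x=lambda r: r["x"] * 2)
)
```

The important detail is that the data object itself needs no changes: the verbs carry all of the pipe logic, which is also why the approach depends on the data object never defining `>>` on its own.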

markfairbanks commented 1 year ago

Another option: https://github.com/pola-rs/polars/pull/5531

This would allow using a tp namespace on a polars data frame:

df = pl.DataFrame({"a": [1, 2], "b": [3, 4]})

(
    df
    .tp.mutate(double_a = col('a') * 2)
    .tp.filter(col('a') < 3)
    .tp.arrange('a', 'b')
)

alexander-beedie commented 1 year ago

I think a tidy namespace would be great here, now that you can register them - it's ideal for library authors as you get to extend the relevant classes without subclassing/mixins, or requiring changes in the core, and you are better protected from changes to the internals.
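To make the idea concrete, here is a rough sketch of what namespace registration buys a library author. It does not use polars: `Frame` and `register_namespace` are hypothetical stand-ins that only mimic the pattern (polars exposes the real thing as the `pl.api.register_dataframe_namespace` decorator), and the verbs take plain lambdas instead of `col` expressions.

```python
class Frame:
    """Hypothetical minimal frame: a dict mapping column name to a list of values."""

    def __init__(self, columns):
        self.columns = dict(columns)


def register_namespace(name):
    """Hypothetical decorator: expose `ns_cls` as the attribute `Frame.<name>`."""
    def decorator(ns_cls):
        # A property builds a fresh namespace object bound to the frame on each access.
        setattr(Frame, name, property(lambda frame: ns_cls(frame)))
        return ns_cls
    return decorator


@register_namespace("tp")
class TidyNamespace:
    """Tidy-style verbs that live under `frame.tp` rather than on Frame itself."""

    def __init__(self, frame):
        self._frame = frame

    def mutate(self, **new_cols):
        cols = dict(self._frame.columns)
        for col_name, fn in new_cols.items():
            cols[col_name] = fn(cols)  # fn receives the current columns
        return Frame(cols)

    def select(self, *names):
        return Frame({n: self._frame.columns[n] for n in names})


df = Frame({"a": [1, 2], "b": [3, 4]})

out = (
    df
    .tp.mutate(double_a=lambda c: [v * 2 for v in c["a"]])
    .tp.select("a", "double_a")
)
```

Note that `Frame` gains a single `tp` attribute and nothing else, so the extension cannot collide with methods or operators the core class adds later.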

Plus: proper autocomplete :)

The << and >> operators may get used for bitshift ops (their usual meaning), so I would consider them as being reserved for future polars/core use, rather than as a potential API extension point for external packages - otherwise you might get broken again :(
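A small sketch of the breakage being described, using hypothetical classes rather than polars: a pipe built on `__rrshift__` only fires while the left-hand object has no `>>` of its own. Once the frame class defines `__rshift__` as a bitshift, that method runs first and the pipe verb is never consulted.

```python
class Verb:
    """Stand-in for a pipeable verb such as arrange('x')."""

    def __rrshift__(self, left):
        return ("piped", left)


class PlainFrame:
    """Hypothetical frame that does not define `>>`."""


class ShiftFrame:
    """Hypothetical frame after the core library adds an element-wise bitshift."""

    def __init__(self, values):
        self.values = values

    def __rshift__(self, other):
        if not isinstance(other, int):
            raise TypeError("shift amount must be an int")
        return ShiftFrame([v >> other for v in self.values])


plain = PlainFrame()
piped = plain >> Verb()  # no __rshift__ on the left, so Verb.__rrshift__ runs

shifted = ShiftFrame([8, 16]) >> 2  # ordinary bitshift on each value

try:
    ShiftFrame([8, 16]) >> Verb()  # the frame's own __rshift__ wins and rejects the verb
    pipe_still_works = True
except TypeError:
    pipe_still_works = False
```

Python would only fall back to `Verb.__rrshift__` here if the frame's `__rshift__` returned `NotImplemented` for unknown operands, which an external package cannot rely on.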

FYI, docs/examples relating to namespace registration are live, though you'll need to wait for >= 0.14.29 to use: https://pola-rs.github.io/polars/py-polars/html/reference/api.html

markfairbanks commented 1 year ago

Thanks for the info @alexander-beedie 😄

> The << and >> operators may get used for bitshift ops (their usual meaning), so I would consider them as being reserved for future polars/core use, rather than as a potential API extension point for external packages - otherwise you might get broken again :(

This has been my worry and why I hadn't updated tidypolars with this yet. The new feature you implemented in https://github.com/pola-rs/polars/pull/5531 seems like a much better way to go.

alexander-beedie commented 1 year ago

> This has been my worry and why I hadn't updated tidypolars with this yet. The new feature you implemented in pola-rs/polars#5531 seems like a much better way to go.

And it's available now - new version released a few hours ago ;)