paddymul / buckaroo

Buckaroo - the data wrangling assistant for pandas. Quickly explore dataframes and run pandas commands via a GUI. Works inside the Jupyter notebook.
https://paddymul.github.io/buckaroo/
BSD 3-Clause "New" or "Revised" License

Syntactic sugar for polars styling #199

Open paddymul opened 11 months ago

paddymul commented 11 months ago

How would you categorize this request? You can select multiple if not sure.

Developer ergonomics (defaults, error messages)

Enhancement Description

I'm spitballing some ideas about annotating a dataframe or expression with metadata, and I'm looking for feedback.

I'm thinking about how to make configuring buckaroo easier (this applies to plotting libraries too). Is there a way to add metadata to expressions or dataframes?

I would like to be able to write code like this in a notebook:

df.select(pl.col("date").buckaroo.format("datetime", fstring="YY-MM-DD"),
          pl.col("open").buckaroo.highlight(color_map="red_green",
                                            expression=pl.col("open").diff()))

I want to use the buckaroo namespace to append metadata to the expression that doesn't affect polars at all, but Buckaroo can access it later.

In the above example, the date column would be formatted with a special format string, and the open column would be colored based on whether the open for that day was higher or lower than the previous day's.

The alternative would be typing things like:

BW = BuckarooWidget
BW(
    df.select(pl.all(),
              pl.col("open").diff().alias("open_diff")),
    column_extras={
        "date": {"format": {"type": "datetime", "fstring": "YY-MM-DD"}},
        "open": {"highlight": {"color_map": "red_green", "on_column": "open_diff"}},
        "open_diff": {"hidden": True},
    },
)

  1. Is this possible? I don't think so, but I haven't dived into extensions that hard.
  2. If Polars doesn't want to give an explicit metadata facility, could it at least keep the query/expression graph around in the resulting dataframe? That way I think I could pick the correct elements off the expression JSON.
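Independently of what Polars can offer, the nested `column_extras` dict above could be made less error-prone with a few plain-Python helpers. This is only a sketch: `fmt`, `highlight` and `build_column_extras` are hypothetical names, and the dict layout simply mirrors the example in this issue rather than a confirmed Buckaroo schema.

```python
from typing import Any, Dict, Iterable


def fmt(type_: str, fstring: str) -> Dict[str, Any]:
    # Build the "format" entry for one column.
    return {"format": {"type": type_, "fstring": fstring}}


def highlight(color_map: str, on_column: str) -> Dict[str, Any]:
    # Build the "highlight" entry, colored by the values of another column.
    return {"highlight": {"color_map": color_map, "on_column": on_column}}


def build_column_extras(
    hidden: Iterable[str] = (), **columns: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    # Copy each per-column spec, then mark helper columns as hidden.
    extras = {name: dict(spec) for name, spec in columns.items()}
    for name in hidden:
        extras.setdefault(name, {})["hidden"] = True
    return extras


extras = build_column_extras(
    date=fmt("datetime", "YY-MM-DD"),
    open=highlight("red_green", on_column="open_diff"),
    hidden=["open_diff"],
)
```

The result could then be passed as `column_extras=extras`. It still requires adding the `open_diff` column by hand, which is the part the expression-level syntax would remove.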

Pseudo Code Implementation

No response

Prior Art

N/A