markfairbanks / tidypolars

Tidy interface to polars
http://tidypolars.readthedocs.io
MIT License

Possible to remove col() from .mutate() etc.? #218

Status: Closed (PathosEthosLogos closed 1 year ago)

PathosEthosLogos commented 1 year ago

I know it's just three functions that require col(), but I think users would generally find it more streamlined for use without the existence of col(). Just wondering if it's possible...?

markfairbanks commented 1 year ago

Unfortunately it's not possible to get rid of col(). Building expressions using col() is core to how polars works.

It also has to do with how Python resolves variable names: a bare name like x is evaluated immediately in the scope where the code runs (typically the global scope), before the function you pass it to ever sees it. In R, a function can capture an expression unevaluated and then evaluate it in the context of the data frame, which is how dplyr and data.table work without using something like col().
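To make the evaluation-order point concrete, here is a rough pure-Python sketch. The `Expr` class and the `col()` and `mutate()` helpers below are hypothetical toy stand-ins written for this illustration; they are not how polars or tidypolars are actually implemented. The sketch only shows why a bare name fails while a `col()`-style object can be passed around.

```python
# Toy illustration only: Expr, col, and mutate are hypothetical stand-ins,
# not the polars or tidypolars implementation.

class Expr:
    """A deferred computation: stores a function of a row and runs it later."""

    def __init__(self, fn):
        self.fn = fn

    def evaluate(self, row):
        return self.fn(row)

    def __mul__(self, other):
        # Build a new deferred expression instead of multiplying right now.
        return Expr(lambda row: self.fn(row) * other)


def col(name):
    """Refer to a column by name without looking it up yet."""
    return Expr(lambda row: row[name])


def mutate(rows, **new_columns):
    """Add columns to a list of dict rows by evaluating deferred expressions."""
    return [
        {**row, **{name: expr.evaluate(row) for name, expr in new_columns.items()}}
        for row in rows
    ]


rows = [{"x": 0}, {"x": 1}, {"x": 2}]

# A bare name is looked up in the calling scope while the arguments are
# being built, before mutate() is ever called, so Python raises NameError.
try:
    mutate(rows, double_x=x * 2)
    bare_name_error = None
except NameError as err:
    bare_name_error = err

# col("x") * 2 is just an object; the column lookup happens inside mutate().
result = mutate(rows, double_x=col("x") * 2)
```

The key design point is that `col("x") * 2` returns an object describing the work rather than doing it, so the function receiving it decides when and against which data to run it.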

One cool advantage of this for polars/tidypolars though is you can build expressions outside of .mutate()/.summarize()/etc. and then evaluate them within the function you need.

import tidypolars as tp
from tidypolars import col

df = tp.Tibble(x = range(3), y = range(3))

expr = col('x') * 2

df.mutate(double_x = expr)
┌─────┬─────┬──────────┐
│ x   ┆ y   ┆ double_x │
│ --- ┆ --- ┆ ---      │
│ i64 ┆ i64 ┆ i64      │
╞═════╪═════╪══════════╡
│ 0   ┆ 0   ┆ 0        │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 1   ┆ 1   ┆ 2        │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ 2   ┆ 4        │
└─────┴─────┴──────────┘

If you have any questions let me know 😄