Closed FBruzzesi closed 6 days ago
I am closing this issue as completed for now, although these expressions won't be available in the group_by
context. I think for now it would be a bit too hard to support them there, although it would definitely be nice to have for the over
use case.
Even for pandas, although DataFrameGroupBy has cumsum and other cumulative operations, its behaviour seems a bit unexpected, as the group keys are not maintained in the output. Example from the docs:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
>>> df = pd.DataFrame(data, columns=["a", "b", "c"],
... index=["fox", "gorilla", "lion"])
>>> df
         a  b  c
fox      1  8  2
gorilla  1  2  5
lion     2  6  9
>>> df.groupby("a").cumsum()
          b  c
fox       8  2
gorilla  10  7
lion      6  9
As you can see, the output has no column "a", not even in the index.
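For illustration only, here is a minimal sketch of one way to keep the group key next to the cumulative result in pandas, by joining the key column back on the shared index. The variable names cumulative and with_keys are made up for this example; this is not necessarily how Narwhals would handle it.

```python
import pandas as pd

data = [[1, 8, 2], [1, 2, 5], [2, 6, 9]]
df = pd.DataFrame(data, columns=["a", "b", "c"], index=["fox", "gorilla", "lion"])

# groupby(...).cumsum() returns only the non-key columns,
# aligned on the original index (the key "a" is dropped).
cumulative = df.groupby("a").cumsum()

# Join the key column back on the shared index so each row keeps its group label.
with_keys = df[["a"]].join(cumulative)
print(with_keys)
```

Since the result of cumsum is aligned with the original index, the join is row-for-row and does not reorder anything.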
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
Cumulative features, together with forward fill and some other checks/hacks, would most likely be enough to enable the equivalent of pandas expanding operations. This is a requirement to complete https://github.com/plotly/plotly.py/issues/4834.
Please describe the purpose of the new feature or describe the problem to solve.
List of cumulative expressions supported by polars:
With these, we would enable the following additional univariate expanding operations: mean, var, std, skew, kurt.
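As a rough sketch of why cumulative sums and counts are enough for some of these, the snippet below derives the expanding mean and sample variance (ddof=1) from running totals, in plain Python. The function name expanding_mean_var is hypothetical, nulls are assumed absent, and the variance formula is the naive one, which can lose precision on large or nearly constant inputs.

```python
from itertools import accumulate
import math


def expanding_mean_var(values):
    """Expanding mean and sample variance built only from cumulative sums."""
    counts = range(1, len(values) + 1)              # running count (no nulls assumed)
    sums = list(accumulate(values))                 # running sum of x
    sums_sq = list(accumulate(v * v for v in values))  # running sum of x**2

    means = [s / n for s, n in zip(sums, counts)]
    # var = (sum(x^2) - sum(x)^2 / n) / (n - 1); undefined for a single observation
    variances = [
        (sq - s * s / n) / (n - 1) if n > 1 else math.nan
        for sq, s, n in zip(sums_sq, sums, counts)
    ]
    return means, variances


means, variances = expanding_mean_var([1.0, 2.0, 3.0, 4.0])
print(means)
print(variances)
```

The expanding std would follow as the square root of the variance; skew and kurt would need running sums of higher powers in the same way.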
What's left out is: median, quantile and rank - I don't think we would be able to implement those 🥲 (entire pandas expanding window function list).
Group by context
Edit: Additionally, we should support these expressions in the group by context. This is partially possible:
reverse=False (default argument), for cum_<min|max|sum|prod> (need to check how DataFrameGroupBy.cumcount behaves with nulls).
__iter__, which I cannot say how much slower it is than native methods.
For the moment I would keep these out of the PRs introducing the methods in the first place. Thanks @AlessandroMiola for pointing that out in #1384.