Open hsorsky opened 4 years ago
@hsorsky thanks for the report.
This looks similar to #32240
AFAICT, the returned index is probably correct, but the original groupby_col
should not be dropped from the results with as_index=False
@simonjayhawkins This actually has a different origin. The rolling function transforms the DataFrameGroupBy into a RollingGroupBy object, which has a completely different aggregate function. The RollingGroupBy object stores the as_index flag under obj._groupby.as_index, but the flag is never accessed, so the code execution is independent of this flag when using rolling with aggregate.
Edit: That was a bit misleading. The actual aggregation is done the same way, but the processing steps after the aggregation are completely different. The rolling part runs into pandas/core/window/rolling.py, line 603.
I'm not sure what the right output here is. On the one hand, doing .groupby(...).rolling(...).agg(...) is a transformation (it always has exactly one row per input row). Should we adhere to the semantics of a transformation? That would mean this operation should ignore as_index, and the current index is incorrect (it should have the same index as the input DataFrame).
On the other hand, users often seem confused that as_index is ignored for transforms. I personally think we should expand the options for whether the groups are in the result (or not). This is #49543.
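To make the "one row per input row" point concrete, here is a minimal sketch. The frame and the column names g and v are made up for illustration and are not from the original report. It compares the shape and index of a grouped rolling result with those of a plain groupby transform, using the default as_index.

```python
import pandas as pd

# Hypothetical toy data; "g" and "v" are illustrative names only.
df = pd.DataFrame({"g": ["a", "a", "b", "b"], "v": [1.0, 2.0, 3.0, 4.0]})

# Grouped rolling sum: one output row per input row, like a transformation.
rolled = df.groupby("g")["v"].rolling(2).sum()

# A regular groupby transform keeps the index of the input frame.
transformed = df.groupby("g")["v"].transform("sum")

print(len(rolled) == len(df))              # same number of rows as the input
print(rolled.index.nlevels)                # group key plus original index
print(transformed.index.equals(df.index))  # transform preserves the input index
```

The rolling result has the row count of a transformation but carries the group key as an extra index level, which is the mismatch discussed above.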
I agree. It is more sensible to ignore as_index and return the index as is for rolling, ewm and expanding transformations, at least until a decision is made for #49543, I guess? But right now only agg(dict_like) and agg(list_like) behave that way. .agg(string), .agg(callable) and functions such as .mean and .sum alter the index, even though all of them are doing a transformation. #54973 is just fixing this inconsistency.
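A rough way to check this on a given pandas version is to run the same rolling mean through each call style and print the resulting index. This is only a sketch with made-up data (the names g and v are assumptions); which index each call returns depends on the installed version, so no expected output is shown.

```python
import pandas as pd

# Hypothetical toy data; "g" and "v" are illustrative names only.
df = pd.DataFrame({"g": ["a", "a", "b", "b"], "v": [1.0, 2.0, 3.0, 4.0]})

roller = df.groupby("g", as_index=False)[["v"]].rolling(2)

# The same computation expressed through the different entry points.
results = {
    "mean()": roller.mean(),
    "agg('mean')": roller.agg("mean"),
    "agg(['mean'])": roller.agg(["mean"]),
    "agg({'v': 'mean'})": roller.agg({"v": "mean"}),
}

for label, res in results.items():
    # Print the index type and names so the call styles can be compared.
    print(label, type(res.index).__name__, list(res.index.names))
```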
I agree. It is more sensible to ignore as_index and return index as is for rolling, ewm and expanding transformations.
https://github.com/pandas-dev/pandas/pull/54973 is just fixing this inconsistency
Agreed that fixing the inconsistency is positive, but I think we want to avoid changing behavior on users as a bugfix and then changing it again as a bugfix if we can just do it all at once.
Fair enough. I can make changes so that all rolling, ewm and expanding operations behave like other transformations do, or shall I leave this to be implemented along with the changes proposed in #49543?
Problem description
When using a rolling agg function on a groupby object, we cannot omit the groupby columns from the resulting DataFrame's index using as_index=False, like we can when applying a non-rolling agg function.
Code Sample
Expected Output
Actual Output
Output of pd.show_versions()
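For readers without the original code sample, the following sketch (not the reporter's code; the frame and the names g and v are hypothetical) shows the non-rolling behaviour that the description refers to, plus one possible way to remove the group key from a grouped rolling result by dropping the index level explicitly.

```python
import pandas as pd

# Hypothetical toy data; "g" and "v" are illustrative names only.
df = pd.DataFrame({"g": ["a", "a", "b", "b"], "v": [1.0, 2.0, 3.0, 4.0]})

# Non-rolling aggregation: as_index=False keeps "g" as a regular column.
plain = df.groupby("g", as_index=False)["v"].sum()
print(list(plain.columns))

# Grouped rolling with the default as_index: the group key is an index level.
rolled = df.groupby("g")["v"].rolling(2).sum()

# Possible workaround: drop the group level to get back the original row labels.
flat = rolled.droplevel("g")
print(flat.index.tolist())
```

Dropping the level explicitly does not rely on as_index at all, so it sidesteps the question of how the flag should be handled for rolling operations.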