Open Julian-J-S opened 1 year ago
Yes, I think this should definitely be possible.
@ritchie46 Why does `.rolling()` not use the same argument structure as the other `rolling_` functions? They have the window size / period as the first argument, with `by` as an optional keyword argument.
In fact the documentation for `.rolling()` still references the (non-existing) `by` argument.
I also agree that the `rolling` method should support specifying the window size directly. That would make it more versatile; for example, it could then be used in an `eval` context:
`pl.col.xxx.list.eval(pl.element().mean().rolling(10))`
Because the `rolling_` functions are specialized implementations that support this.
The `rolling` postfix can run all expressions that come before it, and thus we cannot guarantee that we can run the specialized behavior. The `rolling` postfix is restricted to what we can support at the moment.
And @orlp I think we are now conflating the expr `rolling` and the LazyFrame `rolling`.
In any case, we should accept an expression as `index_column`. Then you can pass `pl.arange(0, pl.count())`.
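To illustrate why a generated row-number index would be enough, here is a minimal plain-Python sketch. It does not use the polars API: `rolling_by_index` is a hypothetical helper, and the half-open window `(index - period, index]` is an assumption made for the illustration. With a 0..n-1 index, an index-based window of period 3 becomes a window over the last 3 rows.

```python
def rolling_by_index(index, values, period):
    """For each row, collect the values whose index lies in (index[i] - period, index[i]].

    Assumes `index` is sorted in ascending order. Hypothetical helper for illustration.
    """
    windows = []
    for i, right in enumerate(index):
        left = right - period
        # Only rows up to and including the current one can fall inside the window.
        windows.append(
            [v for k, v in zip(index[: i + 1], values[: i + 1]) if left < k <= right]
        )
    return windows


values = [10, 20, 30, 40, 50]
row_number = list(range(len(values)))  # stand-in for a generated 0..n-1 index
windows = rolling_by_index(row_number, values, period=3)
sums = [sum(w) for w in windows]
```

The first rows get shorter windows here because nothing precedes them; how such partial windows should be handled is a separate design choice.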
I have been thinking about this a lot, and I think this functionality does NOT fit in the current `rolling` concept and is closer to the `group_by_dynamic` function. (Maybe we should rethink the naming here.)
Let me try to give an overview and comparison of `rolling` (polars), `group_by_dynamic` (polars) and `rolling` (pandas).
`rolling` (polars)
- [...] `index_column`
- [...] going back `period` of time
- [...] `index_column` of the current row

`group_by_dynamic` (polars)
- [...] `index_column`
- creates fixed size windows (length `period`) with fixed time steps (length `every`)

`rolling` (pandas)
- [...] `index_column`
- [...] (length `window`) with fixed steps (length `step`) without any index (only the order of the rows)
- [...] `window`)
- [...] `step` is 1

I think the integer-window functionality of pandas is very useful and should be implemented in polars.
It could get its own function or be added to `group_by_dynamic`.
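The integer-window behavior described above (fixed-size windows over row order, advanced by a fixed step, with no index) can be sketched in plain Python. This is only an illustration, not the pandas or polars implementation: `rolling_rows` and `rolling_row_sum` are hypothetical names, and returning `None` for rows that do not yet have a full window behind them is an assumption.

```python
from typing import Iterator, List, Optional, Sequence


def rolling_rows(
    values: Sequence[float], window: int, step: int = 1
) -> Iterator[Optional[List[float]]]:
    """Yield the trailing `window` rows ending at every `step`-th row.

    Rows without a full window behind them yield None (an assumption
    made for this sketch).
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive integers")
    for end in range(0, len(values), step):
        start = end - window + 1
        yield list(values[start : end + 1]) if start >= 0 else None


def rolling_row_sum(values, window, step=1):
    """Sum each window produced by rolling_rows, keeping None for partial windows."""
    return [None if w is None else sum(w) for w in rolling_rows(values, window, step)]


data = [1, 2, 3, 4, 5]
every_row = rolling_row_sum(data, window=3)
every_second_row = rolling_row_sum(data, window=3, step=2)
```

Only the order of the rows matters here; no index column is consulted at any point.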
Sidenote:
IMO it is currently a bit confusing that `rolling` and `group_by_dynamic` are related but have very different names, and that pandas `rolling` might be closer to `group_by_dynamic` than to `rolling` in polars. Also, I don't understand why `group_by_dynamic` is called dynamic, because it is not dynamic at all. It is just a fixed-size window with equidistant time steps.
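A small plain-Python sketch of that "fixed size window with equidistant steps" idea, using the `every` and `period` names from the discussion. This is a simplified illustration and not how polars computes window boundaries: `dynamic_windows` is a hypothetical helper, windows are assumed to be `[start, start + period)`, and the first window is assumed to start at the first index value.

```python
def dynamic_windows(index, values, every, period):
    """Group values into windows [start, start + period), with starts `every` apart.

    Assumes a non-empty `index` sorted in ascending order. Hypothetical helper.
    """
    if every < 1 or period < 1:
        raise ValueError("every and period must be positive")
    groups = {}
    start = index[0]
    while start <= index[-1]:
        groups[start] = [v for k, v in zip(index, values) if start <= k < start + period]
        start += every
    return groups


index = [0, 1, 2, 3, 4, 5]
values = [1, 2, 3, 4, 5, 6]
groups = dynamic_windows(index, values, every=2, period=3)
```

The window starts depend only on `every`, not on the data, which is the sense in which the windows are fixed rather than dynamic.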
Description
The current `rolling` function allows for lots of cool analysis, but as far as I can tell, the basic use case of rolling over rows without having any index is not supported?
Example:
Pandas
Polars
As you can see, it is very verbose to achieve this basic functionality. I also saw that there are functions (like `rolling_sum` / `rolling_<...>`) that support windows of a fixed number of rows as well as temporal windows. It would be great to bring the same functionality to the more generic `rolling` function.
Desired solution