pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Allow adjust=False when times is provided #59142

Open tserrao opened 2 days ago

tserrao commented 2 days ago

This change allows the EWMA to be computed recursively ("infinite history", i.e. adjust=False) for irregularly spaced time series. Previously, new_wt was held constant throughout the ewm loop; now new_wt is recomputed at every step through the series, but only under the narrow condition that adjust=False and com=1.
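As a rough illustration of the recursion described above, here is a minimal pure-Python sketch. It is not the pandas implementation: the function name ewm_mean_no_adjust, the use of plain numeric timestamps, and the lack of NaN handling are all assumptions made for brevity. The decay weight 0.5 ** (dt / halflife) corresponds to the com=1 case.

```python
def ewm_mean_no_adjust(values, times, halflife):
    """Recursive (adjust=False) EWMA over irregularly spaced observations.

    Hypothetical helper for illustration only.
    values   : sequence of floats (no NaN handling)
    times    : non-decreasing numeric timestamps, same length as values
    halflife : positive number, in the same unit as times
    """
    if len(values) != len(times):
        raise ValueError("values and times must have the same length")
    if halflife <= 0:
        raise ValueError("halflife must be positive")

    out = []
    avg = None
    for i, x in enumerate(values):
        if avg is None:
            # The first observation seeds the average.
            avg = float(x)
        else:
            dt = times[i] - times[i - 1]
            # Weight kept by the previous average: halves every `halflife`.
            old_wt = 0.5 ** (dt / halflife)
            # Weight of the new observation is recomputed each step so that
            # the two weights sum to one.
            new_wt = 1.0 - old_wt
            avg = old_wt * avg + new_wt * x
        out.append(avg)
    return out
```

With evenly spaced times exactly one halflife apart this reduces to the familiar y = 0.5 * y_prev + 0.5 * x recursion: ewm_mean_no_adjust([1.0, 2.0, 4.0], [0, 1, 2], 1) returns [1.0, 1.5, 2.75].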

There may be a more elegant expression that generalizes to com values other than one, but using 1 - old_wt works nicely when halflife is the only decay parameter provided, since com=1 is then guaranteed. The additional parameter restrictions needed to force this condition seem acceptable, since the decay parameters are already restricted.
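For reference, here is a short sketch of why 1 - old_wt gives a properly normalized update. It assumes the loop computes a weighted average of the form (old_wt * previous + new_wt * current) / (old_wt + new_wt) and that old_wt is reset to 1 after each step when adjust=False; the symbols y_i, x_i, w_i, n_i, t_i and h (the halflife) are introduced here for illustration only.

```latex
% com = 1 gives alpha = 1/(1 + com) = 1/2, so the per-halflife decay factor is 1 - alpha = 1/2.
\begin{align*}
w_i &= \left(\tfrac{1}{2}\right)^{(t_i - t_{i-1})/h}
      && \text{(old\_wt after decaying over the elapsed time)} \\
y_i &= \frac{w_i\, y_{i-1} + n_i\, x_i}{w_i + n_i}
      && \text{(general update, } n_i = \text{new\_wt)} \\
n_i &= 1 - w_i \;\Longrightarrow\; y_i = w_i\, y_{i-1} + (1 - w_i)\, x_i
      && \text{(weights sum to one)}
\end{align*}
```

For a regular series with t_i - t_{i-1} = h, this gives w_i = 1/2 and recovers the standard adjust=False recursion with alpha = 1/2.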

Exactly one of com, span, halflife, or alpha must be provided if times is not provided. If times is provided, halflife and one of com, span or alpha may be provided.
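The parameter rules quoted above, plus the extra restriction this change relies on, could be checked along the following lines. This is a hypothetical sketch rather than the validation code in pandas: the function name check_decay_args is invented, treating halflife as required whenever times is given is an assumption, and the adjust=False rule is one reading of the "additional parameter restrictions" mentioned above.

```python
def check_decay_args(com=None, span=None, halflife=None, alpha=None,
                     times=None, adjust=True):
    """Hypothetical validator for the decay-parameter rules (illustration only)."""
    # Number of non-halflife decay parameters supplied.
    n_other = sum(v is not None for v in (com, span, alpha))

    if times is None:
        # Without times: exactly one of com, span, halflife, alpha.
        if n_other + (halflife is not None) != 1:
            raise ValueError(
                "exactly one of com, span, halflife, or alpha must be provided"
            )
        return

    # With times: halflife, optionally combined with one of com, span, alpha.
    # Assumption: halflife is treated as required here.
    if halflife is None:
        raise ValueError("halflife must be provided when times is provided")
    if n_other > 1:
        raise ValueError("at most one of com, span, or alpha may be provided")

    # Assumed extra restriction for this change: with adjust=False, halflife
    # must be the only decay parameter, so that com=1 is guaranteed.
    if not adjust and n_other > 0:
        raise ValueError(
            "with times and adjust=False, halflife must be the only decay parameter"
        )
```

For example, check_decay_args(halflife=3600, times=[0, 1], adjust=False) passes, while adding com=2 to the same call raises ValueError.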

FWIW I confirmed that results match polars and @azmyrajab's nice test function:

ema_no_adjust = ema_test(vals, dt_seconds, half_life=half_life_seconds)[-1]
ema_no_adjust_pl = ema_polars(df["val"].rename_axis(index="ts").rename("val"), half_life,
                              by="ts").select("val")[-1].item()

print(ema_no_adjust)
print(ema_no_adjust_pl)
print(df.ewm(halflife=half_life, times=id, adjust=False).mean().iloc[-1]['val'])

0.2062994740159002
0.2062994740159002
0.2062994740159002