rtosholdings / riptable

64bit multithreaded python data analytics tools for numpy arrays and datasets
https://riptable.readthedocs.io/en/stable/
Other
362 stars 21 forks source link

ema_decay does not handle `filter` argument correctly #173

Open ryan-gunderson opened 3 years ago

ryan-gunderson commented 3 years ago

Using version 1.5.2:

It appears that when supplying a a filter to ema_decay instead of using zero in the filtered positions it uses the last unfiltered value. While this might make sense for ema_normal it is incorrect for ema_decay.

ds = rt.Dataset({
    'time': rt.arange(10),
    'y': np.arange(10) % 2,
})
ds.group='A'

c = ds.cat('group')
ds.EDS_Full=c.ema_decay(ds.y, decay_rate=np.log(2), time = ds.time)
ds.EDS_Evens=c.ema_decay(ds.y, time=ds.time, decay_rate=np.log(2), filter=(ds.time%2==0))
ds.EDS_Odds=c.ema_decay(ds.y, time=ds.time, decay_rate=np.log(2), filter=(ds.time%2==1))

print(str(ds))

gives

#   time   y   group   EDS_Full   EDS_Evens   EDS_Odds
-   ----   -   -----   --------   ---------   --------
0      0   0       A       0.00        0.00       0.00
1      1   1       A       1.00        0.00       1.00
2      2   0       A       0.50        0.00       1.50
3      3   1       A       1.25        0.00       1.75
4      4   0       A       0.62        0.00       1.88
5      5   1       A       1.31        0.00       1.94
6      6   0       A       0.66        0.00       1.97
7      7   1       A       1.33        0.00       1.98
8      8   0       A       0.66        0.00       1.99
9      9   1       A       1.33        0.00       2.00

The EDS_Evens column is consistent with it only ever seeing zeros and the EDS_Odds column is consistent with it only ever seeing ones.

RCMcNamara commented 2 years ago

I think lines 1140 to 1145 in Ema.cpp (https://github.com/rtosholdings/riptide_cpp/blob/ccf5c3c8901378d6ff08272863aa430f26965016/src/Ema.cpp) are at least partly to blame. Deleting these lines makes it similar to the implementation in rt_fastarraynumba which works correctly. Of course, these only get called when pIncludeMask != Null. I am not sure if it does anything with the filter if that is not the case, but maybe there is a reason why having a filter always implies pIncludeMask is not Null. Regardless, I would try deleting those lines and testing it out afterward. I think that might fix the problem.