Why
For stream use cases, we need to predict a data point that the stream has not yet published.
How
As detailed in the related issue M#1171, specifying `make_predictions==False` for the predict DataFrame will trigger the inferring policy. At the moment, it derives the temporal data for the `order_by` column by comparing the last two entries in the stream cache. This could be changed to consider the complete cache, or even previous values. The values for the rest of the columns, on the other hand, are carried over from the last row (here we could explore imputation techniques, basically treating the row as missing data and generating values conditioned on the cache).
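
The policy described above can be sketched roughly as follows. This is a minimal illustration using pandas, not the code in this PR: the function name `infer_next_row`, the column names, and the assumption that the cache is a DataFrame with a numeric or datetime `order_by` column are all hypothetical.

```python
import pandas as pd


def infer_next_row(cache: pd.DataFrame, order_by: str) -> pd.DataFrame:
    """Build a one-row DataFrame for the data point the stream has not published yet.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if len(cache) < 2:
        raise ValueError("need at least two cached rows to infer the next step")

    ordered = cache.sort_values(order_by)

    # Step size taken from the last two cached entries only.
    latest = ordered[order_by].iloc[-1]
    step = latest - ordered[order_by].iloc[-2]

    # Carry every other column over from the last row, unchanged.
    next_row = ordered.tail(1).copy().reset_index(drop=True)
    next_row.loc[0, order_by] = latest + step
    return next_row


# Usage with a toy cache (column names are made up for the example).
cache = pd.DataFrame({"ts": [10, 20, 30], "price": [1.0, 1.5, 2.0]})
print(infer_next_row(cache, "ts"))
```

Using only the last two entries keeps the step estimate simple but makes it sensitive to an irregular final interval; a variant that considers the complete cache could, for instance, take the median of all consecutive differences instead.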