API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, sum, count, etc.), AS OF joins, downsampling, and interpolation.
We need some helper functions that can easily group timeseries and roll them into a tensor/vector format suitable as a feature vector for model training & scoring.
The functions should allow for collection of values / aggregation across two dimensions:
- group & aggregate across multiple timeseries (multivariate inputs)
- build up tensor / vector representations across a window over timeseries
  e.g. a 60-day lookback feature tensor across 5 related timeseries (producing a 60x5 tensor for each training sample)
This function may be a special case of a more generic rolling-window type aggregation.
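
To make the intended output shape concrete, here is a minimal plain-Python sketch of the semantics only. It is not a Spark implementation and not an existing API of this library: the name lookback_tensors, its parameters, and the choice that each window ends at and includes the current time step are all assumptions made for illustration. It also assumes the series are already aligned to the same timestamps.

```python
from typing import Dict, List, Sequence


def lookback_tensors(
    series: Dict[str, Sequence[float]], window: int
) -> List[List[List[float]]]:
    """Hypothetical helper: build one (window x n_series) matrix per time
    step that has a full lookback available.

    Assumes every series is aligned to the same timestamps, so index t
    refers to the same point in time in all of them.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    names = sorted(series)  # fix a deterministic column order
    lengths = {len(series[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all series must be non-empty in number and equally long")
    n_steps = lengths.pop()

    samples = []
    # 'end' is exclusive, so each window covers steps [end - window, end).
    for end in range(window, n_steps + 1):
        rows = [
            [series[name][t] for name in names]  # one row per time step
            for t in range(end - window, end)
        ]
        samples.append(rows)
    return samples


# Two aligned toy series with a 3-step lookback.
toy = {"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]}
tensors = lookback_tensors(toy, window=3)
print(len(tensors), len(tensors[0]), len(tensors[0][0]))
```

With 5 series and window=60, each sample would be a 60x5 nested list, matching the example above. In Spark, the same idea would presumably be expressed as a window aggregation partitioned by series group and ordered by timestamp, which is why it can be seen as a special case of a generic rolling-window aggregation.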