uncharted-distil / distil-auto-ml

Distil Automated Machine Learning Server
Apache License 2.0

Add pipeline to use NBEATS timeseries forecasting #209

Closed by cdbethune 3 years ago

cdbethune commented 4 years ago

There's a new timeseries forecasting primitive available: https://github.com/kungfuai/d3m-primitives/tree/master/kf_d3m_primitives/ts_forecasting/nbeats. A pipeline should be added to run it, and we should check with @jlgleason on whether and how it should support hyperparameter tuning.
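For reference, here's a minimal sketch of what such a pipeline could look like, built with the d3m pipeline API. The NBEATS python path is an assumption (placeholder), preprocessing is reduced to a single dataset_to_dataframe step, and the prediction_length value is a placeholder; a real pipeline should mirror the existing DeepAR pipeline and the primitive's actual entry point.

```python
# Rough sketch only: the NBEATS python path below is assumed -- check the
# primitive's entry point in kf_d3m_primitives. Preprocessing is reduced to a
# single dataset_to_dataframe step; a real pipeline should mirror the DeepAR one.
from d3m import index
from d3m.metadata.base import ArgumentType
from d3m.metadata.pipeline import Pipeline, PrimitiveStep

NBEATS_PATH = "d3m.primitives.time_series_forecasting.nbeats.NBEATS"  # assumed path

pipeline = Pipeline()
pipeline.add_input(name="inputs")

# Step 0: Dataset -> DataFrame
step_0 = PrimitiveStep(primitive=index.get_primitive(
    "d3m.primitives.data_transformation.dataset_to_dataframe.Common"))
step_0.add_argument(name="inputs", argument_type=ArgumentType.CONTAINER, data_reference="inputs.0")
step_0.add_output("produce")
pipeline.add_step(step_0)

# Step 1: NBEATS forecaster
step_1 = PrimitiveStep(primitive=index.get_primitive(NBEATS_PATH))
step_1.add_argument(name="inputs", argument_type=ArgumentType.CONTAINER, data_reference="steps.0.produce")
step_1.add_argument(name="outputs", argument_type=ArgumentType.CONTAINER, data_reference="steps.0.produce")
step_1.add_hyperparameter(name="prediction_length", argument_type=ArgumentType.VALUE, data=21)  # placeholder
step_1.add_output("produce")
pipeline.add_step(step_1)

pipeline.add_output(name="output", data_reference="steps.1.produce")
print(pipeline.to_json())
```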

jlgleason commented 4 years ago

The NBEATSPrimitive has a very similar set of hyperparameters to DeepAR.

TuningParameters: epochs, steps_per_epoch, learning_rate, training_batch_size, and num_estimators.

ControlParameters: prediction_length, interpretable, num_context_lengths, inference_batch_size, output_mean. (I've also updated prediction_length and inference_batch_size to be ControlParameters in DeepArPrimitive, since that categorization is more appropriate.)
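To make the tuning/control split concrete, here's a hedged sketch of instantiating the primitive directly with the ControlParameters pinned and the TuningParameters left at their defaults (so a tuner would only vary the latter). The primitive path and all values are placeholders, not recommended defaults.

```python
# Sketch, not a recommendation: pin ControlParameters, leave TuningParameters
# (epochs, steps_per_epoch, learning_rate, training_batch_size, num_estimators)
# at their defaults for the tuner to explore. Path and values are placeholders.
from d3m import index

nbeats_cls = index.get_primitive("d3m.primitives.time_series_forecasting.nbeats.NBEATS")  # assumed path
hyperparams_cls = nbeats_cls.metadata.get_hyperparams()

hp = hyperparams_cls.defaults().replace({
    "prediction_length": 21,  # forecast horizon (dataset dependent)
    "interpretable": False,
    "num_context_lengths": 10,
    "inference_batch_size": 1024,
    "output_mean": True,
})
nbeats = nbeats_cls(hyperparams=hp)
```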

As with DeepArPrimitive, prediction_length is the most important hyperparameter. Any prediction requested in produce() that is more than prediction_length steps in the future will be returned as np.nan. Additionally, if prediction_length is greater than the longest time series in the training set, the primitive exits gracefully before training with an error message. Occasionally I've also seen training errors when prediction_length is close to the length of the longest training series (for both DeepArPrimitive and NBEATSPrimitive), but I haven't found a better threshold than prediction_length for deciding when to exit gracefully before training (e.g. prediction_length + context_length seems too conservative, though perhaps being more conservative/robust is preferable?). Please let me know if you see errors like this.
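For the pipeline/TA2 side, a small sketch of the guard described above; the long-format layout and the column names ("series_id", "value") are hypothetical and would need to match whatever the dataset actually uses.

```python
# Illustrative guard, assuming a long-format training frame with hypothetical
# "series_id" and "value" columns: refuse to train if prediction_length exceeds
# the longest series, mirroring the primitive's graceful exit.
import pandas as pd

def check_prediction_length(train_df: pd.DataFrame, prediction_length: int) -> None:
    longest = int(train_df.groupby("series_id")["value"].size().max())
    if prediction_length > longest:
        raise ValueError(
            f"prediction_length ({prediction_length}) exceeds the longest "
            f"training series ({longest}); the primitive will exit before training."
        )

# Downstream, any prediction requested more than prediction_length steps past
# the end of the training data should be expected to come back as np.nan.
```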