Try setting lags_seq = [1] and test again?
By default it uses a series of lagged target values, which is unnecessary for your dataset.
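A minimal sketch of that change (freq and prediction_length here are placeholders, not values from this issue):

from gluonts.model.deepar import DeepAREstimator

# use only lag 1 instead of the default set of seasonal target lags
estimator = DeepAREstimator(freq='B', prediction_length=1, lags_seq=[1])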
Hi @debackerl ,
thank you for the interesting experiment. @davidlkl is correct: the lagged target values will probably hurt in this case.
In fact, I think there are a couple of model choices in DeepAR that do not go well with this kind of problem.
I did the following:
I removed the default time features in the DeepAR estimator (the AddTimeFeatures, AddAgeFeature, and VstackFeatures steps built in DeepAREstimator.create_transformation):
#AddTimeFeatures(
#    start_field=FieldName.START,
#    target_field=FieldName.TARGET,
#    output_field=FieldName.FEAT_TIME,
#    time_features=self.time_features,
#    pred_length=self.prediction_length,
#),
#AddAgeFeature(
#    target_field=FieldName.TARGET,
#    output_field=FieldName.FEAT_AGE,
#    pred_length=self.prediction_length,
#    log_scale=True,
#    dtype=self.dtype,
#),
#VstackFeatures(
#    output_field=FieldName.FEAT_TIME,
#    input_fields=[FieldName.FEAT_TIME, FieldName.FEAT_AGE]
#    + (
#        [FieldName.FEAT_DYNAMIC_REAL]
#        if self.use_feat_dynamic_real
#        else []
#    ),
#),
VstackFeatures(
    output_field=FieldName.FEAT_TIME,
    input_fields=[FieldName.FEAT_DYNAMIC_REAL],
),
And I used this snippet:
import pandas as pd
import numpy as np
import mxnet as mx
from gluonts.dataset.common import ListDataset
from gluonts.dataset.field_names import FieldName
from gluonts.evaluation import Evaluator
from gluonts.evaluation.backtest import make_evaluation_predictions
from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer
from gluonts.distribution.laplace import LaplaceOutput  # imported but not actually used below
t0 = pd.Timestamp(year=2000, month=1, day=1, freq='B')
terms = np.random.rand(10000) * 2.0 - 1.0
walk = np.cumsum(terms)
ctx = mx.cpu()
#terms = np.roll(terms, 1)
context_length, prediction_length = 1, 1
# At time t, model knows previous value at t-1, and new term/increment at time t, giving full information
train_ds = ListDataset([{FieldName.START: t0, FieldName.TARGET: walk, FieldName.FEAT_DYNAMIC_REAL: [terms]}], freq=t0.freq)
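# Sanity check (added for illustration): walk[t] = walk[t-1] + terms[t] by construction,
# so the dynamic feature fully determines the next target value.
assert np.allclose(walk[1:], walk[:-1] + terms[1:])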
trainer = Trainer(ctx=ctx, epochs=200, batch_size=16, num_batches_per_epoch=50)
estimator = DeepAREstimator(freq='B', num_layers=1, num_cells=5, trainer=trainer, context_length=context_length, prediction_length=prediction_length, use_feat_dynamic_real=True, lags_seq=[1])
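# Note (added): estimator.history_length = context_length + max(lags_seq), i.e. 2 here;
# the test windows below are sliced with it so each entry carries enough past values.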
predictor = estimator.train(training_data=train_ds)
# test is a subset of the train set, so I don't even test generalization; I simply test learning
test_ds = ListDataset([{FieldName.START: t0 + t*t0.freq, FieldName.TARGET: walk[t-estimator.history_length:t+prediction_length], FieldName.FEAT_DYNAMIC_REAL: [terms[t-estimator.history_length:t+prediction_length]]} for t in range(1000, 2000)], freq=t0.freq)
forecast_it, ts_it = make_evaluation_predictions(dataset=test_ds, predictor=predictor, num_samples=1000)
evaluator = Evaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, series_metrics = evaluator(ts_it, forecast_it, num_series=len(test_ds))
print(agg_metrics)
which gives me this result:
{'MSE': 0.0026946234173055926, 'abs_error': 41.87844657897949, 'abs_target_sum': 22620.477743148804, 'abs_target_mean': 22.620477743148804, 'seasonal_error': 0.48385462474823, 'MASE': 0.24065510051029348, 'sMAPE': 0.0019129342584074318, 'MSIS': 1.343544394708052, 'QuantileLoss[0.1]': 16.50373649597168, 'Coverage[0.1]': 0.016, 'QuantileLoss[0.5]': 41.87844657897949, 'Coverage[0.5]': 0.596, 'QuantileLoss[0.9]': 19.33512268066406, 'Coverage[0.9]': 0.956, 'RMSE': 0.051909762254373625, 'NRMSE': 0.0022948128171207983, 'ND': 0.0018513511100208067, 'wQuantileLoss[0.1]': 0.0007295927470395829, 'wQuantileLoss[0.5]': 0.0018513511100208067, 'wQuantileLoss[0.9]': 0.0008547619064553224, 'mean_wQuantileLoss': 0.0011452352545052375, 'MAE_Coverage': 0.07866666666666665}
Note that the MASE in our implementation is the seasonal MASE and therefore not very meaningful for this experiment. MSE and RMSE look low enough to me.
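For reference, the seasonal MASE is roughly MASE = mean_t |y_t - y_hat_t| / mean_t |y_t - y_{t-m}|, where m is the seasonality implied by the frequency (5 for business-daily data, if I recall the defaults correctly). The denominator benchmarks against a seasonal naive forecast, an arbitrary yardstick for a non-seasonal random walk.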
I think this is a classic No Free Lunch example: the model choices in DeepAR are not great for this synthetic data, and a model that would work well for this synthetic data likely would not do well on real data :-).
Hope that helps.
Awesome, thank you both! Working with a synthetic dataset first lets me check that I understand the API and the model.
I believed DeepAR might also be a strong model for processes that are highly correlated with many dynamic features, even when seasonality is weak. I will continue my experiments :-)
Hello,
To practice with GluonTS, I've built a synthetic dataset to train a simple DeepAR model. Basically, I generate a random walk where each step is drawn from a uniform distribution between -1 and +1.
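Concretely, the generation is just (same as in the snippet above):

import numpy as np
terms = np.random.rand(10000) * 2.0 - 1.0  # uniform increments in [-1, 1]
walk = np.cumsum(terms)                    # the random-walk target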
I make it 10,000 steps long and train a DeepAR with a single layer and 5 cells. Each step is provided to the model as a dynamic feature, so there is no uncertainty present. Yet the model performs badly.
This gives me:
{'MSE': 0.2924841361204017, 'abs_error': 433.68280267715454, 'abs_target_sum': 23014.57519197464, 'abs_target_mean': 23.01457519197464, 'seasonal_error': 1.0600714333852133, 'MASE': 0.631015149464367, 'sMAPE': 0.02015682482771147, 'MSIS': 21.265574864578436, 'QuantileLoss[0.1]': 493.3332794189453, 'Coverage[0.1]': 0.653, 'QuantileLoss[0.5]': 433.68280267715454, 'Coverage[0.5]': 0.693, 'QuantileLoss[0.9]': 260.0980569839478, 'Coverage[0.9]': 0.731, 'RMSE': 0.5408180249588596, 'NRMSE': 0.023498935802536428, 'ND': 0.01884383261735733, 'wQuantileLoss[0.1]': 0.021435689136290226, 'wQuantileLoss[0.5]': 0.01884383261735733, 'wQuantileLoss[0.9]': 0.011301449399537299, 'mean_wQuantileLoss': 0.017193657051061618, 'MAE_Coverage': 0.305}
The MASE and RMSE seem pretty high to me. They should be as close to zero as possible given that the model has full knowledge.
Did I forget anything? I doubled the number of layers; while the final training loss was smaller, the MASE was 3.17 and the RMSE was 2.50.
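To make "close to zero" concrete: since walk[t] = walk[t-1] + terms[t], a perfect predictor is trivial to write down (reusing the arrays from the sketch above, purely as an illustration):

pred = walk[:-1] + terms[1:]                    # y_hat[t] = y[t-1] + term[t]
print(np.sqrt(np.mean((walk[1:] - pred)**2)))   # 0.0 up to float rounding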
Thank you! :-) Laurent