Open adarwade1 opened 4 years ago
@adarwade1 What exactly do you mean by peaks? It sounds like you are talking about outliers, i.e. a few observations which are far away from the usual level. In this case there are two options:
Please refer to the details below. Objective: predict high peaks. We are currently working for a postal services customer. There are multiple post offices, and each post office has multiple sorting processes for different kinds of letters.
For one category of letters, the volume arrives in one large chunk on a single day (in the third or fourth week of each month). The requirement is that, instead of ignoring such peaks (chunk size), the model should forecast them.
We used "week of month" as one of the features, but that does not help predict values closer to the peak either.
Can you please advise on how we can predict high peaks correctly using GluonTS?
Please let me know if you need any further details. The GluonTS Simple Feed Forward model forecasts the other values with better accuracy, but there is a big difference between actual and forecasted values at the peaks.
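For readers following along, the "week of month" feature mentioned above can be sketched as below. This is only an illustration: the function names and the convention that days 1-7 form week 1 are assumptions, not something stated in the thread.

```python
from datetime import date, timedelta

def week_of_month(d: date) -> int:
    """1-based week of month, assuming days 1-7 are week 1, 8-14 week 2, etc."""
    return (d.day - 1) // 7 + 1

def peak_window_flag(d: date, weeks=(3, 4)) -> int:
    """1 if the date falls in one of the given weeks of the month, else 0."""
    return 1 if week_of_month(d) in weeks else 0

# Flags for September 2020 (30 days), one value per day.
start = date(2020, 9, 1)
days = [start + timedelta(days=i) for i in range(30)]
flags = [peak_window_flag(d) for d in days]
```

Note that a flag like this marks 14 days per month while the peak lands on a single day, so on its own it only narrows the window. If the actual delivery date is known in advance, a one-day indicator would carry more information.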
@adarwade1 My first guess is that you need additional features. From your description I suggest features like weekdays, public holidays and other features which are informative about the peaks.
@adarwade1 is DeepAREstimator able to understand the presence of peaks in any way (even if not to a great accuracy)? Currently the SimpleFeedForwardEstimator does not use any date-related feature, so if those are in any way informative of peaks, maybe adding them would be beneficial as @kaijennissen suggests
We used most of the date features to handle seasonality and trends.
time_feat:
- day of week
- day of month
- week of month
- week of year
- season_number
feat_dynamic_real:
- is weekend
- is holidays
- is non production days
feat_static_cat:
The main challenge here is that the peak values are far too high compared to the mean.
GluonTS Simple Feed Forward is able to process 1240 time series at a time. The algorithm is consistent across different data patterns, including the latest data.
The output is used for resource planning: days with more volume require more resources. That is why we need to forecast peaks with better accuracy.
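As a rough sketch of how a peak indicator could be passed as feat_dynamic_real, the snippet below builds one dataset entry as a plain dict with the field names GluonTS uses ("start", "target", "feat_dynamic_real"). The helper name, the dates, and all numbers are made up for illustration. At prediction time the dynamic feature has to cover the history plus the forecast horizon, which is what the length check enforces. DeepAREstimator only reads this field when use_feat_dynamic_real=True is set.

```python
def build_prediction_entry(start, target, peak_flag, prediction_length):
    """Build one GluonTS-style entry (plain dict) for prediction.

    peak_flag must span the observed history AND the forecast horizon,
    because the model needs the feature values for the days it predicts.
    """
    expected = len(target) + prediction_length
    if len(peak_flag) != expected:
        raise ValueError(
            f"peak_flag has {len(peak_flag)} values, expected {expected}"
        )
    return {
        "start": start,
        "target": list(target),
        # Shape is (num_features, time); here there is a single feature.
        "feat_dynamic_real": [list(peak_flag)],
    }

# Hypothetical series: 28 observed days with one peak, 28 days to forecast.
history = [500.0] * 28
history[20] = 40000.0
flags = [0] * 56
flags[20] = 1   # observed peak day
flags[48] = 1   # peak day expected inside the forecast horizon (assumed known)

entry = build_prediction_entry("2020-09-01", history, flags, prediction_length=28)
```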
@adarwade1 The open question is whether it is possible to forecast the peaks (follow some learnable pattern) or not (peaks are random). As @lostella pointed out, it would be helpful to know if the DeepAREstimator is able to detect the peaks and struggles just with the magnitude of the peaks or if peaks are ignored.
Could you share a plot of one example?
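One possible way to separate "timing is detected but magnitude is off" from "peaks are ignored" is sketched below. It is an illustration only: the function name, the median-based peak definition, and the factor of 5 are assumptions, and the numbers are invented.

```python
from statistics import median

def peak_report(actual, forecast, peak_factor=5.0):
    """For each actual peak, report whether the forecast also spikes there
    and how much of the peak magnitude it recovers.

    A value counts as a peak when it exceeds peak_factor times the median
    of its own series (a simple heuristic, not a GluonTS feature).
    """
    base_actual = median(actual)
    base_forecast = median(forecast)
    rows = []
    for i, a in enumerate(actual):
        if a > peak_factor * base_actual:
            rows.append({
                "index": i,
                "timing_detected": forecast[i] > peak_factor * base_forecast,
                "magnitude_ratio": forecast[i] / a,
            })
    return rows

actual = [500, 600, 550, 40000, 520, 580, 610]
forecast = [510, 590, 560, 6000, 530, 570, 600]
report = peak_report(actual, forecast)
```

In this made-up example the forecast does spike on the right day but recovers only a small fraction of the peak, which would point to a magnitude problem rather than a timing problem.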
Sure. Let me train DeepAR and share the summary with you.
Hi All,
Apologies for the delay in response; we were busy with some deliverables. Please refer to the sample data below.
-High peaks -SFF
SFF actual vs forecast
DeepAR actual vs forecast results
The last one shows the DeepAR results. The magnitude of the peaks is not handled by either DeepAR or SFF.
@kaijennissen / @lostella, would you be able to join an MS Teams meeting, if you are comfortable with that, based on your availability?
Many thanks for the guidance. We will check and update you on the same.
We used the DeepAR hyperparameters below. The NegativeBinomialOutput distribution throws NaN errors in the first or second epoch. The data doesn't contain NaN values; we also tried reducing the learning rate, but that did not help with the NegativeBinomialOutput distribution.
We are also using the latest mxnet version, as suggested in articles.
It works fine with the default distribution.
Can you please suggest suitable hyperparameters? From what we have read, DeepAR works well for most scenarios, with the best accuracy. We feel that there might be some mistake on our side.
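One thing that may be worth ruling out before tuning hyperparameters: the negative binomial is a distribution over non-negative integer counts, so negative or fractional targets do not fit it. The check below is a generic sanity test with a hypothetical helper name. It is not a confirmed explanation for the NaN errors described above.

```python
import math

def check_count_target(target):
    """Return a list of (index, value, reason) for values that are not
    valid non-negative integer counts."""
    problems = []
    for i, v in enumerate(target):
        if v is None or (isinstance(v, float) and not math.isfinite(v)):
            problems.append((i, v, "missing or non-finite"))
        elif v < 0:
            problems.append((i, v, "negative"))
        elif float(v) != int(v):
            problems.append((i, v, "non-integer"))
    return problems

clean = check_count_target([0, 3, 80000])
dirty = check_count_target([1, -2, 2.5, float("nan")])
```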
@adarwade1 Plots look like I expected. DeepAR/SFF do pick up the weekly seasonality but there is currently no feature which explains the peaks. I think your problem is to correctly predict the timing of these peaks. Maybe the methods described in this paper and the corresponding code are helpful.
Ans: These high-magnitude peaks occur in the third or fourth week of the month, so we introduced a week-of-month feature. SFF showed some improvement in accuracy with the inclusion of this feature.
Can you please share any documents or links that summarize recommended DeepAR hyperparameters or best-practice guidelines, if handy?
We went through many articles but were not able to locate any detailed documentation.
I see that some time series in your plots exhibit this spiky behaviour, while others do not. @adarwade1 one thing to try is then to train an estimator only on data that has spikes, and verify that the model can learn that behaviour: that should be the case, if the spikes follow some predictable pattern.
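A minimal sketch of how the spiky series could be separated out for such an experiment is shown below. The helper names, the max-to-median heuristic, and the ratio of 10 are assumptions chosen for illustration; the series IDs and values are invented.

```python
from statistics import median

def is_spiky(target, ratio=10.0):
    """True if the largest value exceeds `ratio` times the series median.
    The median is floored at 1.0 so all-zero baselines do not divide away."""
    base = max(median(target), 1.0)
    return max(target) > ratio * base

def split_by_spikes(series_by_id, ratio=10.0):
    """Split a dict of {series_id: values} into spiky and non-spiky parts."""
    spiky = {k: v for k, v in series_by_id.items() if is_spiky(v, ratio)}
    smooth = {k: v for k, v in series_by_id.items() if k not in spiky}
    return spiky, smooth

series = {
    "office_a": [500, 600, 550, 40000, 520, 580, 610],
    "office_b": [500, 600, 550, 640, 520, 580, 610],
}
spiky, smooth = split_by_spikes(series)
```

The spiky subset can then be used to train a separate estimator and check whether it learns the peak pattern.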
@lostella /@kaijennissen ,
We train and validate the model with the same data and seed values, but we observe a different accuracy every time.
We used the seed values below, as mentioned in the documentation: mx.random.seed(0) np.random.seed(0)
Can you please advise on the same?
Thanks and Regards, Abhijit
I found a similar GitHub issue: Not able to achieve reproducible training and results #1040
I am using CPU on a single Azure Machine Learning compute instance (a single VM with a good configuration).
@adarwade1 in #1040 I posted a snippet with which I'm getting consistently the same exact results every time, if I'm not wrong. Could you try that?
Also, to avoid confusion, feel free to open a separate issue about this, given that it is unrelated to the original post here.
@lostella, regarding the handling of high peaks: please let us know your availability so that we can connect.
Thanks and Regards, Abhijit
@lostella, We tried our best, but DeepAR is not able to handle high peaks. Can you please suggest any hyperparameters that can help?
@adarwade1 Have you considered the case that this is not a problem of the DeepAR Model but of your specific problem?
Currently I am using GluonTS Simple Feed Forward and DeepAR for a forecasting problem.
GluonTS SFF works better and more consistently than DeepAR.
We need your advice on how to handle high peaks in Simple Feed Forward. To give you an idea, the difference between peaks and normal values is between 50,000 and 80,000.
The normal values (mean) are in the range of 100-1000, while peaks are in the range of 30,000-80,000.
Note: Facebook Prophet has a changepoint detection hyperparameter. Is there any such hyperparameter available in GluonTS Simple Feed Forward or DeepAR?
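With normal values in the hundreds and peaks in the tens of thousands, one generic option (not something proposed in this thread, and not a GluonTS setting) is to train on a log-transformed target and transform the forecasts back. The sketch below uses log1p/expm1 with hypothetical helper names and invented values.

```python
import math

def to_log_scale(target):
    """Compress the range of a non-negative series with log(1 + x)."""
    return [math.log1p(v) for v in target]

def from_log_scale(values):
    """Invert to_log_scale with exp(x) - 1."""
    return [math.expm1(v) for v in values]

raw = [500.0, 600.0, 50000.0]
logged = to_log_scale(raw)
restored = from_log_scale(logged)

# The peak-to-normal ratio shrinks from 100x on the raw scale
# to under 2x on the log scale.
raw_ratio = raw[2] / raw[0]
log_ratio = logged[2] / logged[0]
```

One caveat: back-transforming a mean forecast made on the log scale does not give the mean on the original scale, so peak magnitudes may still be underestimated and should be checked against actuals.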