fredn19 closed 4 months ago
Hello @fredn19,
Thanks for opening the topic. You are right, normally you would expect less execution time for a smaller number of refits.
The reason why the time for refit=True was 2 min and for refit=7 was 30 min is parallelization. While refit=True allows parallelization, when refit is an integer other than 1, parallelization is not possible because the various model fits must follow a logical order. This configuration is automated thanks to n_jobs='auto'.
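To illustrate the ordering constraint, here is a minimal sketch in plain Python. It is not skforecast's implementation: fold_plan is a hypothetical helper, and it assumes folds are numbered from 0 and that a refit happens on every refit-th fold.

```python
def fold_plan(n_folds, refit):
    """For each fold, return the index of the fold whose fitted model it uses.

    Hypothetical helper for illustration only (not part of skforecast).
    refit=True  -> a new model is fitted in every fold
    refit=False -> the model is fitted once, in fold 0
    refit=k     -> a new model is fitted every k folds (k is a positive int)
    """
    # bool must be checked first because bool is a subclass of int in Python
    if isinstance(refit, bool):
        every = 1 if refit else None
    else:
        every = refit

    plan = []
    last_fit = 0  # fold 0 always fits a model
    for fold in range(n_folds):
        if every is not None and fold % every == 0:
            last_fit = fold
        plan.append(last_fit)
    return plan


# Every fold uses its own model, so folds do not depend on each other
print(fold_plan(10, True))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Folds 1-6 reuse the model fitted in fold 0, folds 8-9 reuse fold 7's model
print(fold_plan(10, 7))     # [0, 0, 0, 0, 0, 0, 0, 7, 7, 7]
```

With refit=True each fold is self-contained, so the folds can be handed to different workers. With refit=7 most folds need a model produced by an earlier fold, which is the ordering described above.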
What I can suggest to reduce the time when refit=7 is to lower n_boot. A value around 200 will probably give you similar results.
By the way, you might be interested in reading this article about metrics when building a forecasting model:
https://towardsdatascience.com/forecast-kpi-rmse-mae-mape-bias-cdc5703d242d
Hope it helps!
Javier
Thank you so much for the explanation Javier!
It makes way more sense to me now :-)
I also see your point about the metric, thanks for the heads up!
I'm currently working on a project in which I am trying to predict the hourly energy consumption of a household. I'm trying to predict 24 hours ahead but with a 12-hour gap, so the prediction horizon is actually 36 hours.
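As a rough illustration of that timing (not the code used in the project), the sketch below computes which hours one forecast covers. The function name forecast_window and the example timestamp are assumptions made for the example; it also assumes the gap counts the hours skipped between the last observed value and the first predicted one.

```python
from datetime import datetime, timedelta


def forecast_window(last_observed, gap=12, steps=24):
    """Return the first and last hour covered by one forecast.

    Hypothetical helper: `gap` hours are skipped after the last observed
    hour, then `steps` consecutive hours are predicted.
    """
    first = last_observed + timedelta(hours=gap + 1)
    last = last_observed + timedelta(hours=gap + steps)
    return first, last


# Hypothetical forecast origin: last observation at 11:00 on 1 January 2023
origin = datetime(2023, 1, 1, 11)
first, last = forecast_window(origin)

print(first)          # 2023-01-02 00:00:00
print(last)           # 2023-01-02 23:00:00
print(last - origin)  # 1 day, 12:00:00 (36 hours)
```

The last predicted hour lies 36 hours after the last observation, which matches the 24 + 12 hour horizon described above.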
However, as I'm looking to predict an entire year on an hourly basis, I'm playing around with the refit parameter. Here I have found that using refit=1, meaning that the model refits every 24 hours, is way faster than using refit=7, meaning that the model refits once a week. This does not make sense to me, and I am therefore interested in hearing whether I have understood the mechanism correctly.
The difference in prediction time for a year is about 2 min for refit=1 (True) and 30 min for refit=7.
Here is my backtester:
And here is the forecaster: