Closed by vidarsumo 2 years ago
When parallel_processing = 'azure_batch', each time series is submitted to a separate Azure VM. All models (XGBoost plus any others that were selected to run) are then trained and refitted on that VM. To use all available cores on that VM for things like hyperparameter tuning, refitting to get back-test results, and refitting to create ensemble training data, set run_model_parallel = TRUE, which allows those steps to run in parallel within that VM.
So if you want to create a global model (one XGBoost for all the time series), you would not set parallel_processing = 'azure_batch', right?
Global models run on their own VM in Azure Batch, so you can still set parallel_processing = "azure_batch". Finn will then run the global models in Azure on a VM, similar to how each time series runs on its own VM.
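For illustration, the settings discussed above could be combined roughly as follows. This is only a sketch: the argument names reflect my reading of finnts::forecast_time_series and should be checked against the installed version, and the column names "id" and "value" and the data frame hist_data are hypothetical. The snippet itself only builds and prints the argument list. The actual call is left commented out because it needs a configured Azure Batch cluster.

```r
# Sketch only: argument names are assumed to match finnts::forecast_time_series;
# verify them against the version of finnts you have installed.
finn_args <- list(
  combo_variables     = c("id"),        # hypothetical column identifying each series
  target_variable     = "value",        # hypothetical column holding the values
  date_type           = "month",
  forecast_horizon    = 3,
  models_to_run       = c("xgboost"),
  run_global_models   = TRUE,           # one model trained across all series
  run_local_models    = TRUE,           # one model per individual series
  parallel_processing = "azure_batch",  # each series (and the global models) on its own VM
  run_model_parallel  = TRUE            # use all cores within each VM
)

# With a configured Azure Batch cluster and a data frame `hist_data` (hypothetical):
# forecast <- do.call(
#   finnts::forecast_time_series,
#   c(list(input_data = hist_data), finn_args)
# )

# Inspect the assembled settings.
str(finn_args)
```

With settings like these, the local XGBoost models would be tuned once per time series, while the global XGBoost model would be tuned once on its own VM.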
I was reading the article on parallel processing, where it says that when setting run_model_parallel = TRUE, each time series is run separately. Using Azure, this would mean one VM per time series. Do I understand correctly that you are effectively creating one model per time series? So if I have 1000 related time series (e.g. from a retailer) and want to use, say, XGBoost, you would tune XGBoost 1000 times, once per time series?
Or is run_model_parallel = TRUE to be used for univariate models only, when you have many time series, and run_model_parallel = FALSE when you have one global model, like XGBoost?
Btw, I have a good feeling about this package, still exploring :)