Not sure off the top of my head. My guess is that it's related to the size of the datasets being passed around; the train_model() function is a fairly thin wrapper. I'll take a look at the future.apply package, which is what I'm using internally, and see if things could be sped up by explicitly identifying and passing global objects internally.
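For reference, a minimal sketch of the kind of change I have in mind, using future.apply with an explicit list of globals; fit_one_window() and windows are hypothetical stand-ins for the package's internals:

```r
library(future)
library(future.apply)

plan(multisession, workers = 4)

# Hypothetical example: fit_one_window() and windows stand in for the
# package's internals. Listing future.globals explicitly avoids the
# automatic (and potentially slow) export of every object in scope.
results <- future_lapply(
  seq_along(windows),
  function(i) fit_one_window(windows[[i]]),
  future.globals = list(
    fit_one_window = fit_one_window,
    windows = windows
  )
)
```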
Hi Nick,
Thanks for your great support. If you know of another way to tune the hyperparameters that runs at a decent speed, I am all ears.
BR /Edgar
Hi Nick, I found a rather non-standard way to solve this problem:
I wrapped my training code in a Docker image and pushed it to Amazon ECR.
I used the BYOM (bring your own model) functionality of AWS SageMaker to train my models at scale.
I modified my training code to read the hyperparameters from a JSON file that SageMaker mounts under the /opt/ml/ directory (see the sketch after these steps).
I passed my hyperparameters to the estimator object and then on to the tuner object.
I launched 50 HPO jobs with early stopping on single m5.large machines, and everything finished in 2 hours.
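For anyone trying to reproduce this, the JSON-reading step in R looks roughly like the sketch below. It assumes jsonlite is installed in the container and that the tuner passes hyperparameters named eta and max_depth; SageMaker's standard location for the file is /opt/ml/input/config/hyperparameters.json.

```r
library(jsonlite)

# SageMaker mounts the tuning job's hyperparameters as a JSON file
# inside the container at this standard location.
hp_file <- "/opt/ml/input/config/hyperparameters.json"
hp <- jsonlite::fromJSON(hp_file)

# SageMaker serializes every value as a string, so cast before use.
eta <- as.numeric(hp[["eta"]])              # assumed hyperparameter name
max_depth <- as.integer(hp[["max_depth"]])  # assumed hyperparameter name
```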
If my company allows it, I will update the example notebooks.
BR /Edgar
Closing the issue.
Disclaimer: if you do not have SageMaker or another tool that lets you run HPO in parallel (e.g., Katib), you will not be able to implement my solution.
BR /Edgar
Dear Nick,
I am trying to wrap my caret hyperparameter tuning in the forecastML model function, roughly as sketched below.
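A minimal sketch of what I mean; the data objects and the outcome formula are placeholders for my actual setup, and data_train/windows are assumed to come from forecastML's create_lagged_df() and create_windows():

```r
library(forecastML)
library(caret)

# The model function receives one training dataset per validation window.
model_function <- function(data) {
  ctrl <- caret::trainControl(
    method = "cv", number = 3,
    allowParallel = TRUE  # parallelize caret's resampling folds
  )
  caret::train(
    y ~ ., data = data,   # 'y' is a placeholder outcome column
    method = "xgbTree",
    trControl = ctrl,
    nthread = 1           # forwarded to xgboost
  )
}

model_results <- forecastML::train_model(
  lagged_df = data_train,  # placeholder: output of create_lagged_df()
  windows = windows,       # placeholder: output of create_windows()
  model_name = "xgb_caret",
  model_function = model_function
)
```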
Usually, when you use caret, you can enable parallel training across your train/test splits with allowParallel in the fit control function, but you can also set the nthread parameter for xgboost; a standalone example follows.
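Outside of forecastML, that standard setup looks something like this (iris used as a stand-in dataset):

```r
library(caret)
library(doParallel)

# Register a parallel backend; allowParallel = TRUE makes caret use it
# to fit the resamples concurrently.
cl <- makeCluster(4)
registerDoParallel(cl)

ctrl <- trainControl(method = "cv", number = 5, allowParallel = TRUE)

fit <- train(
  Sepal.Length ~ ., data = iris,  # stand-in dataset
  method = "xgbTree",
  trControl = ctrl,
  nthread = 1  # keep xgboost single-threaded so the backends don't compete
)

stopCluster(cl)
```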
If I would also like to do hyperparameter tuning, I would probably do something like the following:
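That is, add an explicit tuning grid; the grid values below are illustrative:

```r
# Illustrative grid; caret's "xgbTree" method expects exactly these
# seven tuning parameters.
tune_grid <- expand.grid(
  nrounds = c(100, 300),
  max_depth = c(3, 6),
  eta = c(0.05, 0.1),
  gamma = 0,
  colsample_bytree = 0.8,
  min_child_weight = 1,
  subsample = 0.8
)

fit <- caret::train(
  y ~ ., data = data,  # placeholders as above
  method = "xgbTree",
  trControl = ctrl,
  tuneGrid = tune_grid,
  nthread = 1
)
```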
The only problem I am facing is that training takes ages. I am not sure whether this is because the future library conflicts with caret or with the nthread parameter, or whether it is something else (I am using 28 cores, by the way).
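To make the question concrete: I suspect the layers of parallelism would need to be capped explicitly, something like the sketch below, so the future workers, caret, and xgboost do not multiply into more threads than there are cores (the numbers are illustrative for a 28-core machine):

```r
library(future)

# e.g. 7 future workers x 4 xgboost threads each = 28 cores in total.
plan(multisession, workers = 7)

model_function <- function(data) {
  ctrl <- caret::trainControl(
    method = "cv", number = 3,
    allowParallel = FALSE  # keep caret sequential inside each future worker
  )
  caret::train(y ~ ., data = data, method = "xgbTree",
               trControl = ctrl,
               nthread = 4)  # cap xgboost's threads per worker
}
```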
Have you ever experienced a similar situation?
BR /Edgar