danielsc / azureml-workshop-2019

AzureML Workshop for the 2019 Euro Tour
MIT License

[Bug:PaperCut:Perf-Issue] With small datasets, Remote AutoML needs a lot more time per run than Local training #31

Open · CESARDELATORRE opened this issue 4 years ago

CESARDELATORRE commented 4 years ago

This might be because the training datasets are quite small: in remote training, Docker containers may have to be provisioned for each run, whereas local training is straightforward and simply trains on a ready-to-go machine/VM.

If the datasets were large, the time spent on Docker containers would likely be small compared to the training time...

But this is a papercut for folks experimenting with small, downsampled datasets, where the end-to-end training time on remote compute becomes too high because of the infrastructure time needed (containers?):

Local Training: Total Time: 5.7 minutes versus Remote Training: Total Time: 67 minutes

Basically, it is around 5 seconds per child run for local training versus roughly 1.5 minutes per child run for remote training.
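For reference, a minimal sketch of how the two experiments differ in configuration (the dataset and label names below are placeholders, not the workshop's actual ones): the only real difference is compute_target, since a local run executes in the already-provisioned Python environment while a remote run schedules every child run onto the AmlCompute cluster.

```python
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
train_ds = Dataset.get_by_name(ws, "attrition-train")   # placeholder dataset name

common = dict(
    task="classification",
    primary_metric="AUC_weighted",
    training_data=train_ds,
    label_column_name="Attrition",    # placeholder label column
    n_cross_validations=5,
    iterations=32,
)

# Local: trains in the current Python environment, no container provisioning.
local_config = AutoMLConfig(**common)

# Remote: every child run is scheduled on the cluster and pays the container
# startup cost before any actual training happens.
remote_config = AutoMLConfig(compute_target=ws.compute_targets["cesardl-cpu-clus"],
                             max_concurrent_iterations=4,
                             **common)

experiment = Experiment(ws, "classif-automl-local-01-13-2020-05")
```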

Local Training: Total Time: 5.7 minutes

01-13-2020-05
classif-automl-local-01-13-2020-05
Running on local machine
Parent Run ID: AutoML_a8a0a27e-6228-481b-bde0-406ec5a6ded0

Current status: DatasetFeaturization. Beginning to featurize the dataset.
Current status: DatasetEvaluation. Gathering dataset statistics.
Current status: FeaturesGeneration. Generating features for the dataset.
Current status: DatasetFeaturizationCompleted. Completed featurizing the dataset.
Current status: DatasetCrossValidationSplit. Generating individually featurized CV splits.

****************************************************************************************************
DATA GUARDRAILS SUMMARY:
For more details, use API: run.get_guardrails()

TYPE:         Class balancing detection
STATUS:       PASSED
DESCRIPTION:  Classes are balanced in the training data.

TYPE:         Missing values imputation
STATUS:       PASSED
DESCRIPTION:  There were no missing values found in the training data.

TYPE:         High cardinality feature detection
STATUS:       PASSED
DESCRIPTION:  Your inputs were analyzed, and no high cardinality features were detected.

****************************************************************************************************
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   MaxAbsScaler SGD                               0:00:04       0.8716    0.8716
         1   MaxAbsScaler SGD                               0:00:05       0.7696    0.8716
         2   MaxAbsScaler ExtremeRandomTrees                0:00:05       0.7220    0.8716
         3   MaxAbsScaler SGD                               0:00:05       0.8801    0.8801
         4   MaxAbsScaler RandomForest                      0:00:05       0.8154    0.8801
         5   MaxAbsScaler SGD                               0:00:05       0.8682    0.8801
         6   MaxAbsScaler RandomForest                      0:00:05       0.7483    0.8801
         7   StandardScalerWrapper RandomForest             0:00:05       0.7228    0.8801
         8   MaxAbsScaler RandomForest                      0:00:06       0.7415    0.8801
         9   MaxAbsScaler ExtremeRandomTrees                0:00:05       0.8478    0.8801
        10   MaxAbsScaler BernoulliNaiveBayes               0:00:05       0.7823    0.8801
        11   StandardScalerWrapper BernoulliNaiveBayes      0:00:05       0.7347    0.8801
        12   MaxAbsScaler BernoulliNaiveBayes               0:00:05       0.7704    0.8801
        13   MaxAbsScaler RandomForest                      0:00:05       0.7152    0.8801
        14   MaxAbsScaler RandomForest                      0:00:05       0.6591    0.8801
        15   MaxAbsScaler SGD                               0:00:05       0.8733    0.8801
        16   MaxAbsScaler ExtremeRandomTrees                0:00:05       0.8503    0.8801
        17   MaxAbsScaler RandomForest                      0:00:05       0.7100    0.8801
        18   StandardScalerWrapper ExtremeRandomTrees       0:00:05       0.7100    0.8801
        19   StandardScalerWrapper ExtremeRandomTrees       0:00:07       0.8478    0.8801
        20   MaxAbsScaler SGD                               0:00:06       0.8478    0.8801
        21   StandardScalerWrapper LightGBM                 0:00:07       0.8656    0.8801
        22   MaxAbsScaler ExtremeRandomTrees                0:00:06       0.8478    0.8801
        23   MaxAbsScaler LightGBM                          0:00:07       0.8741    0.8801
        24   StandardScalerWrapper LightGBM                 0:00:05       0.8665    0.8801
        25   StandardScalerWrapper SGD                      0:00:06       0.8478    0.8801
        26   StandardScalerWrapper LightGBM                 0:00:07       0.8690    0.8801
        27   MaxAbsScaler LightGBM                          0:00:06       0.8554    0.8801
        28   MaxAbsScaler LightGBM                          0:00:06       0.8478    0.8801
        29   SparseNormalizer ExtremeRandomTrees            0:00:06       0.8478    0.8801
        30   VotingEnsemble                                 0:00:15       0.8860    0.8860
        31   StackEnsemble                                  0:00:14       0.8809    0.8860
Stopping criteria reached at iteration 31. Ending experiment.
Manual run timing: --- 341.8196201324463 seconds needed for running the whole LOCAL AutoML Experiment ---
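The "Manual run timing" line above is presumably just wall-clock time measured around the blocking submit call; a minimal sketch of that measurement, continuing the hypothetical config above:

```python
import time

start_time = time.time()

# show_output=True streams the iteration table and blocks until the local
# experiment finishes, so the elapsed time covers the whole parent run.
local_run = experiment.submit(local_config, show_output=True)

print("--- %s seconds needed for running the whole LOCAL AutoML Experiment ---"
      % (time.time() - start_time))

# For the remote experiment the pattern is the same, except the run is also
# waited on explicitly: remote_run.wait_for_completion(show_output=True).
```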

Remote Training: Total Time: 67 minutes

01-13-2020-05
classif-automl-remote-01-13-2020-05
Running on remote compute: cesardl-cpu-clus
Parent Run ID: AutoML_c833d1c3-ce81-43cc-bdaf-a24858744afd

Current status: DatasetFeaturization. Beginning to featurize the dataset.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
****************************************************************************************************

 ITERATION   PIPELINE                                       DURATION      METRIC      BEST
         0   MaxAbsScaler SGD                               0:01:44       0.8232    0.8232
         1   MaxAbsScaler SGD                               0:01:38       0.7830    0.8232
         2   MaxAbsScaler ExtremeRandomTrees                0:01:36       0.7329    0.8232
         3   MaxAbsScaler SGD                               0:01:37       0.8635    0.8635
         4   MaxAbsScaler RandomForest                      0:01:42       0.7990    0.8635
         5   MaxAbsScaler SGD                               0:01:45       0.8581    0.8635
         6   MaxAbsScaler RandomForest                      0:01:41       0.7444    0.8635
         7   StandardScalerWrapper RandomForest             0:01:43       0.7201    0.8635
         8   MaxAbsScaler RandomForest                      0:01:44       0.7481    0.8635
         9   MaxAbsScaler ExtremeRandomTrees                0:01:43       0.8377    0.8635
        10   MaxAbsScaler BernoulliNaiveBayes               0:01:43       0.7610    0.8635
        11   StandardScalerWrapper BernoulliNaiveBayes      0:01:37       0.7003    0.8635
        12   MaxAbsScaler BernoulliNaiveBayes               0:01:37       0.7466    0.8635
        13   MaxAbsScaler RandomForest                      0:01:45       0.6927    0.8635
        14   MaxAbsScaler RandomForest                      0:01:39       0.6981    0.8635
        15   MaxAbsScaler SGD                               0:01:38       0.8612    0.8635
        16   MaxAbsScaler ExtremeRandomTrees                0:01:47       0.8445    0.8635
        17   MaxAbsScaler RandomForest                      0:01:44       0.7307    0.8635
        18   StandardScalerWrapper ExtremeRandomTrees       0:01:46       0.7186    0.8635
        19   MaxAbsScaler LightGBM                          0:01:48       0.8665    0.8665
        20   StandardScalerWrapper LightGBM                 0:01:40       0.8377    0.8665
        21   StandardScalerWrapper ExtremeRandomTrees       0:01:46       0.8377    0.8665
        22   MaxAbsScaler LightGBM                          0:01:35       0.8612    0.8665
        23   MaxAbsScaler LightGBM                          0:01:40       0.8673    0.8673
        24   TruncatedSVDWrapper LinearSVM                  0:01:44       0.8377    0.8673
        25   StandardScalerWrapper LightGBM                 0:01:44       0.8377    0.8673
        26   StandardScalerWrapper LightGBM                 0:01:44       0.8635    0.8673
        27   StandardScalerWrapper LightGBM                 0:01:38       0.8559    0.8673
        28   SparseNormalizer LightGBM                      0:01:38       0.8543    0.8673
        29   MaxAbsScaler LightGBM                          0:01:34       0.8377    0.8673
        30   StandardScalerWrapper LightGBM                 0:01:43       0.8377    0.8673
        31   StandardScalerWrapper LightGBM                 0:01:42       0.8528    0.8673
        32   StandardScalerWrapper LightGBM                 0:01:41       0.8650    0.8673
        33   StandardScalerWrapper LightGBM                 0:01:44       0.8543    0.8673
        34    VotingEnsemble                                0:02:06       0.8764    0.8764
        35    StackEnsemble                                 0:01:52       0.8703    0.8764
Manual run timing: --- 4020.8364148139954 seconds needed for running the whole Remote AutoML Experiment --- 
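To see where the extra ~1.5 minutes per iteration goes, the per-child-run wall-clock times of both parent runs can be compared from run history. A rough sketch, assuming local_run and remote_run are the two submitted parent runs and that each child's run details carry the usual startTimeUtc/endTimeUtc fields:

```python
from dateutil import parser  # python-dateutil, a common dependency of the SDK

def child_durations(parent_run):
    """Wall-clock seconds per child run (includes setup/featurization runs)."""
    durations = []
    for child in parent_run.get_children():
        details = child.get_details()
        if "startTimeUtc" in details and "endTimeUtc" in details:
            start = parser.isoparse(details["startTimeUtc"])
            end = parser.isoparse(details["endTimeUtc"])
            durations.append((end - start).total_seconds())
    return durations

local_secs = child_durations(local_run)
remote_secs = child_durations(remote_run)
print("avg local child run:  %.1f s" % (sum(local_secs) / len(local_secs)))
print("avg remote child run: %.1f s" % (sum(remote_secs) / len(remote_secs)))
```

If the remote child runs take roughly as long as the local ones once they actually start, the gap is almost entirely container/image and scheduling overhead rather than training time.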
danielsc commented 4 years ago

While I am not too concerned about demo speed, we do need to improve the Docker handling time and cut out image creation for the common cases. See this papercut filed to address it: https://msdata.visualstudio.com/Vienna/_workitems/edit/583388
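Until that lands, one user-side workaround (it does not remove image creation, but it avoids paying node spin-up and image pull on every experiment) is to keep at least one node of the cluster warm. A hedged sketch against the existing cluster; the parameter values are illustrative:

```python
from azureml.core.compute import ComputeTarget

# The cluster from the log already exists, so just raise its floor to one
# always-on node. A warm node keeps the pulled Docker image cached locally,
# so subsequent child runs skip node allocation and image pull.
cluster = ComputeTarget(workspace=ws, name="cesardl-cpu-clus")
cluster.update(min_nodes=1, max_nodes=4, idle_seconds_before_scaledown=3600)
```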

CESARDELATORRE commented 4 years ago

Filed a related papercut with the info:

https://msdata.visualstudio.com/Vienna/_workitems/edit/587850