szilard closed this issue 8 years ago
Random forests, n=1M, 500 trees:
h2o 2.8: time 600s, RAM 5GB, AUC 75.5 https://github.com/szilard/benchm-ml/blob/master/2-rf/4-h2o.R
h2o 3.0: time 450s, RAM 5GB, AUC 73.4 https://github.com/szilard/benchm-ml/blob/master/2-rf/4-h2o-v3.R
AUC is lower in 3.0.
cc: @arnocandel
GLM, n=10M:
h2o 2.8: time 5s, RAM 3GB, AUC 71.0 https://github.com/szilard/benchm-ml/blob/master/1-linear/4-h2o.R
h2o 3.0: time 25s, RAM 4GB, AUC 71.1 https://github.com/szilard/benchm-ml/blob/master/1-linear/4-h2o-v3.R
Run time is longer in 3.0.
GLM: @arnocandel says GLM is slower because "the models now also compute training/validation metrics such as AUC while building the model".
RF, n=1M:
h2o 3.0.0.16: time 600s, RAM 5GB, AUC 75.2 (AUC better than in 3.0) https://github.com/szilard/benchm-ml/blob/master/2-rf/4-h2o-v3.R
GBM, n=1M:
learn_rate = 0.1, max_depth = 6, n_trees = 300 (experiment B in main README)
h2o-2: time 60s, RAM 5GB, AUC 74.3
h2o-3.0.0.16: time 40s, RAM 10GB, AUC 75.1 (+++)

learn_rate = 0.01, max_depth = 16, n_trees = 1000 (experiment A in main README)
h2o-2: time 900s, RAM 9GB, AUC 75.9
h2o-3.0.0.16: time 900s, RAM 10GB, AUC 76.0

GBM, n=10M:
learn_rate = 0.01, max_depth = 20, n_trees = 5000, nbins = 1000
h2o-2: time 7.5hrs, AUC 79.8
h2o-3.0.0.16: time 9.5hrs, AUC 81.2 (+++)
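
To make the old-vs-new comparisons above easier to scan, here is a minimal Python sketch that turns the reported numbers into a run-time ratio (new / old) and an AUC difference per experiment. The `Run` class and `summarize` helper are hypothetical names introduced only for this illustration; they are not part of h2o or of the benchmark scripts. The figures are copied from the comments in this thread and are not re-measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One old-vs-new comparison, using numbers reported in this thread."""
    label: str
    old_time_s: float
    new_time_s: float
    old_auc: float
    new_auc: float

    @property
    def time_ratio(self) -> float:
        # > 1 means the newer version is slower, < 1 means it is faster.
        return self.new_time_s / self.old_time_s

    @property
    def auc_delta(self) -> float:
        # Positive means the newer version has the higher AUC.
        return round(self.new_auc - self.old_auc, 1)


HOUR = 3600  # seconds per hour, for the runs reported in hours

# Baseline is h2o 2.8 / h2o-2 in every row (assumption: same hardware per pair).
RUNS = [
    Run("RF n=1M (3.0)", 600, 450, 75.5, 73.4),
    Run("RF n=1M (3.0.0.16)", 600, 600, 75.5, 75.2),
    Run("GLM n=10M (3.0)", 5, 25, 71.0, 71.1),
    Run("GBM B n=1M (3.0.0.16)", 60, 40, 74.3, 75.1),
    Run("GBM A n=1M (3.0.0.16)", 900, 900, 75.9, 76.0),
    Run("GBM n=10M (3.0.0.16)", 7.5 * HOUR, 9.5 * HOUR, 79.8, 81.2),
]


def summarize(runs):
    """Return (label, time ratio rounded to 2 places, AUC delta) per run."""
    return [(r.label, round(r.time_ratio, 2), r.auc_delta) for r in runs]


if __name__ == "__main__":
    for label, ratio, delta in summarize(RUNS):
        print(f"{label:24s} time x{ratio:<5} AUC {delta:+.1f}")
```

Read this way, the GLM slowdown (5x) and the RF AUC drop in 3.0 (-2.1) stand out, while the GBM rows show equal or better AUC in 3.0.0.16.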
Spot check runtime/AUC/RAM for linear and RF for at least 1 size.