meta-QSAR / rmetaqsar

R code for the MetaQSAR project

Use `nfolds` argument in `h2o.deeplearning()` #1

Open · ledell opened this issue 7 years ago

ledell commented 7 years ago

Hi, I came across your code while searching for h2o projects and I took a look at fnc-deeplrn.R. I wanted to let you know that you don't need to do cross-validation manually if you don't want to -- it's built into H2O. All you have to do is use the nfolds argument. If you want to keep the CV predictions, set keep_cross_validation_predictions = TRUE and an n x 1 frame storing the cross-validated predictions will be created and made accessible in the model object. The cross-validation models will also be stored if you need access to them for some reason. If you want to control how the folds are generated, take a look at the fold_assignment argument.

library(h2o)
h2o.init()

# 5-fold CV is handled internally via nfolds; the holdout predictions are
# retained because keep_cross_validation_predictions = TRUE
fit <- h2o.deeplearning(x = 1:4,
                        y = 5,
                        training_frame = as.h2o(iris),
                        nfolds = 5,
                        keep_cross_validation_predictions = TRUE,
                        seed = 1)

# Retrieve the frame of cross-validated holdout predictions stored in the model
cvpreds <- h2o.getFrame(fit@model$cross_validation_holdout_predictions_frame_id$name)
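
If you also want to pull the cross-validation models themselves, or change how rows are assigned to folds, here is a quick sketch using the package's accessor helpers and the fold_assignment argument (function names assume a reasonably recent h2o release):

# The per-fold models and the holdout predictions can also be retrieved
# with accessor helpers instead of going through the frame id
cv_models <- h2o.cross_validation_models(fit)               # list of the 5 fold models
cvpreds2  <- h2o.cross_validation_holdout_predictions(fit)  # same frame as above

# fold_assignment controls how rows are assigned to folds, e.g. "Modulo"
# or "Stratified" instead of the default "AUTO"
fit2 <- h2o.deeplearning(x = 1:4,
                         y = 5,
                         training_frame = as.h2o(iris),
                         nfolds = 5,
                         fold_assignment = "Stratified",
                         seed = 1)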

Hope this is helpful!

iaolier commented 7 years ago

Hi, thanks for your contribution. I was aware that H2O has built-in cross-validation. I didn't use it because I was applying the same splits across many algorithms, some of them implemented in H2O and others using different R packages. By doing CV manually, I had a 100% guarantee that all the algorithms were using the same splits. Cheers, Ivan


ledell commented 7 years ago

Oh, in that case you can add a column to your training frame that contains the desired fold index for each row and specify the fold_column argument. If the name of the column is "fold_index", then you'd set fold_column = "fold_index". That way you still get the benefits of H2O's internal CV while using your own custom folds.
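
For instance, a minimal sketch of that approach (the fold_index values here are simulated as a random 5-fold assignment purely for illustration; in your case they would be the shared splits used by the other algorithms):

library(h2o)
h2o.init()

# Build the fold column in plain R before converting, so the exact same
# indices can be reused by the non-H2O algorithms as well
iris_df <- iris
set.seed(1)
iris_df$fold_index <- sample(rep(1:5, length.out = nrow(iris_df)))
train <- as.h2o(iris_df)

# fold_column replaces nfolds: each row is assigned to the fold given in fold_index
fit <- h2o.deeplearning(x = 1:4,
                        y = 5,
                        training_frame = train,
                        fold_column = "fold_index",
                        keep_cross_validation_predictions = TRUE,
                        seed = 1)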