H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
I was wondering if you could benchmark the training time of H2O GLM models (logistic regression classifier) on a sparse (mostly zeros) vs. a dense input matrix. There is a big difference in training time between sparse and dense when I use the sparse version of the same training matrix with the R package "glmnet". I was wondering whether the same applies to the H2O GLM trainer. I also expect the memory usage for sparse input to be much lower than for dense. I know a sparse input matrix is possible: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/faq/data.html
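As a side note, the memory expectation can be sketched independently of H2O. The snippet below (a minimal illustration, not H2O's internal storage; the matrix size and ~1% density are assumptions for the example) compares the footprint of the same mostly-zero matrix stored dense (NumPy) vs. sparse (SciPy CSR):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# A 10,000 x 200 matrix with roughly 1% nonzero entries (illustrative sizes).
dense = np.zeros((10_000, 200))
mask = rng.random(dense.shape) < 0.01
dense[mask] = rng.standard_normal(mask.sum())

csr = sparse.csr_matrix(dense)

dense_bytes = dense.nbytes
# CSR stores only the nonzero values plus their column indices and row pointers.
sparse_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

print(f"dense:  {dense_bytes / 1e6:.1f} MB")
print(f"sparse: {sparse_bytes / 1e6:.2f} MB")
```

At ~1% density the CSR representation is well over an order of magnitude smaller; whether H2O's GLM realizes a comparable training-time gain is exactly what the benchmark request above is asking.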