H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
First step: hash all features (numerical and categorical) into D dimensions.
Second step: bin numerical columns into N buckets with h2o.cut().
Third (optional) step: add interactions between selected columns, computed on the fly.
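
The three steps above can be sketched in plain Python as follows. This is only an illustration under several assumptions: `hash_index`, `bin_value`, and `featurize` are hypothetical helper names, CRC32 stands in for whatever hash function the real implementation would use, and a simple `bisect` lookup stands in for h2o.cut(). The sketch also bins numerical values before hashing them, so that each bucket label is hashed like a categorical level; the ticket does not say how the two steps are meant to combine.

```python
import zlib
from bisect import bisect_right


def hash_index(token, D):
    """Map a string token to a slot in [0, D) with a deterministic hash."""
    return zlib.crc32(token.encode("utf-8")) % D


def bin_value(x, breaks):
    """Return the bucket number of x given sorted break points (stand-in for h2o.cut())."""
    return bisect_right(breaks, x)


def featurize(row, num_breaks, interactions, D):
    """Turn one row (dict of column -> value) into sorted, unique hashed indices.

    num_breaks:   dict of numerical column -> sorted list of break points
    interactions: list of (column_a, column_b) pairs to cross on the fly
    """
    tokens = {}
    for col, val in row.items():
        if col in num_breaks:
            # Numerical column: replace the value by its bucket label.
            val = "bin%d" % bin_value(val, num_breaks[col])
        tokens[col] = "%s=%s" % (col, val)

    indices = [hash_index(t, D) for t in tokens.values()]
    for a, b in interactions:
        # Interaction feature: hash the concatenation of both tokens.
        indices.append(hash_index(tokens[a] + "&" + tokens[b], D))
    return sorted(set(indices))


row = {"age": 37.0, "country": "CZ", "device": "mobile"}
active = featurize(row, {"age": [18, 30, 50]}, [("country", "device")], D=2 ** 20)
```

Each row becomes a short list of active slots, so memory is bounded by D regardless of how many distinct categorical levels appear. Collisions are possible and are simply tolerated in this sketch.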
Implement a quick-and-dirty single-node version of FTRL with L1 regularization and an adaptive learning rate: https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf
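
For orientation, a minimal single-node sketch of the per-coordinate FTRL-Proximal update for logistic regression, following the algorithm described in the linked paper, might look like the code below. It assumes binary features given as a list of unique active hashed indices; the class name `FTRLProximal`, its method names, and the default hyperparameters are illustrative choices, not an existing H2O API.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic loss on binary (0/1) features."""

    def __init__(self, D, alpha=0.1, beta=1.0, l1=1.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * D  # adjusted gradient sums
        self.n = [0.0] * D  # sums of squared gradients (drive the adaptive rate)

    def _weight(self, i):
        # Weights are derived lazily from z and n; L1 keeps small ones at exactly 0.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, idx):
        wx = sum(self._weight(i) for i in idx)
        wx = max(min(wx, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-wx))

    def update(self, idx, y):
        """One online step on a single example; idx must hold unique indices."""
        p = self.predict(idx)
        g = p - y  # gradient of the log loss w.r.t. each active weight (x_i = 1)
        for i in idx:
            w = self._weight(i)
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


model = FTRLProximal(D=8, l1=0.1)
for _ in range(50):
    model.update([0, 1], 1)  # indices 0 and 1 co-occur with the positive class
    model.update([2, 3], 0)  # indices 2 and 3 co-occur with the negative class
```

Only `z` and `n` are stored per coordinate, and the effective learning rate for coordinate i shrinks roughly as alpha / (beta + sqrt(n_i)), which is what makes the rate adaptive. Coordinates whose accumulated `z` stays within the L1 threshold keep a weight of exactly zero.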