h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0
6.94k stars, 2k forks

XGBoost multithreading without GPUs #6829

Open wendycwong opened 1 year ago

wendycwong commented 1 year ago

I am trying to use H2OXGBoostClassifier with h2o-pysparkling-3.1 on a multi-node cluster. In the logs, I see messages like the following:

WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!

I have tried installing libgomp, gcc_linux-64, and llvm-openmp, but I am unable to get multicore utilization during model fitting. I'm not a Java expert, but it looks like there's no support for multithreading unless the underlying resource has GPU capabilities. Is this correct? Am I missing something that would enable multithreading?

Thank you for taking a look!
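As a diagnostic aid, here is a minimal sketch for checking whether an OpenMP runtime is visible to the dynamic loader on a given machine. It assumes the warning means H2O fell back to a native XGBoost build without OpenMP, which this thread does not confirm. The function name `openmp_runtime_status` is hypothetical and is not part of H2O or Sparkling Water. It probes the two runtimes mentioned above, libgomp and LLVM's libomp, using only Python's standard `ctypes` module.

```python
import ctypes
import ctypes.util


def openmp_runtime_status(names=("gomp", "omp")):
    """Report whether each candidate OpenMP runtime can be found and loaded.

    Hypothetical helper for illustration only. It inspects the current
    process, so it says nothing about other nodes in the cluster.
    """
    status = {}
    for name in names:
        # find_library returns None when the system loader cannot locate the library.
        located = ctypes.util.find_library(name)
        loadable = False
        if located is not None:
            try:
                ctypes.CDLL(located)  # attempt an actual dlopen
                loadable = True
            except OSError:
                loadable = False
        status[name] = {"path": located, "loadable": loadable}
    return status


if __name__ == "__main__":
    for name, info in openmp_runtime_status().items():
        print(f"lib{name}: found={info['path']} loadable={info['loadable']}")
```

The result only describes the machine and environment where the script runs. On a multi-node cluster the check would need to run on each worker, since the XGBoost native library is loaded there. A library installed into a conda environment may also sit outside the loader's default search path, in which case `find_library` can return `None` even though the package is installed. That is one possible explanation for the behaviour above, not a confirmed cause.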

JIRA link: https://h2oai.atlassian.net/browse/PUBDEV-9076

wendycwong commented 1 year ago

GH link: https://github.com/h2oai/h2o-3/issues/6829

Please refer to the GH link for progress.