h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

XGBoost using GPU library in GPUless environments #8258

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

While using H2O's XGBoost for evaluation, we noticed that the algorithm performed much worse in some environments than in others.

Using the same model to evaluate the same dataset, one environment took 25 minutes and another took a minute and a half. One might assume the faster machine was simply the more powerful one, but the opposite is true: the machine that took 25 minutes was the more powerful of the two.

We then noticed that the slower machine was using the libxgboost4j_gpu.so native library, while the faster machine was unable to load it and was using the libxgboost4j_minimal.so native library instead.
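For anyone who wants to check which native library a running JVM has mapped, here is a minimal Python sketch. It assumes a Linux host and that the mapped file keeps a name of the form libxgboost4j_*.so (an extracted temporary copy could be named differently). The helper names are hypothetical and not part of H2O.

```python
import re
from pathlib import Path
from typing import Set

# Matches the XGBoost4J native library names mentioned in this report,
# e.g. libxgboost4j_gpu.so or libxgboost4j_minimal.so.
_LIB_PATTERN = re.compile(r"libxgboost4j_\w+\.so")


def xgboost_libs_in_maps(maps_text: str) -> Set[str]:
    """Return the distinct libxgboost4j_*.so names found in /proc/<pid>/maps text."""
    return set(_LIB_PATTERN.findall(maps_text))


def xgboost_libs_for_pid(pid: int) -> Set[str]:
    """Read /proc/<pid>/maps (Linux only) and report the mapped XGBoost4J libraries."""
    return xgboost_libs_in_maps(Path(f"/proc/{pid}/maps").read_text())
```

Calling `xgboost_libs_for_pid(<H2O JVM pid>)` after the first XGBoost call should show whether the GPU or the minimal library was picked up, provided the assumptions above hold.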

After we made the GPU library unavailable on the slower machine, its performance became equivalent to that of the faster machine.

This made us realize that the GPU library probably should not be loaded when no GPU is available. Our guess is that the GPU library performs very poorly when it runs on the CPU.

For now, we are forcing XGBoost to load the minimal library, but we think it would be better for library loading to take the runtime environment into account, because the GPU library kept being loaded in an environment that had no GPU.
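To illustrate the kind of environment-aware selection we have in mind, here is a rough Python sketch. It is not H2O's actual loader logic. The GPU check is a simple heuristic we are assuming (the `nvidia-smi` tool on the PATH, or a `/dev/nvidia0` device node), and the function names are hypothetical. Only the two library file names come from what we observed.

```python
import shutil
from pathlib import Path
from typing import Callable

# Library file names as observed in this report.
GPU_LIB = "libxgboost4j_gpu.so"
MINIMAL_LIB = "libxgboost4j_minimal.so"


def nvidia_gpu_seems_present() -> bool:
    """Heuristic (an assumption, not H2O's check): look for the NVIDIA
    driver tool on the PATH or for the first NVIDIA device node."""
    return shutil.which("nvidia-smi") is not None or Path("/dev/nvidia0").exists()


def choose_native_library(gpu_check: Callable[[], bool] = nvidia_gpu_seems_present) -> str:
    """Pick the GPU library only when the check reports a GPU; otherwise
    fall back to the minimal (CPU) library."""
    return GPU_LIB if gpu_check() else MINIMAL_LIB
```

Passing the check in as a parameter keeps the selection logic separate from the detection heuristic, so a more reliable probe (for example, one that queries the CUDA runtime) could be swapped in without changing the fallback behavior.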

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-7379
Assignee: New H2O Bugs
Reporter: Miguel Cruz
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A