dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Voting Parallel Learner #10561

Open terraflops1048576 opened 1 month ago

terraflops1048576 commented 1 month ago

LightGBM implements a voting-parallel tree learner to reduce the communication overhead between nodes for datasets with a large number of features. Currently, I'm working on a project that requires on the order of 2000 features, and we've found that, even with NCCL, communication is a major component of the fitting time, especially when scaling to more than one machine with 8 GPUs each. Is there any plan to support the two-round voting scheme proposed in the paper (attached below)? A rough sketch of the idea, as I read it from the paper, follows.
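For reference, here is a single-process sketch of the two-round voting idea as described in the PV-Tree paper, not XGBoost code: each worker proposes its top-k features from its local histograms, the proposals are tallied globally, and full histograms are exchanged only for the 2k winning features. The gain function, array shapes, and helper names below are placeholders of my own, not anything from the codebase.

```python
import numpy as np

def local_top_k(gains, k):
    """Indices of the k features with the highest local split gain."""
    return np.argsort(gains)[::-1][:k]

def voting_parallel_split(local_histograms, k):
    """Two-round voting from the PV-Tree paper (illustrative only).

    local_histograms: list over workers, each an array of shape
        (n_features, n_bins) holding gradient sums for one worker's rows.
    Returns the winning feature index and the merged candidate histograms.
    """
    # Round 1 (local voting): each worker proposes its top-k features
    # based on gain computed from its own histogram only.
    proposals = []
    for hist in local_histograms:
        # Stand-in gain: variance of the per-bin gradient sums.
        gains = hist.var(axis=1)
        proposals.append(local_top_k(gains, k))

    # Round 2 (global voting): tally the proposals and keep the 2k
    # features with the most votes. Only feature indices have been
    # exchanged so far, so communication is tiny.
    votes = np.zeros(local_histograms[0].shape[0], dtype=int)
    for prop in proposals:
        votes[prop] += 1
    candidates = np.argsort(votes)[::-1][: 2 * k]

    # Full-histogram aggregation, restricted to the 2k candidates.
    # In a distributed setting this would be an allreduce over
    # (2k x n_bins) values instead of (n_features x n_bins).
    merged = sum(hist[candidates] for hist in local_histograms)
    best = candidates[np.argmax(merged.var(axis=1))]
    return best, merged

# Toy usage: 4 workers, 2000 features, 256 histogram bins, k = 16.
rng = np.random.default_rng(0)
hists = [rng.normal(size=(2000, 256)) for _ in range(4)]
feature, _ = voting_parallel_split(hists, k=16)
print("selected feature:", feature)
```

The point is that the per-split communication volume drops from being proportional to the total feature count (2000 in our case) to being proportional to 2k.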

Currently, XGBoost supports data-parallel and feature-parallel learning through the data_split_mode parameter on DMatrix.
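For context, this is roughly how we use it today (a minimal sketch; I'm assuming the DataSplitMode enum lives in xgboost.core, which may differ by version):

```python
import numpy as np
import xgboost as xgb
from xgboost.core import DataSplitMode  # import location may vary by version

X = np.random.rand(1000, 2000)
y = np.random.rand(1000)

# Default: rows are partitioned across workers (data-parallel).
drow = xgb.DMatrix(X, label=y, data_split_mode=DataSplitMode.ROW)

# Feature-parallel: each worker holds a column slice of the matrix.
# Only meaningful inside a distributed/collective context; on a single
# worker it behaves like the row split.
dcol = xgb.DMatrix(X, label=y, data_split_mode=DataSplitMode.COL)

booster = xgb.train({"tree_method": "hist"}, drow, num_boost_round=10)
```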

Any pointers to the code or a rough implementation plan would also be appreciated, as I'm not familiar with this codebase.

LightGBM-a-communication-efficient-parallel-algorithm-for-decision-tree.pdf

trivialfis commented 1 month ago

Thank you for opening the issue. Yeah, we thought about supporting it, but so far we have been focusing on scaling with the number of samples instead of features.

Any pointers to the code or a rough implementation plan would also be appreciated, as I'm not familiar with this codebase.

I'm not familiar with the details of the algorithm yet. Will look into it and see how difficult it is.