dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

subsampling for gblinear booster #5379

Open zl365 opened 4 years ago

zl365 commented 4 years ago

In the documentation, subsample is a parameter that appears in the list of parameters for tree boosters only. Please include subsampling for the gblinear booster as well. Subsampling of training instances can be very useful for linear boosters, especially when there are a large number of predictors. It would probably only need a small modification after the gradients are calculated, as sketched below.
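To illustrate the idea (this is a conceptual sketch, not xgboost's actual internals; the helper name and the masking approach are hypothetical): row subsampling could be applied by masking the per-row gradient/hessian pairs before the linear updater runs, so unselected rows simply contribute nothing to the coordinate updates.

```python
import numpy as np

def subsample_gradients(grad, hess, subsample, rng):
    """Hypothetical helper: zero out the gradient/hessian of rows that are
    not selected, so a downstream linear updater effectively ignores them."""
    mask = rng.uniform(size=grad.shape[0]) < subsample
    return grad * mask, hess * mask

# Example with made-up gradients, e.g. from a squared-error objective.
rng = np.random.RandomState(42)
grad = rng.randn(8)
hess = np.ones(8)
g_sub, h_sub = subsample_gradients(grad, hess, subsample=0.5, rng=rng)
print(g_sub)  # unselected rows are zeroed out
```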

I have tried setting the subsample parameter to different values with a gblinear booster, and I got the same results (estimated parameter values and predicted values). So perhaps subsampling only works with tree boosters in the current version of xgboost. If I am mistaken, could anyone please tell me how to implement subsampling of training instances with gblinear boosters correctly? Thank you very much.
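For reference, a minimal sketch of the comparison described above (the synthetic dataset, parameter values, and round count are illustrative assumptions, not the original setup):

```python
import numpy as np
import xgboost as xgb

# Synthetic regression data, just to make the example self-contained.
rng = np.random.RandomState(0)
X = rng.randn(1000, 50)
y = X @ rng.randn(50) + 0.1 * rng.randn(1000)
dtrain = xgb.DMatrix(X, label=y)

preds = {}
for subsample in (0.3, 1.0):
    params = {
        "booster": "gblinear",
        "objective": "reg:squarederror",
        "subsample": subsample,  # accepted, but appears to have no effect for gblinear
    }
    bst = xgb.train(params, dtrain, num_boost_round=20)
    preds[subsample] = bst.predict(dtrain)

# Identical predictions for both settings reproduce the behaviour described above.
print(np.allclose(preds[0.3], preds[1.0]))
```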

trivialfis commented 4 years ago

Subsampling is currently not supported by the linear booster.