jinlow / forust

A lightweight gradient boosted decision tree package.
https://jinlow.github.io/forust/
Apache License 2.0

Considering sampling data to determine cuts for bins #17

Open jinlow opened 1 year ago

jinlow commented 1 year ago

Currently, in the GradientBooster fit method, all of the data is used to determine the cuts for binning the data. It would likely speed things up if we allowed a sample to be used for the initial binning. This could be a parameter such as initial_bin_sample_size or something like that.
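To make the idea concrete, here is a minimal sketch in plain Python of estimating quantile cuts from a random sample and then binning the full column with them. It is not forust's actual implementation: the helper names `quantile_cuts` and `assign_bins`, the `seed` argument, and the quantile-index rule are all assumptions for illustration. Only the parameter name `initial_bin_sample_size` comes from the proposal above.

```python
import random
from bisect import bisect_right


def quantile_cuts(values, n_bins, initial_bin_sample_size=None, seed=0):
    """Return sorted, de-duplicated cut points for one feature column.

    Hypothetical helper: if initial_bin_sample_size is set and smaller than
    the column, the cuts are estimated from a random sample instead of the
    full column.
    """
    if not values:
        return []
    if initial_bin_sample_size is not None and initial_bin_sample_size < len(values):
        rng = random.Random(seed)
        # Sample without replacement so only a subset has to be sorted.
        values = rng.sample(values, initial_bin_sample_size)
    ordered = sorted(values)
    n = len(ordered)
    # Take the value at each interior quantile position as a cut point.
    cuts = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return sorted(set(cuts))


def assign_bins(values, cuts):
    """Map every value of the full column to a bin index using the cuts."""
    return [bisect_right(cuts, v) for v in values]


if __name__ == "__main__":
    rng = random.Random(42)
    column = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
    full_cuts = quantile_cuts(column, n_bins=16)
    sampled_cuts = quantile_cuts(column, n_bins=16, initial_bin_sample_size=5_000)
    bins = assign_bins(column, sampled_cuts)
    print(len(full_cuts), len(sampled_cuts), max(bins))
```

In this sketch only the sort that finds the cuts runs on the sample; every row is still assigned to a bin afterwards. The trade-off is that cuts estimated from a sample are approximate, so bin populations may be less even than with cuts computed from all of the data.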