microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

[RFC] [R-package] Replace "info" interface in lgb.Dataset with keyword arguments #4543

Closed: jameslamb closed this issue 2 years ago

jameslamb commented 3 years ago

Summary

The following changes should be made to lgb.Dataset() in the R package.

"deprecated" = "supported, but raises a warning if used".

In release 3.3.0 (#4310)

In release 4.0.0

Motivation

Description

LightGBM training involves some preprocessing, such as bucketing continuous features into histograms and filtering out unsplittable features. That work is done once, before training begins, when a Dataset object is constructed.
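To make that preprocessing step concrete, here is a rough conceptual sketch in plain Python. It is not LightGBM's actual binning algorithm and it does not use the LightGBM API: the function names (`bin_edges`, `preprocess`), the `max_bins` argument, and the rank-based cut-point rule are all assumptions made up for illustration. It only shows the general idea of bucketing each continuous feature once and dropping features that can never be split (here, constant columns).

```python
from bisect import bisect_right


def bin_edges(values, max_bins):
    """Return sorted cut points for one feature (hypothetical helper).

    A constant feature gets no cut points, which marks it as unsplittable.
    """
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return []
    n_bins = min(max_bins, len(distinct))
    # Pick cut points evenly spaced by rank among the distinct values.
    return [distinct[(i * len(distinct)) // n_bins] for i in range(1, n_bins)]


def preprocess(columns, max_bins):
    """Bucket every feature once and drop features with no possible split."""
    prepared = {}
    for name, values in columns.items():
        edges = bin_edges(values, max_bins)
        if not edges:
            continue  # unsplittable feature: filtered out before training
        # A value's bin index is the number of cut points less than or equal to it.
        bins = [bisect_right(edges, v) for v in values]
        prepared[name] = {"edges": edges, "bins": bins}
    return prepared


# Illustrative data only.
columns = {"age": [23, 35, 35, 61], "constant": [1, 1, 1, 1]}
prepared = preprocess(columns, max_bins=2)
print(prepared)
```

Because the bucketing happens a single time up front, every later training iteration can work with small integer bin indices instead of the raw feature values.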

In addition to the raw data (i.e. features) used, LightGBM Dataset objects can also contain the following:

References

Other Notes

Sorry I didn't write this up sooner. Didn't really think of it until I started working on adding deprecation warnings for uses of ... (e.g. in #4522).

@Laurae2 and I have already talked about this privately, but I would still like to open this as a Request for Comment (RFC) to give everyone who's interested a chance to voice their opinions.

Laurae2 commented 3 years ago

Agree with all the proposed changes. Not only will this make the package easier to maintain, it will also make it easier for users to work with. 👍

jameslamb commented 2 years ago

This work is now complete. See the list of linked pull requests above for details.

Thanks very much @StrikerRUS for thorough reviews of so many PRs!

StrikerRUS commented 2 years ago

@jameslamb Thanks a lot for splitting the work into many small PRs! It was a pleasure to review them.
