Open jameslamb opened 2 weeks ago
@guolinke @shiyu1994 @StrikerRUS @jmoralez @borchero @btrotta please let me know what you think whenever you have time
I'm +1 for dropping support of datatable. Especially given that the so-called "support" is a simple .to_numpy() method call 🙃
Thank you for the ping. Sounds good to me, considering that there are no new commits to the project now.
I'm +1 as well
I am +1
I'm in favor of removing as well ✅
Thank you all for the quick responses! I'll put up a PR adding a deprecation warning.
Summary
Support for h2o's datatable library was added to LightGBM 5.5+ years ago, in #1970.
Proposing here that lightgbm:
- raises a deprecation warning when datatable is used
- removes datatable support 2-3 releases from now
Motivation
That project seems to be abandoned:
... H2OFrame in the h2o-py package
In those 5.5 years since #1970, the only bug reports / feature requests received about datatable support have been from one person working for h2o... and the last of those was 4 years ago.
And in all that time, I don't think we have ever tested against datatable in CI.
Description
Doing this would simplify the Python package, making it easier for others to contribute.
It'd also make it more manageable to add support for newer, more popular input formats like polars (#6204).
See @trivialfis's summary of the current state of supporting data frame libraries at https://github.com/dmlc/xgboost/issues/10554#issuecomment-2211824457 ... I agree with it.
References
I am not proposing here that lightgbm should support H2OFrame ... Dask doesn't, XGBoost doesn't, scikit-learn doesn't... and I think our limited time and attention here would be better spent on more widely-used input formats, like polars.