Open david-cortes opened 2 years ago
I don't think any of the current active maintainers are big R users, so we welcome input. Could we just build a new interface behind a different namespace until it's ready? I don't think there's a need to immediately replace the old interface in a short space of time.
I would like to welcome these changes. The concern about breaking changes can be handled by running reverse dependency checks.
I suggest keeping xgboost() and predict() as they are and instead giving the new functions different names, e.g. xgboost2() and predict2(). Too much code would break if the main functions were changed.
Otherwise, great work @david-cortes.
I notice that there is a version 2.0 of xgboost in the plans, which among other things, is expected to include support for categorical features in the R interface.
Given that this is a major version release and as such is expected to introduce potentially breaking changes, I think this is a good opportunity to make the R interface more in line with base R and core/popular R modeling packages. Many people (including myself) find the R interface of xgboost to be inconvenient and unidiomatic, but changing the interface for xgboost() from its current state would be a rather big breaking change and would probably break lots of user scripts that depend on xgboost().
In short, xgboost() does not work with the most common data types used in R (data.frame) and does not follow R conventions in terms of e.g. function arguments. For people who are familiar with base R and with other R packages, there are many ways in which the R interface of xgboost could be improved for a better end-user experience, such as:
- factor variables as "y".
- type argument.
- weights instead of weight, like base R does.
Among many others.
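For reference, here is a minimal sketch of the base R conventions mentioned above, using glm() from the stats package rather than xgboost. It only illustrates the style being asked for (data.frame input, a factor response, an argument named weights, and a type argument in predict()); it makes no xgboost calls and is not a proposal for the final xgboost signature. The variable names (df, w, fit, p_link, p_resp) are made up for this example.

```r
# Base R only: this shows the conventions, not any xgboost API.
df <- data.frame(
  hp = mtcars$hp,
  wt = mtcars$wt,
  # The response is a factor, not a 0/1 numeric vector.
  am = factor(mtcars$am, levels = c(0, 1), labels = c("automatic", "manual"))
)

# Observation weights (all equal here, purely for illustration).
w <- rep(1, nrow(df))

# data.frame input, factor response, and an argument named `weights`.
fit <- glm(am ~ hp + wt, data = df, family = binomial(), weights = w)

# `type` selects what kind of prediction is returned.
p_link <- predict(fit, newdata = df, type = "link")      # linear predictor
p_resp <- predict(fit, newdata = df, type = "response")  # probabilities
```

With a binomial family, glm() treats the first factor level as failure, so p_resp holds the fitted probability of "manual" for each row.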
Would this project accept big breaking PRs for the R interface (particularly for xgboost() and predict.xgb.Booster()) for the 2.0 release that would make it more similar to base R and other R packages?