mlr-org / mlr3learners

Recommended learners for mlr3
https://mlr3learners.mlr-org.com
GNU Lesser General Public License v3.0

xgboost default nrounds should not be set #261

Open mb706 opened 1 year ago

mb706 commented 1 year ago

I think many users will assume that the "default" values given for a Learner are relatively safe values that lead to OK performance in many cases. However, `nrounds` for xgboost is set to 1 as a "nonsense default", with the recommendation to tune this value (according to the documentation). I think that in this case the value should not be set at all, and training without setting it should be an error (which is the default behaviour of xgboost!). An error should be thrown whenever the user is doing something that is likely not what they intended.

be-marc commented 1 year ago

So your suggestion is to check nrounds first in $.train()? I think that's a good idea. @mllg Didn't we set 1 as the default because the parameter had a required tag? I don't see a required tag anymore, so construction would be possible without setting nrounds.
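A minimal sketch of what such a check could look like (hypothetical, using the private `$.train()` convention of mlr3learners and `stopf()` from mlr3misc; names and message are illustrative, not the actual implementation):

```r
# Sketch only: guard at the top of the learner's private $.train(),
# assuming nrounds no longer carries the nonsense default of 1.
.train = function(task) {
  pv = self$param_set$get_values(tags = "train")
  if (is.null(pv$nrounds)) {
    stopf("Hyperparameter 'nrounds' of '%s' must be set explicitly; there is no sensible default, please tune it.", self$id)
  }
  # ... proceed with the usual call to xgboost::xgb.train() ...
}
```

This keeps construction of the learner possible without `nrounds` (no `required` tag needed) while failing fast at training time.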

mllg commented 1 year ago

You could argue the same for all hyperparameters that are sensitive to tuning but have some default. It is not clear to me where to draw the line.

be-marc commented 1 year ago

> It is not clear to me where to draw the line.

When the package authors provide no default? XGBoost throws an error when nrounds is not set.
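For reference, this error comes from base R's missing-argument handling: `nrounds` has no default in `xboost`'s training functions, so calling them without it fails immediately (sketch under that assumption, not run here):

```r
library(xgboost)

# Tiny toy DMatrix just to reach the argument check.
dtrain = xgb.DMatrix(data = matrix(rnorm(40), nrow = 10), label = rnorm(10))

# nrounds is a required argument with no default, so this call errors
# before any boosting happens:
xgb.train(params = list(objective = "reg:squarederror"), data = dtrain)
```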

mb706 commented 1 year ago

If a hyperparameter (with reasonably strong impact on performance) does not have a default value in the algorithm / function being wrapped then I would lean towards not setting it in the Learner. If there is no default but we know that some value leads to ok performance most of the time, then we could also set it to that value; I would not insist on this either way here.

The principle I am going by here is: If something is happening that the user likely does not want to happen, then an error should be thrown. We write in our documentation that the given value is "nonsense" and should be tuned, so we seem to be confident that the user does not want to train xgboost with nrounds = 1.

I would usually ignore it when an algorithm has nonsense defaults itself, in the same way that we usually do not put bugfixes for other packages in our Learner code. I am okay with overriding hyperparameters that have little performance impact, and maybe things that we have explicit mlr3 interfaces for (e.g. number of threads, even though it does have an impact on runtime).