mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

Featureless with predict_type = "prob" should predict outcome mean #911

Closed ck37 closed 1 year ago

ck37 commented 1 year ago

Hello,

The classif.featureless learner predicts the most common class when predict_type = "prob", but it should predict the outcome mean (i.e. the prevalence, which is what a logistic regression with only an intercept term would return). That would give the right results for net benefit analysis, for example. As it stands, if 0 is the most common binary label, the learner says every observation has a 0% probability of being 1, which is miscalibrated compared to the outcome mean.
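To make the difference concrete, here is a minimal sketch in plain Python. It is not mlr3 code: the function names and the toy labels are hypothetical, chosen only to contrast putting all probability mass on the most common class with predicting the class frequencies. The Brier score is used as one simple way to show the calibration gap.

```python
from collections import Counter


def mode_probs(labels):
    """All probability mass on the most common class (the behavior reported here)."""
    classes = sorted(set(labels))
    majority = Counter(labels).most_common(1)[0][0]
    return {c: 1.0 if c == majority else 0.0 for c in classes}


def prevalence_probs(labels):
    """Relative class frequencies, i.e. the outcome mean for a 0/1 label."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in sorted(counts)}


def brier(prob_of_one, labels):
    """Mean squared error between a constant predicted P(y = 1) and the 0/1 outcomes."""
    return sum((prob_of_one - y) ** 2 for y in labels) / len(labels)


# Hypothetical training labels: 7 negatives and 3 positives.
y = [0] * 7 + [1] * 3

p_mode = mode_probs(y)        # {0: 1.0, 1: 0.0}
p_prev = prevalence_probs(y)  # {0: 0.7, 1: 0.3}

# Lower is better; the prevalence-based prediction scores about 0.21 versus 0.3.
brier_mode = brier(p_mode[1], y)
brier_prev = brier(p_prev[1], y)
print(p_mode, p_prev, brier_mode, brier_prev)
```

With these toy labels, the mode-based prediction assigns probability 0 to class 1 even though 30% of the observations are positive, while the prevalence-based prediction matches the observed rate.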

The relevant lines are: https://github.com/mlr-org/mlr3/blob/main/R/LearnerClassifFeatureless.R#L78-L89

I can do a PR if it would help.

Thanks, Chris

mllg commented 1 year ago

Thanks for opening the issue.

I've changed the predicted probabilities for the default method "mode" accordingly and also improved the docs. If I've missed something, please reopen.