RaphaelS1 closed this issue 11 months ago
Isn't this a duplicate of https://github.com/mlr-org/mlr3proba/issues/34?
Possibly. Unsure. It isn't yet clear whether we would need to deliberately duplicate measures as they may require a particular prediction type.
Ah, I see - the prediction type, or even the method by which the distribution is queried, may be different.
I see how the current design may necessitate separating "the log-loss for density estimation" from "the log-loss for probabilistic supervised regression".
From an architectural modelling standpoint, however, this seems very counterintuitive and strange, as the mathematical objects (the loss functions) are the same - and losses defined for the one case automatically apply to the other.
Any good ideas how this could be resolved without major surgery?
I actually don't think it's a problem. If you look at how measures are named, e.g. surv.graf, classif.mmce; then it seems sensible to have surv.logloss, density.logloss, regr.logloss (one can always inherit from the other - efficient although maybe slightly messy).
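To make the "inherit from the other" idea concrete, here is a rough, language-agnostic sketch in Python. It is not mlr3proba code: the class names, the `id` and `predict_type` fields, and the `score` method are all hypothetical stand-ins. It assumes both measures receive the predicted density evaluated at the observed values.

```python
import math
from statistics import NormalDist


def log_loss(pdf_values):
    """Mean negative log of the predicted density at the observed values."""
    return -sum(math.log(p) for p in pdf_values) / len(pdf_values)


class DensityLogLoss:
    """Hypothetical stand-in for a density.logloss measure."""
    id = "density.logloss"
    predict_type = "pdf"  # assumed label for the required prediction type

    def score(self, pdf_values):
        return log_loss(pdf_values)


class RegrLogLoss(DensityLogLoss):
    """Hypothetical stand-in for regr.logloss: same loss, different task label."""
    id = "regr.logloss"


# Toy data: a standard normal predictive density evaluated at three observations.
dist = NormalDist(mu=0.0, sigma=1.0)
observed = [-0.5, 0.0, 1.2]
pdf_values = [dist.pdf(y) for y in observed]

density_score = DensityLogLoss().score(pdf_values)
regr_score = RegrLogLoss().score(pdf_values)
```

Both classes return the same score for the same density values. In this sketch only the identifier differs, and the computation lives in one shared function.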
The conceptual pain that this causes me is that classif.bla and regr.blubb usually represent different mathematical objects, whereas density.logloss and regr.logloss would represent the same mathematical object - the log-loss for absolutely continuous distributions over the reals.
The difference is entirely localized in "accessory functionality" in application to the outputs of the estimation strategy, not in the object that it represents - which disturbs me, since it seems to violate an uncodified "same mathematical object, same class" principle.
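For reference, the "same mathematical object" claim can be written out as follows. The notation ($\hat{p}$, $x_i$, $y_i$, $n$) is assumed here for illustration and is not taken from the package.

```latex
% Log-loss of a predicted density \hat{p} at an observation y
L(\hat{p}, y) = -\log \hat{p}(y)

% Density estimation: one estimated density, evaluated at test points y_1, ..., y_n
\frac{1}{n} \sum_{i=1}^{n} L(\hat{p}, y_i) = -\frac{1}{n} \sum_{i=1}^{n} \log \hat{p}(y_i)

% Probabilistic regression: one predicted conditional density per feature vector x_i
\frac{1}{n} \sum_{i=1}^{n} L\bigl(\hat{p}(\cdot \mid x_i), y_i\bigr) = -\frac{1}{n} \sum_{i=1}^{n} \log \hat{p}(y_i \mid x_i)
```

In both cases the functional $L$ is identical. Only the way the predicted density is obtained (unconditional versus conditional on $x_i$) differs.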
There's no reason why someone can't use regr.logloss on PredictionDensity as long as they both have the same predict type. But this will just be confusing to the user.