DilumAluthge opened 3 years ago
Here is our implementation of the RMSE:
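(The RMSE code itself did not survive here. As a rough sketch of the formula only — in Python, not the actual SossMLJ.jl Julia implementation:)

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - ŷ)^2))."""
    assert len(y_true) == len(y_pred), "inputs must be the same length"
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```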
@cscherrer Any thoughts on a good loss function for the multinomial classification problem? Some options include:
Any other options?
Either of those would be good, or an asymmetric loss could be interesting. I'd think this must come up a lot in medical applications, right?
Yeah, in binary classification problems (e.g. mortality prediction), we often want to use a loss function that penalizes underprediction more heavily than overprediction.
I think for the multinomial example, we can just use something simple and symmetric. Later, we can add a binary classification problem with class imbalance, and then think about an asymmetric loss function for that problem.
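(To illustrate the kind of asymmetric loss being discussed — this is a generic sketch, not anything from the package, and the weights are made up:)

```python
def asymmetric_loss(y_true, y_pred, under_weight=2.0, over_weight=1.0):
    """Penalize underprediction (y_pred < y_true) more heavily than
    overprediction. The weights here are illustrative defaults only."""
    err = y_true - y_pred
    # err > 0 means we underpredicted; apply the heavier weight.
    return under_weight * err if err > 0 else over_weight * (-err)
```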
Let's go with the Brier score. For consistency with MLJ, we should implement it the same way they do (https://github.com/alan-turing-institute/MLJBase.jl/blob/5e5d1cda3b555510df1de4b125a5e320c11f6256/src/measures/finite.jl#L103-L131):
""" BrierScore(; distribution=UnivariateFinite)(ŷ, y [, w]) Given an abstract vector of distributions
ŷ
of typedistribution
, and an abstract vector of true observationsy
, return the corresponding Brier (aka quadratic) scores. Weight the scores usingw
if provided. Currently onlydistribution=UnivariateFinite
is supported, which is applicable to superivised models withFinite
target scitype. In this case, ifp(y)
is the predicted probability for a single observationy
, andC
all possible classes, then the corresponding Brier score for that observation is given by2p(y) - \\left(\\sum_{η ∈ C} p(η)^2\\right) - 1
Note thatBrierScore()=BrierScore{UnivariateFinite}
has the aliasbrier_score
. Warning. HereBrierScore
is a "score" in the sense that bigger is better (with0
optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. Moreover, the present implementation does not treat the binary case as special, so that the score may differ, in that case, by a factor of two from usage elsewhere. For more information, runinfo(BrierScore)
. """
I think this is blocked by https://github.com/cscherrer/SossMLJ.jl/issues/93
Once https://github.com/cscherrer/SossMLJ.jl/issues/93 is solved, I can get the prediction for μ in the form of particles. Once I have the particles for μ, I can put them directly into the formula for the Brier score.
Sounds good
We currently have an example of a loss function for regression models. Specifically, we implement the root mean squared error.
However, we don't currently have an example of a loss function for classification models.
We need to: