Closed nikosbosse closed 8 months ago
It seems fairly clear that forecast_binary, forecast_ordinal and forecast_nominal make sense grouped together under the categorical banner (but as yet there are no meta classes, so we don't need forecast_categorical?)
Merging binary and nominal forecasts in one class with one set of scoring rules seems at least possible. I tend to think it's not necessarily desirable though.
Agree we should keep these apart even if under the hood they share infra. Merging them would be confusing for users.
ordered categorical predictions
Aside from different scoring this could be done with the suggested variable structure of forecast_multiclass, right? But you would need an additional ordering variable?
ok based on your comments (thanks!) here and on #608, I see the following:
There is a hierarchy:
"multiclass forecasts" seem to be the same as "nominal forecasts" to me, is that right? Maybe there is a small difference in the sense that nominal forecasts comprise binary forecasts, but multiclass forecasts don't comprise binary forecasts and are instead on the same level?
Since what we want at the moment is scoring nominal/multiclass forecasts, I think we should name the class either forecast_nominal or forecast_multiclass. We could then add a forecast_ordinal in the future if we ever wanted to score forecasts for ordered categories (which we could then represent by ordered factors - we should make it clear that for now, we're expecting unordered factor levels).
As discussed in #608, there will be 3 input columns:
predicted: numeric
observed: factor
to_be_named: factor, denoting the category for which a prediction was made.
I started to like predicted_label as a name. Having predicted_ here makes the relation to the prediction clear. label I think works well both with "label for a class or category" as well as "label of the factor level for which a prediction was made". Alternative ideas:
predicted_class
predicted_category
predicted_outcome
Again pinging @sbfnk and @nickreich in case you want to weigh in
Can multiclass forecasts include ordinal classification? If yes then I don't think it makes sense to use but it could be used again as a metaclass for nominal and ordinal?
I think I would marginally prefer forecast_nominal but I don't mind forecast_multiclass if others strongly prefer.
I started to like predicted_label as a name. Having predicted_ here makes the relation to the prediction clear. label I think works well both with "label for a class or category" as well as "label of the factor level for which a prediction was made".
Seems like a good choice.
(ChatGPT, given "can multiclass forecasts include ordinal forecasts", thinks ordinal forecasts are a subclass of multiclass forecasts)
I also like forecast_nominal.
ok then I think we'll call the class forecast_nominal and the additional column predicted_label, so the input format would be
predicted: numeric
observed: factor
predicted_label: factor, denoting the category for which a prediction was made.
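To make the three-column format concrete, here is a rough sketch of what one forecast could look like in long format, together with a basic sanity check. This is only an illustration in Python, not scoringutils code: the forecast_id column, the level names and the check_nominal helper are all assumptions made up for this example.

```python
import math
from collections import defaultdict

# Hypothetical unordered factor levels (not taken from the thread).
LEVELS = ("low", "medium", "high")

# One forecast in long format: one row per possible category.
# `forecast_id` is an assumed column identifying which rows belong together.
rows = [
    {"forecast_id": 1, "observed": "high", "predicted_label": "low", "predicted": 0.2},
    {"forecast_id": 1, "observed": "high", "predicted_label": "medium", "predicted": 0.3},
    {"forecast_id": 1, "observed": "high", "predicted_label": "high", "predicted": 0.5},
]


def check_nominal(rows, levels):
    """Return a list of problems; an empty list means the rows look valid."""
    problems = []
    by_forecast = defaultdict(list)
    for row in rows:
        by_forecast[row["forecast_id"]].append(row)

    for fid, group in by_forecast.items():
        # Every level needs exactly one predicted probability.
        labels = sorted(r["predicted_label"] for r in group)
        if labels != sorted(levels):
            problems.append(f"forecast {fid}: need exactly one row per level")
        # The observed value must be constant within a forecast and a known level.
        observed = {r["observed"] for r in group}
        if len(observed) != 1 or not observed <= set(levels):
            problems.append(f"forecast {fid}: observed must be a single known level")
        # Predicted probabilities must sum to one.
        total = sum(r["predicted"] for r in group)
        if not math.isclose(total, 1.0, abs_tol=1e-8):
            problems.append(f"forecast {fid}: probabilities sum to {total}")
    return problems


problems = check_nominal(rows, LEVELS)
```

With the rows above, problems comes back empty. Dropping one of the rows would trigger both the "one row per level" and the "sum to one" checks.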
The Hubverse elicits categorical forecasts. To integrate scoringutils with their tools, we should have a dedicated class for that and scoring rules for evaluating categorical forecasts. What should this class be named?
In the context of the following overview, we're interested in "soft multiclass prediction". The predicted value is a probability and the outcome is a factor.
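As a small illustration of scoring a probability forecast against a factor outcome, here is a sketch of the log score, i.e. the negative log of the probability assigned to the observed category. The log score is just one common choice picked for this example, the thread does not settle which scoring rules the class will use, and the function and level names are made up.

```python
import math


def log_score(probabilities, observed):
    """Negative log of the probability assigned to the observed category.

    `probabilities` maps each category to its predicted probability.
    Lower is better; a probability of zero on the outcome gives infinity.
    """
    p = probabilities[observed]
    return -math.log(p) if p > 0 else math.inf


# Hypothetical forecast over three unordered categories.
forecast = {"low": 0.2, "medium": 0.3, "high": 0.5}
score = log_score(forecast, "high")
```

Here the observed category received probability 0.5, so the score is log(2). Only the probability on the observed category matters, which is why no ordering of the categories is needed.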
This website makes the following distinction between ordered and unordered categories:
Here are some suggestions for the name of the class:
Further thoughts: