epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/

Define input format for categorical forecasts #608

Open nikosbosse opened 7 months ago

nikosbosse commented 7 months ago

As mentioned in #604, we would like to be able to score categorical/multiclass forecasts (or whatever the name should be, see #607). What should the expected format be?

Data.frame

image

Are we fine with the following? Columns predicted, observed, and predicted_class, with both observed and predicted_class being factors with the same levels.
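To make the proposal concrete, here is a minimal sketch of what a check on this long format could look like. It is written in plain Python for illustration only and is not scoringutils code. The forecast_id column, the level names, and the requirement that probabilities sum to 1 per forecast are assumptions added for the example; only predicted, observed, and predicted_class come from the proposal.

```python
# Hypothetical long-format input: one row per (forecast, possible category).
# LEVELS stands in for the factor levels shared by observed and predicted_class.
LEVELS = ["low", "medium", "high"]

rows = [
    {"forecast_id": 1, "observed": "medium", "predicted_class": "low", "predicted": 0.2},
    {"forecast_id": 1, "observed": "medium", "predicted_class": "medium", "predicted": 0.5},
    {"forecast_id": 1, "observed": "medium", "predicted_class": "high", "predicted": 0.3},
]


def check_categorical_rows(rows, levels, tol=1e-8):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    by_forecast = {}
    for i, row in enumerate(rows):
        # Both categorical columns must use the same set of levels.
        if row["observed"] not in levels:
            problems.append(f"row {i}: unknown observed level {row['observed']!r}")
        if row["predicted_class"] not in levels:
            problems.append(f"row {i}: unknown predicted_class {row['predicted_class']!r}")
        by_forecast.setdefault(row["forecast_id"], []).append(row)

    for forecast_id, group in by_forecast.items():
        # Assumption: each forecast has exactly one row per level.
        classes = sorted(r["predicted_class"] for r in group)
        if classes != sorted(levels):
            problems.append(f"forecast {forecast_id}: expected one row per level")
        # Assumption: predicted probabilities sum to 1 within a forecast.
        total = sum(r["predicted"] for r in group)
        if abs(total - 1.0) > tol:
            problems.append(f"forecast {forecast_id}: probabilities sum to {total}")
    return problems


problems = check_categorical_rows(rows, LEVELS)
```

An empty problems list means the example rows satisfy all of the checks above.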

Alternative names instead of predicted_class:

Vector format

image

seabbs commented 7 months ago

All seems reasonable. I have only come across these discussed as categorical forecasts but I don't have a strong feeling that that is the right way to go.

seabbs commented 7 months ago

Had a chat with ChatGPT. The point that categorical is a good overall name, while multiclass is useful for distinguishing non-binary from binary outcomes, seems like a good one.

Prompt

Multiclass forecasts and categorical forecasts are related concepts but not identical.

Answer

  1. Multiclass Forecasts: These refer to prediction tasks where the outcome can belong to one of multiple classes. The key point is that each instance (or observation) is classified into exactly one out of several possible categories. Common in machine learning, examples include image classification where each image is categorized into one class (like 'cat', 'dog', 'bird').

  2. Categorical Forecasts: This term is more general. It encompasses any forecasting where outcomes are categories rather than numerical values. This includes multiclass situations but can also refer to binary classification (where there are only two categories) or even scenarios with unordered categories (like predicting types of cuisine).

As for which is more commonly used, it depends on the context:

The choice between them typically depends on the specific requirements of the task at hand. In some scenarios, the distinction might not be significant, while in others, the specific nature of the categories can greatly influence the modeling approach and evaluation metrics used.

nikosbosse commented 7 months ago

ok sounds good. Then I suggest the following.

n: number of observations, N: number of possible categories of the outcome

The data.frame input format will be

The vector/matrix format will be

I also suggest moving the naming of somename to #607.

Pinging @nickreich and @sbfnk in case you want to weigh in

nikosbosse commented 7 months ago

@nickreich just raised a good point: Do we want to enforce N rows for every forecast? Say you're predicting who wins the US presidency. You have 30 candidates, but you only assign a probability > 0 to 6 of them. Do you then have to have 24 rows with zeros?

I can see several options:

Noting that in the vector/matrix paradigm we have some kind of implicit enforcement anyway: the prediction matrix has to be rectangular. (Though in the above example, you'd end up with an n x 6 matrix even though your factor had 30 levels, and the function would then have to decide whether to take its N from the number of factor levels or from the dimensions of the prediction matrix.)
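One way to avoid that ambiguity would be to always take N from the factor levels and reject a matrix whose width disagrees. The sketch below shows this in plain Python; resolve_n_categories is a hypothetical name, not an existing scoringutils function, and the strict behaviour is only one of the possible choices.

```python
def resolve_n_categories(levels, prediction_matrix):
    """Return N, taken from the levels, after checking the matrix agrees with it.

    prediction_matrix is a list of rows, one row per observation.
    """
    widths = {len(row) for row in prediction_matrix}
    if len(widths) != 1:
        raise ValueError("prediction matrix must be rectangular")
    n_cols = widths.pop()
    n_levels = len(levels)
    if n_cols != n_levels:
        # Strict option: the matrix must have one column per level.
        raise ValueError(
            f"matrix has {n_cols} columns but the outcome has {n_levels} levels"
        )
    return n_levels


levels = ["low", "medium", "high"]
predictions = [
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
]
n_categories = resolve_n_categories(levels, predictions)
```

With this choice, an n x 6 matrix paired with a 30-level factor would raise an error instead of being silently interpreted.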

I'm personally leaning slightly towards strict enforcement + helper function to get there from a more liberal format that omits rows with a predicted probability of 0. What do others think? Also pinging @elray1 in case you have thoughts
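A rough sketch of such a helper, in plain Python rather than R and purely for illustration: complete_forecast is a hypothetical name, and the candidate labels and probabilities are made up to mirror the 30-candidate example.

```python
def complete_forecast(sparse, levels):
    """Expand a sparse {category: probability} mapping to one entry per level.

    Categories missing from the sparse input get an explicit probability of 0.
    """
    unknown = set(sparse) - set(levels)
    if unknown:
        raise ValueError(f"categories not among the levels: {sorted(unknown)}")
    return [(level, sparse.get(level, 0.0)) for level in levels]


# 30 possible candidates, only 6 of which receive a probability > 0.
candidates = [f"candidate_{i}" for i in range(1, 31)]
sparse = {
    "candidate_1": 0.4,
    "candidate_2": 0.3,
    "candidate_3": 0.1,
    "candidate_4": 0.1,
    "candidate_5": 0.05,
    "candidate_6": 0.05,
}
completed = complete_forecast(sparse, candidates)
```

The completed forecast has one entry per candidate, so the strict format can be enforced downstream while users still supply only the non-zero rows.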

seabbs commented 7 months ago

I'm personally leaning slightly towards strict enforcement + helper function to get there from a more liberal format that omits rows with a predicted probability of 0.

Yes, I think this makes sense. We could potentially run this for people within the as_forecast method, but maybe not if it's overly complicated.