Future of `merge_pred_and_obs()`

nikosbosse commented 5 months ago

This is currently an exported function. I'm somewhat on the fence: On the one hand, functionality like this is useful and the function does some clever checking (or rather it felt like clever checking when I coded this monster...). On the other hand, I haven't used this function much so I don't really know how robust it is.

I think that it is or was used as part of some of the forecast hub stuff, as they had forecasts and predictions separated.

I see the following options going forward:

we drop the function to reduce the maintenance burden and instead move the code to a "data wrangling" vignette + possibly reintroduce it later.
we keep the function and a. give it a new, not-terrible name b. add more tests to check it thoroughly c. update its documentation. (At the moment the docs are a bit like "here's this experimental function, maybe you find it useful, yolo")

Re 1. there is probably a broader question related to all other data-wrangling functions (e.g. converting from one format to the other). These functions, for example, are missing from the current version of the manuscript and we haven't fully thought about which functions to have and what the workflows should look like. (I think at some point we definitely want to have functions like this one, the question is more do we want to commit to it now).

sbfnk commented 5 months ago

we drop the function to reduce the maintenance burden and instead move the code to a "data wrangling" vignette + possibly reintroduce it later.

personally I think if we explain well what format we expect we can leave the wrangling to other packages - I would always just use dplyr or data.table functionality for this and not rely on your magical code.

That said, do we actually explain the expected data format somewhere? The documentation seems somewhat circular? https://github.com/epiforecasts/scoringutils/blob/1a51fa8c173559d07a443650f8350d73cfe48f04/R/forecast.R#L23

nikosbosse commented 5 months ago

That said, do we actually explain the expected data format somewhere? The documentation seems somewhat circular?

There is a newer version of the docs waiting for review in #747 (your unique chance to get a sneak preview in return for reviewing only 120 changed files! 🚀 🙀). I changed that to "the details section of as_forecast()" - the reason it's circular is that we reuse that @param description in other functions. But the new version has the table from the paper and is really neat when you call ?as_forecast()!

seabbs commented 5 months ago

I would always just use dplyr or data.table functionality for this and not rely on your magical code.

Yes I agree but also note that we aren't really the target users here. That being said i am minded to kill it.

epiforecasts / scoringutils

Future of `merge_pred_and_obs()` #754