Open AFg6K7h4fhy2 opened 1 month ago
(Hubverse submission dataframe → ScoringUtils ready dataframe) not deemed a priority?
Noting you can do this via HubEval if you want.
Hadn't seen; thank you, Sam.
Had been thinking mostly in terms of something like the following:
import polars as pl

# Hubverse-style submission rows; None marks fields absent for a given row
data = {
    "location": ["DE", "DE", "AL", "AL"],
    "forecast_date": ["2021-01-01", "2021-01-01", "2021-07-12", "2021-07-12"],
    "target_end_date": ["2021-01-02", "2021-01-02", "2021-07-24", "2021-07-24"],
    "target_type": ["Cases", "Deaths", "Deaths", "Deaths"],
    "model": [None, None, "epiforecasts-EpiNow2", "epiforecasts-EpiNow2"],
    "horizon": [None, None, 2, 2],
    "quantile_level": [None, None, 0.975, 0.990],
    "predicted": [None, None, 611, 719],
    "observed": [127300, 4534, 78, 78],
}

# Convert to a pl.DataFrame and write out for scoring in R
df = pl.DataFrame(data)
df.write_parquet("forecasts_to_score.parquet")
Then in R, something akin to
library(arrow)
library(scoringutils)

df <- read_parquet("forecasts_to_score.parquet")
forecast_quantile <- df |>
  as_forecast_quantile(
    forecast_unit = c(
      <insert col names>
    )
  )
Would appreciate an examination of this workflow by @SamuelBrand1 @dylanmorris.
There are likely still ScoringUtils 2.0 considerations that need to be accounted for in this PR.
Also, this PR partially depends on the utilities featured in #34.
This depends on #30 and #28.
The scope of this PR includes converting a forecast idata with a time representation to a ScoringUtils-digestible parquet file.