In any case, since you are scoring both the ensemble and the new model from scratch, presumably this doesn't affect the conclusions in the paper about relative comparisons.
It looks like there are 63 anomalies for incident cases in the forecast period. Maybe we could remove these with some code like the below? I'm not sure where this is best placed in the code, but I was looking at the merge_forecasts_with_truth() function.
# Remove anomalies flagged by the hub
library("data.table")
anomalies <- fread("https://raw.githubusercontent.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/main/data-truth/anomalies/anomalies.csv")
anomalies <- anomalies[target_variable == "inc case"]
# anti-join: keep only rows whose location and target week are not flagged
forecasts_with_truth <- forecasts_with_truth[!anomalies,
  on = c("location", "location_name", "target_end_date")
]
Thanks, this is a good point. I couldn't work out what current practice on this was.
It doesn't, though I imagine it may drive some of the extreme forecast differences (likely, given the error model). I would be keen to update to account for the forecast anomalies and to add a comment in the methods and discussion raising this point.
I think perhaps it should have its own function and also its own data extraction to get the anomalies from the hub.
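For example, something like the below (the function and argument names here are just placeholders, not anything that exists in the repo yet):

get_anomalies <- function(target = "inc case") {
  url <- paste0(
    "https://raw.githubusercontent.com/covid19-forecast-hub-europe/",
    "covid19-forecast-hub-europe/main/data-truth/anomalies/anomalies.csv"
  )
  anomalies <- data.table::fread(url)
  anomalies[target_variable == target]
}

drop_anomalies <- function(forecasts_with_truth, anomalies) {
  # anti-join: keep only rows not flagged as anomalous
  forecasts_with_truth[!anomalies,
    on = c("location", "location_name", "target_end_date")
  ]
}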
To clarify: are you just removing them when they occur in the forecast horizon, or also removing forecasts for which anomalies occur on, for example, the day of the forecast?
This appears to be a pretty big chunk of the available forecasts for some places (e.g. Lithuania):
12: 2022-01-22 LT Lithuania large data revision
13: 2022-01-29 LT Lithuania large data revision
14: 2022-02-05 LT Lithuania large data revision
15: 2022-02-12 LT Lithuania large data revision
16: 2022-02-19 LT Lithuania large data revision
17: 2022-02-26 LT Lithuania large data revision
18: 2022-03-05 LT Lithuania large data revision
19: 2022-03-12 LT Lithuania large data revision
20: 2022-03-19 LT Lithuania large data revision
21: 2022-03-26 LT Lithuania large data revision
22: 2022-04-02 LT Lithuania large data revision
23: 2022-04-09 LT Lithuania large data revision
24: 2022-04-16 LT Lithuania large data revision
25: 2022-04-23 LT Lithuania large data revision
26: 2022-04-30 LT Lithuania large data revision
27: 2022-05-07 LT Lithuania large data revision
28: 2022-05-14 LT Lithuania large data revision
29: 2022-05-21 LT Lithuania large data revision
30: 2022-05-28 LT Lithuania large data revision
31: 2022-06-04 LT Lithuania large data revision
32: 2022-06-11 LT Lithuania large data revision
33: 2022-06-18 LT Lithuania large data revision
34: 2022-06-25 LT Lithuania large data revision
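For a quick sense of how concentrated these are across locations, something like the below (using the anomalies table loaded in the snippet above) counts the flagged weeks per location:

# flagged weeks per location, most affected first
anomalies[, .N, by = .(location, location_name)][order(-N)]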
I've updated to drop anomalies from scoring (and only scoring so far) here: https://github.com/epiforecasts/simplified-forecaster-evaluation/commit/623c8994ed19fc2edb1c0da9fa002c4a35e90e01
Will read through the paper and check the interpretation of the results stands. Work to do to close this is:
To clarify: are you just removing them when they occur in the forecast horizon, or also removing forecasts for which anomalies occur on, for example, the day of the forecast?
Both - we're removing any truth data which has anomalies and any forecasts made when there was an anomaly https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe/blob/39823425e9ea5d66c3dc0e7a55fa7ba5433d7df2/code/evaluation/load_and_score_models.r#L30
I agree that it would be worth adding this to the documentation.
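For concreteness, a sketch of the second part (dropping forecasts made when there was an anomaly), on top of the truth-data filter shown earlier; the join keys and the Monday-submission assumption are mine rather than the hub's exact logic:

library("data.table")
# assumes `anomalies` (filtered to "inc case") and `forecasts_with_truth`
# as in the earlier snippet

# the last data week is assumed to end two days before the forecast date
# (i.e. Monday submissions against data through the preceding Saturday)
forecasts_with_truth[, last_data_date := forecast_date - 2]

# anti-join: drop forecasts whose last data week was flagged as anomalous
forecasts_with_truth <- forecasts_with_truth[!anomalies,
  on = c("location", last_data_date = "target_end_date")
]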
This appears to be a pretty big chunk of the available forecasts for some places (e.g. Lithuania)
Yes, because the whole data set was revised on 30 June. For example:
library("readr")
library("dplyr")
library("ggplot2")
source <- "JHU"
x <- "Cases"
owner <- "epiforecasts"
repo <- "covid19-forecast-hub-europe"
path <- paste(
"data-truth", source,
paste0("truth_", source, "-Incident ", x, ".csv"), sep = "/"
)
shas <- list(
revision = "a0851dc3ca5fbc631207afe03d24c694c8e51461",
original = "76375ce9c868e0231a2db1dfcb55f29c51050888"
)
data <- lapply(shas, function(sha) {
readr::read_csv(
URLencode(
paste(
"https://raw.githubusercontent.com", owner, repo,
sha, path, sep = "/"
)
),
show_col_types = FALSE
)
}) |>
bind_rows(.id = "status") |>
filter(location == "LT") |>
mutate(week = lubridate::floor_date(date, "week", 7)) |>
group_by(week, status) |>
summarise(value = sum(value), .groups = "drop") |>
ungroup() |>
filter(week < max(week))
ggplot(data, aes(x = week, y = value, colour = status)) +
geom_line() +
scale_colour_brewer("", palette = "Set1") +
xlab("Date") + ylab("Cases") +
theme_bw() +
scale_y_log10()
You could avoid some of this issue by using truth data and anomalies from close to the end date of the study period.
Both - we're removing any truth data which has anomalies and any forecasts made when there was an anomaly
You could avoid some of this issue by using truth data and anomalies from close to the end date of the study period.
As the study cut-off is the 19th of July, I am in effect already doing this. I've locked the data used to be from the 1st of September. On a related note, is there anything in the literature you've seen about how to deal with revised epi data when evaluating models? Perhaps we should write a short note with the European hub forecast data as an example (could be a good collab project).
Related commit locking the data extraction (note: not the forecasts, as these are assumed to be fixed at the date of submission): https://github.com/epiforecasts/simplified-forecaster-evaluation/commit/dcc8319b52732cd921101631b8ecc20d296f619d
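As an illustration (not what the linked commit does), one way to pin the anomalies file, and likewise the truth data, to its state on the lock date is to look up the last commit to the file before that date via the GitHub API; the lock date of 2022-09-01 is an assumption here:

library("jsonlite")
library("data.table")

owner <- "covid19-forecast-hub-europe"
repo <- "covid19-forecast-hub-europe"
file_path <- "data-truth/anomalies/anomalies.csv"
lock_date <- "2022-09-01"  # assumed lock date ("1st of September")

# last commit touching the file before the lock date (API returns newest first)
commits <- fromJSON(paste0(
  "https://api.github.com/repos/", owner, "/", repo, "/commits",
  "?path=", file_path, "&until=", lock_date, "T00:00:00Z&per_page=1"
))
sha <- commits$sha[1]

# read the file as it was at that commit
anomalies <- fread(paste(
  "https://raw.githubusercontent.com", owner, repo, sha, file_path, sep = "/"
))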
As the study cut-off is the 19th of July, I am in effect already doing this. I've locked the data used to be from the 1st of September.
Just to be clear: in that case it would make sense to also use the anomalies file from 1 September (though not sure it makes much of a difference).
On a related note, is there anything in the literature you've seen about how to deal with revised epi data when evaluating models? Perhaps we should write a short note with the European hub forecast data as an example (could be a good collab project).
No, and I agree that it may be a bit of a gap worth addressing.
Just to be clear: in that case it would make sense to also use the anomalies file from 1 September (though not sure it makes much of a difference).
This is what I am doing. Everything from Sept 1st vs forecasts, which are assumed to be fixed by the date of submission.
I am seeing something like 7.5% of forecasts being excluded and 10% of forecast dates by location having some kind of exclusion (i.e. for at least one horizon). Does that sound about right?
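For reference, a rough sketch of how those two proportions could be computed, assuming a hypothetical logical column `excluded` marking the rows dropped by the anomaly filter:

library("data.table")

# share of individual forecast rows excluded
pct_rows <- forecasts_with_truth[, 100 * mean(excluded)]

# share of location / forecast-date pairs with at least one excluded horizon
pct_dates <- forecasts_with_truth[
  , .(any_excluded = any(excluded)), by = .(location, forecast_date)
][, 100 * mean(any_excluded)]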
https://github.com/epiforecasts/simplified-forecaster-evaluation/commit/a8a45f33ff679d984281196a6c3dda41b1c87bdd adds a discussion of anomalous observations + mention in limitations and further work.
Note: looking at this more, I think there is a very small bug in the anomaly-handling code. Some locations have forecasts made on different dates, so trying to work out the last forecast week by taking away two days doesn't work for all of them.
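One possible fix, assuming the hub's weekly target_end_dates fall on Saturdays, would be to map each forecast date to the most recent Saturday on or before it rather than subtracting a fixed two days:

library("lubridate")

# most recent Saturday on or before each forecast date
last_data_saturday <- function(forecast_date) {
  # with week_start = 7, Sunday = 1 ... Saturday = 7,
  # so wday %% 7 is the number of days since the last Saturday
  forecast_date - (wday(forecast_date, week_start = 7) %% 7)
}

last_data_saturday(as.Date(c("2022-01-24", "2022-01-26")))
# a Monday and a Wednesday both map to Saturday "2022-01-22"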
This isn't the case - we permanently remove targets from evaluation if they see any data revision at any time that moves the value by >5%. These are (poorly) documented here (with source code).