plot_predictions: connect to last data point?

epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions

https://epiforecasts.io/scoringutils/

plot_predictions: connect to last data point? #136

Closed sbfnk closed 1 year ago

sbfnk commented 3 years ago

Would it make sense to change plot_predictions so that it connects to the day on which the forecast was made? This is the way it is done in the forecast hub visualisation app. Using plot_predictions, there is a gap between the last data point and the first forecast:

[Screenshot: plot_predictions output]

vs.

[Screenshot: forecast hub visualisation app]

seabbs commented 3 years ago

A slight complexity here is making this work with the current support for plotting without truth data.

https://github.com/epiforecasts/scoringutils/blob/6895b49125c2dba7489c37896ac98bcacdd46e1e/R/plot.R#L766

I think dropping that option (i.e., making truth data required) is the sensible choice.

nikosbosse commented 3 years ago

Yes, that seems like a good idea!

Bisaloo commented 3 years ago

For context, the way the angular app achieves this is by duplicating the latest truth point and converting it to a '0-day forecast', but I'm not convinced this is the simplest solution.
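A minimal sketch of the "duplicate the latest truth point" idea described above, written in plain Python rather than against the scoringutils or hub code. The function name `add_anchor_point` and the tuple layouts are hypothetical and chosen only for illustration:

```python
from datetime import date


def add_anchor_point(truth, forecast):
    """Prepend the latest truth observation to a forecast as a '0-day forecast'.

    truth:    list of (date, value) tuples (hypothetical layout)
    forecast: list of (date, lower, median, upper) tuples (hypothetical layout)
    """
    if not truth:
        # Nothing to connect to: return the forecast unchanged.
        return list(forecast)
    last_date, last_value = max(truth, key=lambda point: point[0])
    # The anchor has zero uncertainty: lower, median and upper all equal the truth.
    anchor = (last_date, last_value, last_value, last_value)
    return [anchor] + sorted(forecast, key=lambda point: point[0])


truth = [(date(2021, 5, 1), 100), (date(2021, 5, 8), 120)]
forecast = [(date(2021, 5, 15), 90, 130, 170)]
connected = add_anchor_point(truth, forecast)
```

With the anchor in place, a line or ribbon drawn through `connected` starts at the last observed value instead of leaving a gap before the first forecast.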

seabbs commented 3 years ago

I think that is the only solution I can think of...

Bisaloo commented 3 years ago

The reason I say it's not the simplest/best solution is that it makes it difficult/impossible to have a plot that includes both the forecasts and the truth data at the same dates (e.g., if you want to visualize after the fact how accurate the forecasts were). We are faced with this problem for the hub's angular app and I don't see an easy way out.

nikosbosse commented 3 years ago

One plot that I created was this one:

I'd say having truth and forecast worked reasonably well. Is that what you had in mind, @Bisaloo?

Bisaloo commented 3 years ago

Yes, it looks good!

Bisaloo commented 2 years ago

Something to keep in mind: in the hub, we had a situation where the truth data arrived late. So we don't have truth data for the last week, and the forecasts should not connect to the latest truth data in this case.

I don't know if it's a problem with your implementation or not.

nikosbosse commented 2 years ago

The code is from here: https://github.com/epiforecasts/europe-covid-forecast/blob/c7da90ae11be572ab80d36b0591aeb43c54eafbf/paper/Analysis.Rmd#L162 and the document I created is here: https://www.crowdforecastr.org/uk-challenge-evaluation. It also seems a bit hacky. My implementation won't really work if the truth data isn't there; maybe it could fail more gracefully by just replacing it with NA?
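One way to read the "replace it with NA" suggestion is sketched below, in plain Python and not taken from the linked Analysis.Rmd. The helper `anchor_or_missing` and the dictionary layout are hypothetical; `None` stands in for R's `NA`:

```python
from datetime import date


def anchor_or_missing(truth_by_date, forecast_date):
    """Build the anchor point for a forecast made on `forecast_date`.

    truth_by_date: dict mapping date -> observed value (hypothetical layout)
    Returns (date, lower, median, upper); all three values are None when the
    truth for that date has not been reported.
    """
    value = truth_by_date.get(forecast_date)  # None if the truth is missing
    return (forecast_date, value, value, value)


truth_by_date = {date(2021, 5, 8): 120}
present = anchor_or_missing(truth_by_date, date(2021, 5, 8))
missing = anchor_or_missing(truth_by_date, date(2021, 5, 15))
```

A missing anchor then carries no value to draw, so whether the plot shows a gap or drops the point depends on how the plotting layer handles missing values.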

Bisaloo commented 2 years ago

I'm currently stuck on the question of missing values. As mentioned, it is desirable to leave a gap when recent data is missing. It's what we do in the hub, as illustrated in the screenshot below.

However, this is only possible in the hub because we know the time step: we have one point for each week. We don't know this in the general case and I cannot think of a really good/robust way to infer it from the data.

nikosbosse commented 2 years ago

There could be an option for whether the data should connect or not and some .... If you want it to connect, then it tries to infer this and, if that fails, asks you to provide a time step.
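A rough sketch of that proposal, in plain Python and under stated assumptions: the names `infer_time_step`, `should_connect` and the `time_step` argument are hypothetical, and "inferring" the step here simply means checking that all consecutive truth dates are equally spaced, which is only one possible heuristic.

```python
from datetime import date, timedelta


def infer_time_step(dates):
    """Return the spacing shared by all consecutive dates, or None if ambiguous."""
    unique = sorted(set(dates))
    gaps = {later - earlier for earlier, later in zip(unique, unique[1:])}
    return gaps.pop() if len(gaps) == 1 else None


def should_connect(truth_dates, first_forecast_date, time_step=None):
    """Connect only if the first forecast is at most one time step after the truth."""
    if not truth_dates:
        return False
    step = time_step if time_step is not None else infer_time_step(truth_dates)
    if step is None:
        # Inference failed: ask the caller for the time step explicitly.
        raise ValueError("could not infer a time step; please provide `time_step`")
    return first_forecast_date - max(truth_dates) <= step


weekly = [date(2021, 5, 1), date(2021, 5, 8), date(2021, 5, 15)]
on_time = should_connect(weekly, date(2021, 5, 22))    # truth is up to date
late_truth = should_connect(weekly, date(2021, 5, 29))  # last week of truth missing
```

Note that a single missing week in the middle of the truth series already makes this simple inference fail, which is the robustness concern raised above; the explicit `time_step` argument is the fallback for that case.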

seabbs commented 2 years ago

Closing in favour of #176.