Priesemann-Group / covid19_inference_forecast

GNU General Public License v3.0

Epi Curve (onset of illness) as alternative data source #7

Closed: matthiaslinden closed this 4 years ago

matthiaslinden commented 4 years ago

I added the Epi Curve (onset of illness) data from Fig. 2 (Abb. 2) of RKI's bulletin of 09.04.2020 as an additional data source (RKI bulletin). To get a reasonably up-to-date dataset (up to roughly 04.04.2020), the new example notebook uses the 'nowcast' dataset.

See http://mlinden.de/COVID19/index.html for more info.

jdehning commented 4 years ago

The date of illness onset seems to be publicly available in this dataset: https://npgeo-corona-npgeo-de.hub.arcgis.com/datasets/dd4580c810204019a7b8eb3e0b329dd6_0?selectedAttribute=Refdatum

According to the description, Refdatum (Referenzdatum) is the date of illness onset. One caveat: for about 23% of the data this information seems to be missing, as Refdatum is the same date as Meldedatum.
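A minimal sketch of how that share could be checked, and how an onset-based curve could be built from such records. The rows below are made-up toy values, not RKI data, and the tuple layout (Meldedatum, Refdatum, case count) as well as the helper names `share_without_onset` and `epi_curve` are assumptions for illustration only. It also assumes, as described above, that Refdatum equal to Meldedatum means the onset is not known.

```python
from collections import Counter
from datetime import date

# Toy records: (Meldedatum, Refdatum, number of cases). Values are invented.
records = [
    (date(2020, 4, 1), date(2020, 3, 28), 3),
    (date(2020, 4, 1), date(2020, 4, 1), 1),   # Refdatum == Meldedatum
    (date(2020, 4, 2), date(2020, 3, 30), 4),
    (date(2020, 4, 3), date(2020, 4, 3), 2),   # Refdatum == Meldedatum
]


def share_without_onset(rows):
    """Fraction of cases whose Refdatum equals Meldedatum."""
    total = sum(n for _, _, n in rows)
    same = sum(n for meld, ref, n in rows if meld == ref)
    return same / total


def epi_curve(rows):
    """Case counts per onset date, using only rows with a distinct Refdatum."""
    counts = Counter()
    for meld, ref, n in rows:
        if meld != ref:
            counts[ref] += n
    return dict(sorted(counts.items()))


share = share_without_onset(records)
curve = epi_curve(records)
```

Dropping the rows with identical dates is only one possible choice; some of them may be genuine same-day onsets, so the resulting curve is a lower bound per day rather than an exact count.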

matthiaslinden commented 4 years ago

I didn't know that source! You are right, the Refdatum cases match my other sources for the Epi Curve. As far as I can tell at the moment (14.04.2020), that database only contains cases where both "Refdatum" and "Meldedatum" are known, ~93k in total, compared to ~123k cases in today's situation report. The situation report's Epi Curve has 76k cases with a known date of onset (62% of the total) and 46k cases that are asymptomatic or have an unknown date of onset. The ArcGIS database seems to cover only some of those unknown cases, namely that 23%.
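As a rough check of the shares, using only the rounded figures quoted above (the variable names are just for illustration):

```python
# Rounded figures from the comment above (14.04.2020).
total_cases = 123_000     # cases in the situation report
known_onset = 76_000      # cases with a known date of onset
unknown_onset = 46_000    # asymptomatic or unknown date of onset
arcgis_cases = 93_000     # cases in the ArcGIS database

share_known = known_onset / total_cases        # about 0.62
share_unknown = unknown_onset / total_cases    # about 0.37
arcgis_coverage = arcgis_cases / total_cases   # about 0.76
```

Because the inputs are rounded to the nearest thousand, the results are only approximate.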

I generated a series of plots to compare the different sources: https://github.com/matthiaslinden/Covid19_DayOfOnset_Germany/blob/master/ComparisonOfSources.md

A notebook is available as well.

matthiaslinden commented 4 years ago

I think this is obsolete now, as this functionality has been added.