Relation to RKI infection figures / reported case figures

micb25 / dka

Statistical analysis and visualization of the daily diagnosis keys of the German COVID-19 tracing app (Corona-Warn-App).

https://micb25.github.io/dka/

GNU General Public License v3.0


Relation to RKI infection figures / reported case figures #3

Closed kai-truempler closed 4 years ago

kai-truempler commented 4 years ago

Thank you for the initiative. Would it make sense to base the ratio to the RKI figures on the number of cases by reporting date ("The reporting date is used to display the newly transmitted cases per day: the date on which the local health authority became aware of the case and recorded it electronically." from the RKI dashboard disclaimer)? That might be a better fit than the daily difference in the total number of infected people, although it always lags three or four days behind.

micb25 commented 4 years ago

Thank you for the suggestion. The RKI daily figures were simply the easiest and quickest thing to implement at first.

I am aware of the reporting delays in the RKI figures. The reporting date would be a good starting point, but late reports arrive at the RKI almost daily. Unfortunately, this means that the displayed ratio of positive tests could still change for past days, even several days later.

One proposed compromise would be to use the figures from Johns Hopkins University or zeit.de, which show a shorter reporting delay. First, though, I would have to find out where to get current machine-readable data from them.
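The effect of late reports on past days can be illustrated with a minimal sketch. All numbers below are invented for illustration and are not real RKI or Corona-Warn-App data; the snapshot names and the `ratios` helper are hypothetical as well.

```python
# Hypothetical snapshots of cases by reporting date (NOT real RKI data):
# the same past days are revised upward once late reports arrive.
snapshot_day_1 = {"2020-06-24": 300, "2020-06-25": 250}
snapshot_day_4 = {"2020-06-24": 420, "2020-06-25": 410}

# Hypothetical number of app users who shared diagnosis keys per day.
shared_keys = {"2020-06-24": 42, "2020-06-25": 41}


def ratios(cases_by_date):
    """Ratio of shared keys to reported cases for each reporting date."""
    return {day: shared_keys[day] / cases for day, cases in cases_by_date.items()}


early = ratios(snapshot_day_1)
late = ratios(snapshot_day_4)

for day in early:
    print(f"{day}: {early[day]:.1%} at first publication, "
          f"{late[day]:.1%} after late reports")
```

With these made-up values, the ratio for a past day drops after the late reports are included, which is the kind of retroactive change described above.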

kai-truempler commented 4 years ago

Yes, the problem with the daily figures, as I see it, is that they only reflect the day-to-day difference in the cumulative totals. Over the weekend, for example, some states do not report at all, and there are frequently larger fluctuations, for instance when a state corrects its figures or rolls out a software update. For this R calculation, the issue was solved by marking the figures as preliminary.

Changing the values for past days would, however, be confusing and possibly hard to follow. I found JHU somewhat erratic at times when the numbers were small; zeit.de would be closer, but I don't know either whether a download is available there.

Perhaps the daily figures, despite all their problems, are a good approximation after all, and the fluctuations average out? If so, a moving average might work here as well? In the end there is so much estimation involved that what matters is essentially a feel for the order of magnitude anyway.
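As a minimal sketch of the two ideas in this comment (daily figures as differences of cumulative totals, and a moving average to smooth them), assuming invented numbers and an arbitrary 3-day window; none of this is taken from the dka repository:

```python
def daily_differences(cumulative):
    """Return day-to-day differences of a cumulative case count."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def trailing_mean(values, window):
    """Trailing moving average; early entries use the shorter history available."""
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical cumulative totals (NOT real RKI data). The flat stretch mimics
# a weekend without reports, followed by a catch-up day.
cumulative = [1000, 1400, 1800, 1800, 1800, 2700, 3100]

daily = daily_differences(cumulative)
smoothed = trailing_mean(daily, window=3)

print(daily)  # [400, 400, 0, 0, 900, 400]
print([round(v, 1) for v in smoothed])
```

The raw differences jump between 0 and 900, while the smoothed series stays closer to the underlying level, at the cost of reacting later to real changes.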

cfritzsche commented 4 years ago

[Image: jhu_daily]

Can you somehow scrape it from here? https://coronavirus.jhu.edu/map.html

Zeit.de doesn't seem well suited for scraping daily cases. There is no table there and the chart is minuscule. And they only report the number per 100k people.
Google shows Wikipedia data, which seems to be RKI data. Pavel's data is also RKI data. https://pavelmayer.de/covid/risks
ourworldindata would have been glorious, but their source is again ECDC, which takes it from RKI.

kai-truempler commented 4 years ago

JHU's data may also be available here. There seem to be some weird data points, though. But again, it'll be fine if you just want to get an idea of where the numbers are.

micb25 commented 4 years ago

Thanks for all the links! It's a pity that, five months into this, the delay in the RKI data is still a huge issue, alongside other problems.

"JHU's data may also be available here. There seem to be some weird data points, though. But again, it'll be fine if you just want to get an idea of where the numbers are."

This seems easy to implement. I will give it a try within the next few days.

kai-truempler commented 4 years ago

Thanks again for the initiative, and you can totally do it that way; we just have to be aware that when the numbers are small, the difference can be quite substantial when calculating ratios. E.g. for 26 June, the numbers are: JHU: 391; RKI-Meldezahlen: 560 [not 499 as I first wrote; the mistake was mine, not an RKI update :)]; zeit.de (tab: Fälle): 582; RKI-neu: 477. Since the idea of the graph is to get information on the ratio of infected people to the ones reporting that risk via the app, I would like to float the idea of a rolling average again (that could use RKI-neu, too).
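The size of that effect can be sketched with the four case counts quoted above for 26 June. The number of app users who shared keys that day is not given in this thread, so `shared_keys = 50` is a made-up placeholder used only to show how the ratio moves with the denominator.

```python
# Reported new cases for 26 June, as quoted in the comment above.
cases_by_source = {
    "JHU": 391,
    "RKI-Meldezahlen": 560,
    "zeit.de": 582,
    "RKI-neu": 477,
}

# Hypothetical numerator: NOT a real Corona-Warn-App figure.
shared_keys = 50

ratios = {source: shared_keys / cases for source, cases in cases_by_source.items()}

for source, ratio in ratios.items():
    print(f"{source:16s} {ratio:.1%}")

# Relative spread between the largest and smallest case count; this value
# does not depend on the placeholder numerator.
spread = max(cases_by_source.values()) / min(cases_by_source.values())
print(f"largest / smallest case count: {spread:.2f}")
```

Whatever the true numerator is, the ratio computed against the JHU figure comes out roughly 1.49 times the one computed against the zeit.de figure, simply because 582 / 391 is about 1.49.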

daimpi commented 4 years ago

There is also the data of new infections in the RKI Nowcast: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Projekte_RKI/Nowcasting.html

The data is in an xlsx file:

Table with nowcasting figures for the R estimate (30.6.2020, table is updated daily) (xlsx, 18 KB, file is not accessible)

The current version (30.06.) has data up to 26.06. This might be more interesting in the future for a rolling-average type of plot, with some delay but more stable data.

micb25 commented 4 years ago

I have added a correlation with JHU data.

"I would like to float the idea of a rolling average again (that could use RKI-neu, too)."

That's what I would prefer as well. It was also suggested in the UKW033 podcast.

kai-truempler commented 4 years ago

Great, thanks for taking this up. I actually just listened to that podcast and I have to agree with them that it is excellent of you to put this together.

MikeJayDee commented 4 years ago

As far as I am aware, JHU, Zeit, tagesspiegel, etc. all source their German data from this Risklayer spreadsheet: https://docs.google.com/spreadsheets/d/1wg-s4_Lz2Stil6spQEYFdZaBEp8nWW26gVyfHqvcl8s/edit#gid=0

This is crowdsourced data, mostly taken from press releases published by the Landkreise (districts).

micb25 commented 4 years ago

"I would like to float the idea of a rolling average again (that could use RKI-neu, too)."

This has been added now in this commit.

kai-truempler commented 4 years ago

Let me thank you again for this excellent service.