CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

Wrong numbers of cases for Germany on March 20 - March 21st #3857

Closed. pruggia closed this issue 3 years ago

pruggia commented 3 years ago

The number of total cases for Germany is listed like this on the files:

March 19 - 2,654,734
March 20 - 2,669,233
March 21 - 2,670,001

That gives 768 new cases for March 21st (2,670,001 minus 2,669,233), which seems rather low.
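For anyone who wants to reproduce that arithmetic, here is a minimal sketch that differences the cumulative totals quoted above. The variable and function names are my own, and the figures are simply the three values listed in this comment, not a fresh pull from the repository.

```python
# Cumulative confirmed cases for Germany as quoted in this issue (JHU files).
jhu_cumulative = [
    ("March 19", 2_654_734),
    ("March 20", 2_669_233),
    ("March 21", 2_670_001),
]


def daily_increments(series):
    """Return (date, new_cases) pairs by differencing consecutive cumulative totals."""
    return [
        (date, total - prev_total)
        for (_, prev_total), (date, total) in zip(series, series[1:])
    ]


jhu_daily = daily_increments(jhu_cumulative)
for date, new_cases in jhu_daily:
    print(f"{date}: {new_cases:,} new cases")
# March 20: 14,499 new cases
# March 21: 768 new cases
```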

Checking the source from which JHU gets the Germany data (https://interaktiv.morgenpost.de/corona-virus-karte-infektionen-deutschland-weltweit/), the number for March 21st matches JHU, but the numbers for the previous days do not. Was there a change in where JHU gets its numbers around March 21? Here are the numbers reported on the Morgenpost site:

March 19 - 2,646,107
March 20 - 2,659,792
March 21 - 2,670,001
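To make the mismatch concrete, the sketch below compares the two series day by day. It uses only the six figures quoted in this comment, and the dictionary and function names are illustrative assumptions rather than anything from the JHU pipeline.

```python
# Cumulative totals for Germany as quoted in this issue.
jhu = {"March 19": 2_654_734, "March 20": 2_669_233, "March 21": 2_670_001}
morgenpost = {"March 19": 2_646_107, "March 20": 2_659_792, "March 21": 2_670_001}


def per_day_gap(a, b):
    """Difference a - b for every date present in both series."""
    return {date: a[date] - b[date] for date in a if date in b}


source_gap = per_day_gap(jhu, morgenpost)
for date, gap in source_gap.items():
    print(f"{date}: JHU minus Morgenpost = {gap:,}")
# March 19: JHU minus Morgenpost = 8,627
# March 20: JHU minus Morgenpost = 9,441
# March 21: JHU minus Morgenpost = 0

# Daily increase for March 21 using the Morgenpost series alone.
morgenpost_march21 = morgenpost["March 21"] - morgenpost["March 20"]
print(f"Morgenpost daily increase for March 21: {morgenpost_march21:,}")
# Morgenpost daily increase for March 21: 10,209
```

On these figures the two series agree only on March 21, and the Morgenpost series implies 10,209 new cases for that day instead of 768.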

CSSEGISandData commented 3 years ago

Hello,

This is related to #3417. I'll copy and paste the response we gave to that issue. In general, the issue arises because we combine national-level case estimates with data taken directly from the federal states, and the two are generally on different update schedules.

"Hello Ed,

Thank you for your message. We are aware of OWID switching to our data and we are grateful that you view our work suitable to be sharing with your audience. We also have received your email from several days ago and are still working through a response.

With regard to Germany, our data is sourced from the Berliner Morgenpost rather than the Robert Koch Institute (RKI). We use this source because it is slightly more timely than RKI: it pulls data directly from the health departments of the federal states, which are generally ahead of the data presented by RKI. For example, a direct comparison of the state-level data between RKI and the Berliner Morgenpost shows that our source is ahead for 12 of the 16 federal states.

The irregularity is due to the means by which the Berliner calculates the total for an unassigned category. The sum of the totals in the federal states does not equal the total number of cases in the country as tracked by the Karlsruhe Institute of Technology, so the source creates an unassigned category that is the difference between this national total and the sum of the federal states. The national total generally is more timely early in the week but the reporting of the federal states and RKI "catches up" over the week, resulting in decreases in the unassigned category that have net zero effect on the total cases and deaths at the national level.

The seven-day rolling average looks smooth for this reason. I'll leave this issue open so others can look at this response if they are curious."
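The unassigned-category mechanism described in the quoted response can be sketched roughly as follows. This is only an illustration under stated assumptions: the state names and all numbers are hypothetical toy values, and the function is not the Berliner Morgenpost's or JHU's actual code.

```python
def unassigned_cases(national_total, state_totals):
    """Unassigned category = national total minus the sum of the state totals."""
    return national_total - sum(state_totals.values())


# Hypothetical toy values, NOT real data.
national_total = 1_000                              # national total, held fixed here
states_early = {"State A": 400, "State B": 350}     # states lag early in the week
states_late = {"State A": 430, "State B": 390}      # states "catch up" later

unassigned_early = unassigned_cases(national_total, states_early)
unassigned_late = unassigned_cases(national_total, states_late)

print(unassigned_early)  # 250
print(unassigned_late)   # 180

# States plus unassigned always add back up to the national total,
# so the catch-up shrinks the unassigned category without changing the total.
reported_early = sum(states_early.values()) + unassigned_early
reported_late = sum(states_late.values()) + unassigned_late
print(reported_early == reported_late == national_total)  # True
```

In this toy case the states gain 70 cases between the two snapshots, the unassigned category drops by the same 70, and the national figure is unchanged.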