CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

Why does the Johns Hopkins data differ from the CDC data in terms of NEW cases? #1876

Open AZEconoGal opened 4 years ago

AZEconoGal commented 4 years ago

Dear Johns Hopkins staff:

It is disconcerting that you provide no contact information for your data. While I understand that you are pulling data from the WHO, the CDC, and the states, your daily numbers show a totally different trajectory than those of the WHO, the CDC, and the state health organizations that I follow.

I read that some of your numbers are 5-day centered moving averages with ESTIMATES for the following 2 days. Could your estimates be off? How do you calculate those?

Any input would be appreciated.

Debra Roubik Economist VisionEcon

JeroenKools commented 4 years ago

I read that some of your numbers are 5-day centered moving averages with ESTIMATES for the following 2 days.

Where did you read that?

AZEconoGal commented 4 years ago

"This analysis uses a 5-day moving average to visualize the number of new COVID-19 cases and calculate the rate of change. This is calculated for each day by averaging the values of that day, the two days before, and the two next days." https://coronavirus.jhu.edu/data/new-cases

That would explain why your numbers diverge so much from the WHO's. I understand that the WHO data lags. I get that. But I would never recommend using estimates for the following 2 days. The expected root mean squared error would be considerable.
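
For reference, the averaging described in the quoted JHU text can be sketched as below. This is a minimal illustration, not JHU's implementation: the function name and the sample numbers are made up, and the quoted text does not say how the first and last two days are handled, so this sketch simply leaves them undefined.

```python
from typing import List, Optional


def centered_moving_average(values: List[float], window: int = 5) -> List[Optional[float]]:
    """Average each value with its neighbours on both sides.

    Returns None where the full window is not available, i.e. for the
    first and last `window // 2` days (an assumption of this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    result: List[Optional[float]] = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            result.append(None)  # not enough days before or after
        else:
            chunk = values[i - half : i + half + 1]
            result.append(sum(chunk) / window)
    return result


# Hypothetical daily new-case counts, made up for illustration only.
daily_new_cases = [10, 20, 30, 40, 50, 60, 70]
smoothed = centered_moving_average(daily_new_cases)
print(smoothed)  # [None, None, 30.0, 40.0, 50.0, None, None]
```

Read this way, the "two next days" are reported values rather than forecasts, so the average for the two most recent days cannot be computed until those later reports exist.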

JeroenKools commented 4 years ago

So it's averaged in that visualization... Not in the data source.

AZEconoGal commented 4 years ago

Thanks, Jeroen. But what makes your data so much higher than the WHO's, even with the lags?

JeroenKools commented 4 years ago

Comparing JHU with WHO and Worldometers data for April 2, 2020:

Country  WHO      JHU      Worldometers.info
China     82,724   82,432   81,589
US       187,302  243,453  244,433
Iran      47,593   50,468   50,468
Italy    110,574  115,242  115,242
Spain    102,136  112,065  112,065
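
To put the gaps in the table above on a common scale, the JHU figure can be expressed as a percentage difference from the WHO figure. The snippet below is only a sketch using the numbers quoted in the table; the helper name `relative_difference` is made up for illustration.

```python
# Cumulative confirmed cases for April 2, copied from the table above.
counts = {
    "China": {"WHO": 82_724, "JHU": 82_432},
    "US": {"WHO": 187_302, "JHU": 243_453},
    "Iran": {"WHO": 47_593, "JHU": 50_468},
    "Italy": {"WHO": 110_574, "JHU": 115_242},
    "Spain": {"WHO": 102_136, "JHU": 112_065},
}


def relative_difference(value: int, baseline: int) -> float:
    """Percentage by which `value` exceeds (or falls short of) `baseline`."""
    return (value - baseline) / baseline * 100


differences = {
    country: relative_difference(c["JHU"], c["WHO"])
    for country, c in counts.items()
}

for country, diff in differences.items():
    print(f"{country:<6} {diff:+.1f}%")
```

With these numbers the US stands out at roughly +30%, while the other countries are within about 10% in either direction.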

There are indeed some discrepancies, most noticeably for the US. FWIW, the CDC reported 213,144 for 4/2, which is very close to JHU's 213,372 for 4/1, and the WHO's 187k is very close to JHU's 188,172 for 3/31. So it looks like the CDC and the WHO are lagging behind JHU by about one and two days, respectively.
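
The lag comparison can be made explicit by matching each source's April 2 figure to the JHU date with the nearest cumulative count. This is a rough sketch using only the US numbers quoted in this thread; the function name is made up, and nearest-count matching is just one simple way to estimate a reporting lag.

```python
from datetime import date

# JHU cumulative US counts quoted in this thread.
jhu_us = {
    date(2020, 3, 31): 188_172,
    date(2020, 4, 1): 213_372,
    date(2020, 4, 2): 243_453,
}

# US counts published by the other sources for April 2.
reported_on = date(2020, 4, 2)
other_sources = {"WHO": 187_302, "CDC": 213_144}


def closest_jhu_date(count: int) -> date:
    """Return the JHU date whose cumulative count is nearest to `count`."""
    return min(jhu_us, key=lambda d: abs(jhu_us[d] - count))


lags = {}
for source, count in other_sources.items():
    match = closest_jhu_date(count)
    lags[source] = (reported_on - match).days
    print(f"{source}: closest to JHU on {match.isoformat()}, lag of {lags[source]} day(s)")
```

On these figures the WHO number matches JHU two days earlier and the CDC number matches JHU one day earlier, which is consistent with a reporting lag rather than a difference in the underlying counts.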

It's also possible that JHU is double-counting some cases, given their earlier issues with reporting per US state, county, and city.