CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

data issue #87

Closed: dawenx closed this issue 4 years ago

dawenx commented 4 years ago

@CSSEGISandData

When I checked the death counts for Fujian province, China, in https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports, every file shows no deaths (0 or blank). See the grep output below:

01-22-2020.csv:Fujian,Mainland China,1/22/2020 17:00,1,,
01-23-2020.csv:Fujian,Mainland China,1/23/2020 17:00,5,,
01-24-2020.csv:Fujian,Mainland China,1/24/2020 17:00,10,,
01-25-2020.csv:Fujian,Mainland China,1/25/2020 17:00,18,,
01-26-2020.csv:Fujian,Mainland China,1/26/2020 16:00,35,,
01-27-2020.csv:Fujian,Mainland China,1/27/2020 23:59,59,,
01-28-2020.csv:Fujian,Mainland China,1/28/2020 23:00,80,,
01-29-2020.csv:Fujian,Mainland China,1/29/2020 19:30,84,,
01-30-2020.csv:Fujian,Mainland China,1/30/2020 16:00,101,,
01-31-2020.csv:Fujian,Mainland China,1/31/2020 23:59,120,,
02-01-2020.csv:Fujian,Mainland China,2/1/2020 5:37,144,0,0
02-02-2020.csv:Fujian,Mainland China,2020-02-02T03:43:01,159,0,0
02-03-2020.csv:Fujian,Mainland China,2020-02-03T11:33:13,179,0,1
02-04-2020.csv:Fujian,Mainland China,2020-02-04T12:13:11,194,0,3
02-05-2020.csv:Fujian,Mainland China,2020-02-05T12:33:01,205,0,11
02-06-2020.csv:Fujian,Mainland China,2020-02-06T11:03:19,215,0,14
02-07-2020.csv:Fujian,Mainland China,2020-02-07T13:23:03,224,0,20
02-08-2020.csv:Fujian,Mainland China,2020-02-08T08:13:10,239,0,24
02-09-2020.csv:Fujian,Mainland China,2020-02-09T09:13:11,250,0,35
02-10-2020.csv:Fujian,Mainland China,2020-02-10T14:03:05,261,0,39
02-11-2020.csv:Fujian,Mainland China,2020-02-11T14:03:05,267,0,45
02-12-2020.csv:Fujian,Mainland China,2020-02-12T11:53:02,272,0,53
02-13-2020.csv:Fujian,Mainland China,2020-02-13T10:33:23,279,0,57
02-14-2020.csv:Fujian,Mainland China,2020-02-14T11:13:22,281,0,63
02-15-2020.csv:Fujian,Mainland China,2020-02-15T13:03:04,285,0,71
02-16-2020.csv:Fujian,Mainland China,2020-02-16T12:03:06,287,0,82
02-17-2020.csv:Fujian,Mainland China,2020-02-17T10:23:04,290,0,90
02-18-2020.csv:Fujian,Mainland China,2020-02-18T09:43:08,292,0,93
02-19-2020.csv:Fujian,Mainland China,2020-02-19T11:53:02,293,0,112
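For anyone who wants to repeat this check for another province, here is a minimal Python sketch that reads grep-style lines like the ones above and reports the first file with a non-zero death count. It assumes the column order is Province/State, Country/Region, Last Update, Confirmed, Deaths, Recovered, and treats a blank field as 0. The function names `parse_grep_line` and `first_death_file` are hypothetical helpers, not part of the repository.

```python
import csv
from io import StringIO


def parse_grep_line(line):
    """Parse one 'file.csv:Province,Country,Last Update,Confirmed,Deaths,Recovered' line.

    The column order is an assumption based on the early daily reports.
    """
    # grep prefixes each match with the file name and a colon; the file
    # names contain no colon, so splitting at the first one is safe.
    filename, _, row = line.partition(":")
    fields = next(csv.reader(StringIO(row)))

    def to_int(text):
        # Blank cells in the early files are treated as 0 here.
        return int(text) if text.strip() else 0

    return {
        "file": filename,
        "province": fields[0],
        "confirmed": to_int(fields[3]),
        "deaths": to_int(fields[4]),
        "recovered": to_int(fields[5]),
    }


def first_death_file(lines):
    """Return the first file name whose death count is above 0, or None."""
    for line in lines:
        record = parse_grep_line(line)
        if record["deaths"] > 0:
            return record["file"]
    return None
```

Feeding it the grep output for one province, one entry per list element, returns `None` when no file records a death.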

This is definitely wrong: Fujian reported its one and only death, probably on 01/25, as you can see in the archived data: https://github.com/CSSEGISandData/COVID-19/blob/master/archived_data/archived_daily_case_updates/01-25-2020_2200.csv#L18

Your dashboard also shows: [image: Screenshot_2020-02-20 Coronavirus COVID-19 (2019-nCoV)]

I suspect there are similar issues for other provinces; I will report back.

dawenx commented 4 years ago

A similar issue affects Shandong province. According to the archived data, Shandong's first death was on 01/27; see: https://github.com/CSSEGISandData/COVID-19/blob/master/archived_data/archived_daily_case_updates/01-27-2020_2030.csv#L10

But in https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports, the first death appears on 02/09:

01-22-2020.csv:Shandong,Mainland China,1/22/2020 17:00,2,,
01-23-2020.csv:Shandong,Mainland China,1/23/2020 17:00,6,,
01-24-2020.csv:Shandong,Mainland China,1/24/2020 17:00,15,,
01-25-2020.csv:Shandong,Mainland China,1/25/2020 17:00,27,,
01-26-2020.csv:Shandong,Mainland China,1/26/2020 16:00,46,,
01-27-2020.csv:Shandong,Mainland China,1/27/2020 23:59,75,,
01-28-2020.csv:Shandong,Mainland China,1/28/2020 23:00,95,,
01-29-2020.csv:Shandong,Mainland China,1/29/2020 19:30,130,,1
01-30-2020.csv:Shandong,Mainland China,1/30/2020 16:00,158,,1
01-31-2020.csv:Shandong,Mainland China,1/31/2020 23:59,184,,2
02-01-2020.csv:Shandong,Mainland China,2/1/2020 7:51,206,0,3
02-02-2020.csv:Shandong,Mainland China,2020-02-02T18:03:05,230,0,6
02-03-2020.csv:Shandong,Mainland China,2020-02-03T17:11:34,259,0,7
02-04-2020.csv:Shandong,Mainland China,2020-02-04T12:03:04,275,0,11
02-05-2020.csv:Shandong,Mainland China,2020-02-05T10:13:13,307,0,15
02-06-2020.csv:Shandong,Mainland China,2020-02-06T07:53:02,347,0,27
02-07-2020.csv:Shandong,Mainland China,2020-02-07T11:33:11,386,0,37
02-08-2020.csv:Shandong,Mainland China,2020-02-08T11:33:02,416,0,44
02-09-2020.csv:Shandong,Mainland China,2020-02-09T15:03:05,444,1,63
02-10-2020.csv:Shandong,Mainland China,2020-02-10T09:33:02,466,1,66
02-11-2020.csv:Shandong,Mainland China,2020-02-11T11:23:04,487,1,80
02-12-2020.csv:Shandong,Mainland China,2020-02-12T11:13:05,497,2,92
02-13-2020.csv:Shandong,Mainland China,2020-02-13T13:33:01,509,2,105
02-14-2020.csv:Shandong,Mainland China,2020-02-14T11:13:22,523,2,136
02-15-2020.csv:Shandong,Mainland China,2020-02-15T11:23:17,532,2,156
02-16-2020.csv:Shandong,Mainland China,2020-02-16T12:03:06,537,2,173
02-17-2020.csv:Shandong,Mainland China,2020-02-17T11:03:06,541,2,191
02-18-2020.csv:Shandong,Mainland China,2020-02-18T12:13:08,543,3,211
02-19-2020.csv:Shandong,Mainland China,2020-02-19T12:03:07,544,3,231

CSSEGISandData commented 4 years ago

Hello @dawenx! Thanks for pointing these out. After checking against the historical screenshots, we are sure those manual inputs were wrong. We will remove 1 death from Shandong in 01-27-2020_2030.csv and 1 death from Fujian in 01-25-2020_2200.csv.

dawenx commented 4 years ago

Thanks for the clarification, @CSSEGISandData.

Can you check Heilongjiang province as well? In particular, no death is recorded in this file: https://github.com/CSSEGISandData/COVID-19/blob/master/archived_data/archived_daily_case_updates/01-25-2020_2200.csv#L19

But there is a death in the files immediately before and after it:

01-25-2020_1200.csv:Heilongjiang,Mainland China,1/25/2020 12pm,9,1,,,9
01-26-2020_1100.csv:Heilongjiang,Mainland China,1/26/20 11:00,15,1,,,15
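Since deaths are cumulative, a count that goes down between two consecutive snapshots points to an entry error in one of them. A minimal Python sketch of that check follows. The function name `find_drops` is hypothetical, and the caller is assumed to have already extracted a `(snapshot name, deaths)` pair per file, with a blank cell read as 0. A flagged snapshot only marks an inconsistency; it does not say whether that snapshot or the one before it holds the wrong value.

```python
def find_drops(series):
    """Return names of snapshots whose cumulative count is lower than the previous one.

    `series` is a list of (snapshot_name, count) pairs in chronological order.
    """
    drops = []
    previous = None
    for name, count in series:
        # A cumulative total should never decrease from one snapshot to the next.
        if previous is not None and count < previous:
            drops.append(name)
        previous = count
    return drops


# Heilongjiang deaths as described in this comment: 1, then none, then 1 again.
heilongjiang = [
    ("01-25-2020_1200.csv", 1),
    ("01-25-2020_2200.csv", 0),
    ("01-26-2020_1100.csv", 1),
]
print(find_drops(heilongjiang))
```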

dawenx commented 4 years ago

Any update on the Heilongjiang problem in the comment above, @CSSEGISandData?

CSSEGISandData commented 4 years ago

Yes, Heilongjiang should have 1 death in that file. Updated. Thanks for helping us.