CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

Wyoming,Albany time series count not updating since 1-Dec-20 #3439

Closed cabuerkle closed 3 years ago

cabuerkle commented 3 years ago

Counts for Albany County, Wyoming, USA have been stuck at 3120 and appear not to have been updated since 1-Dec-20 in csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv (see below). This disagrees with the counts in the NYT database (https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv), which reports the following recent numbers (date, county, state, FIPS, cases, deaths):

2020-11-29,Albany,Wyoming,56001,3068,9
2020-11-30,Albany,Wyoming,56001,3108,9
2020-12-01,Albany,Wyoming,56001,3120,9
2020-12-02,Albany,Wyoming,56001,3136,9
2020-12-03,Albany,Wyoming,56001,3152,9
2020-12-04,Albany,Wyoming,56001,3180,9
2020-12-05,Albany,Wyoming,56001,3195,9
2020-12-06,Albany,Wyoming,56001,3219,9
2020-12-07,Albany,Wyoming,56001,3238,9
2020-12-08,Albany,Wyoming,56001,3256,9

time_series_covid19_confirmed_US.csv: 84056001,US,USA,840,56001.0,Albany,Wyoming,US,41.65498705,-105.7235415,"Albany, Wyoming, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,3,3,4,4,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,7,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,10,10,10,10,10,14,17,20,23,23,24,25,25,25,25,25,25,25,26,26,26,26,26,26,26,25,26,26,27,28,28,28,28,28,28,29,29,31,31,32,32,32,34,34,34,34,35,36,37,38,43,45,45,45,49,49,49,52,53,54,55,58,59,67,73,75,74,76,78,78,77,78,80,83,84,85,87,88,88,88,88,88,88,88,86,88,89,90,92,93,95,103,104,105,107,126,125,127,127,127,128,128,133,133,134,134,141,148,157,164,164,179,179,200,208,212,217,220,234,253,263,281,301,325,332,364,385,397,426,461,480,493,520,548,557,575,589,622,670,698,720,732,747,771,811,826,855,898,914,942,959,1015,1034,1067,1120,1131,1147,1167,1214,1247,1279,1371,1386,1422,1450,1510,1520,1562,1616,1646,1683,1693,1839,1855,1940,2017,2175,2182,2266,2335,2414,2484,2551,2647,2697,2745,2817,2830,2859,2958,2992,3021,3021,3053,3057,3068,3108,3120,3120,3120,3120,3120,3120,3120,3120
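One way to flag a stalled series like this is to measure how long the final value has been repeating. The following is a minimal sketch in base R, using only the last 12 values of the row quoted above; the helper name `trailing_run_length` is hypothetical and not part of any JHU or NYT tooling.

```r
# Last 12 daily values of the Albany County, Wyoming row quoted above.
albany_tail <- c(3053, 3057, 3068, 3108,
                 3120, 3120, 3120, 3120, 3120, 3120, 3120, 3120)

# Hypothetical helper: length of the run of identical values at the end of x.
trailing_run_length <- function(x) {
  runs <- rle(x)                       # run-length encoding of the series
  runs$lengths[length(runs$lengths)]   # length of the final run
}

stuck_days <- trailing_run_length(albany_tail)
stuck_days
```

A long trailing run is only a hint, not proof of a reporting problem: small counties can legitimately report no new cases for several days, so any threshold would need to be chosen with that in mind.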

CSSEGISandData commented 3 years ago

Thank you, we are looking into the issue.

CSSEGISandData commented 3 years ago

Thanks for bringing this to our attention! It should be patched with #3440

cabuerkle commented 3 years ago

Thank you for this correction.

There appears to be a lingering error.

The data in jhu_covid-19/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv for Albany,Wyoming appear corrected for recent dates, but the last value is anomalous (it is 3120, which was the previously repeating value that started on 1-Dec-20, whereas the previous value was 3256).

The last value in the series should be for 12/8/20 according to the header, but the 12/8/20 data appear to be in the penultimate value, and the 3120 appears to be a spurious addition. More generally, your data and the NYT data are offset by one day, going back to the beginning of April.

Most recent days:

Date      NYT   JHU
X12.1.20  3120  3136
X12.2.20  3136  3152
X12.3.20  3152  3180
X12.4.20  3180  3195
X12.5.20  3195  3219
X12.6.20  3219  3238
X12.7.20  3238  3256
X12.8.20  3256  3120
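The two observations in this table (the one-day offset and the anomalous final value) can be checked mechanically. Below is a small sketch in base R that hard-codes the eight rows shown above rather than downloading anything; the variable names are illustrative only.

```r
# Values copied from the "Most recent days" table.
dates <- c("12.1.20", "12.2.20", "12.3.20", "12.4.20",
           "12.5.20", "12.6.20", "12.7.20", "12.8.20")
nyt <- c(3120, 3136, 3152, 3180, 3195, 3219, 3238, 3256)
jhu <- c(3136, 3152, 3180, 3195, 3219, 3238, 3256, 3120)

n <- length(nyt)

# One-day offset: the JHU value on day i equals the NYT value on day i + 1.
offset_matches <- jhu[-n] == nyt[-1]
all(offset_matches)

# Cumulative counts normally do not decrease, so flag any day where they do.
drop_days <- dates[which(diff(jhu) < 0) + 1]
drop_days
```

For these rows every shifted pair matches, and the only decrease falls on 12.8.20. A decrease is not always an error, since cumulative counts are sometimes revised downward, so this check is a prompt for a closer look rather than a verdict.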

To compare the full series, in R:

# NYT data
nyt <- read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv")
## select FIPS 56001, which is Albany County, Wyoming
nyt <- nyt[nyt$fips == 56001 & !is.na(nyt$fips), ]

# JHU data
jhu <- read.csv("https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv")
jhu <- jhu[jhu$FIPS == 56001 & !is.na(jhu$FIPS), ]
## select 3.25.20 (NYT data start on the day of the first case) through the last date
jhu <- jhu[, which(names(jhu) == "X3.25.20"):ncol(jhu)]

## side-by-side comparison, matched by position (both series must have the same length)
data.frame(nyt = nyt$cases, jhu = as.numeric(jhu), row.names = names(jhu))
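The comparison above pairs the two series by position, which assumes both cover exactly the same days. As an alternative sketch, the series can be joined on an explicit date instead. The example below uses three days of values from this thread in place of the downloaded files, and assumes the JHU column names follow the `X<month>.<day>.<2-digit year>` pattern that `read.csv` produces here; the data frame names are made up for illustration.

```r
# Small stand-ins for the downloaded data (values taken from this thread).
nyt_small <- data.frame(
  date  = as.Date(c("2020-12-06", "2020-12-07", "2020-12-08")),
  cases = c(3219, 3238, 3256)
)
jhu_wide <- data.frame(X12.6.20 = 3238, X12.7.20 = 3256, X12.8.20 = 3120)

# Reshape the one-row wide JHU data into date/value pairs.
jhu_long <- data.frame(
  date  = as.Date(names(jhu_wide), format = "X%m.%d.%y"),
  cases = unlist(jhu_wide[1, ], use.names = FALSE)
)

# Join on date; all = TRUE keeps days that are present in only one source.
both <- merge(nyt_small, jhu_long, by = "date", all = TRUE,
              suffixes = c("_nyt", "_jhu"))
both$diff <- both$cases_jhu - both$cases_nyt
both
```

Joining on date means a missing or extra day in either source appears as an `NA` row instead of silently shifting every later value.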