CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

Population for Manhattan (New York County FIPS 36061) is incorrect in time_series_covid19_deaths_US.csv #1970

Open jjdfsny opened 4 years ago

jjbenes commented 4 years ago

I also stumbled on this while computing per-capita statistics. The population of New York County (FIPS 36061) is 1,628,706, but the look-up table at https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/UID_ISO_FIPS_LookUp_Table.csv lists 5,803,210, which is about 4.2M too many:

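UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,Combined_Key,Population
84036061,US,USA,840,36061,New York City,New York,US,40.7672726,-73.97152637,"New York City, New York, US",5803210
[Population should be 1,628,706]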
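To make the size of the discrepancy concrete, here is a minimal sketch of the arithmetic using only the two population figures quoted in this comment. The `per_100k` helper and the death count are hypothetical and are there only to show how the wrong denominator skews a per-capita rate.

```python
# Figures quoted in this issue.
LOOKUP_POPULATION = 5_803_210  # Population listed for UID 84036061 in the look-up table
COUNTY_POPULATION = 1_628_706  # New York County (FIPS 36061) population quoted above


def per_100k(count: int, population: int) -> float:
    """Return a rate per 100,000 residents."""
    return count / population * 100_000


extra_people = LOOKUP_POPULATION - COUNTY_POPULATION  # 4174504, i.e. about 4.2M
ratio = LOOKUP_POPULATION / COUNTY_POPULATION         # roughly 3.6x too large

# Hypothetical death count, for illustration only (not real data).
hypothetical_deaths = 1_000
rate_with_lookup = per_100k(hypothetical_deaths, LOOKUP_POPULATION)
rate_with_county = per_100k(hypothetical_deaths, COUNTY_POPULATION)

print(extra_people)
print(round(ratio, 2))
print(rate_with_lookup < rate_with_county)  # the inflated denominator understates the rate
```

Any per-capita figure computed with the look-up value would come out low by the same factor as `ratio`, whatever the actual count is.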


markmi commented 4 years ago

I'll note that Manhattan (New York County, FIPS 36061) is the only one of the City's 5 boroughs that has non-zero death figures. It appears to be used for the total across all 5 boroughs, with no breakout available.

The population figure is wrong either as a 5-borough total or as a figure to add to the other 4 boroughs' figures and, as it stands, needs to be avoided for per-capita calculations and the like.
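One way to keep the suspect figure out of per-capita work is to skip the affected FIPS code explicitly. The sketch below is only an illustration of that idea: `CountyRow`, `UNRELIABLE_POPULATION_FIPS`, and `deaths_per_100k` are hypothetical names, and the sample rows contain made-up counts rather than values from the repository.

```python
from typing import NamedTuple, Optional


class CountyRow(NamedTuple):
    fips: str
    deaths: int
    population: int


# FIPS codes whose population should not be trusted for per-capita work.
# 36061 is the only one named in this thread; extend the set as needed.
UNRELIABLE_POPULATION_FIPS = {"36061"}


def deaths_per_100k(row: CountyRow) -> Optional[float]:
    """Return deaths per 100,000 residents, or None when the denominator is unusable."""
    if row.fips in UNRELIABLE_POPULATION_FIPS or row.population <= 0:
        return None
    return row.deaths / row.population * 100_000


# Hypothetical rows, for illustration only (not real data).
rows = [
    CountyRow("36061", 500, 5_803_210),  # flagged: population known to be wrong
    CountyRow("99999", 10, 200_000),     # made-up county with a usable population
]

rates = {row.fips: deaths_per_100k(row) for row in rows}
print(rates)
```

Returning `None` instead of a number makes the gap visible downstream, rather than silently producing a rate that is too low.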

jjbenes commented 4 years ago

I hope I got New York right. I started taking covid19 data from USAFacts.org. They have all the cases reported by all the counties.

See this map for county cases: https://first-principles.ai/covid-19/map.html. Mouse over the map to see per-capita county data. This map ranks per-capita cases on a daily basis starting on 22nd Jan: https://first-principles.ai/covid-19/per-capita-map.html. The site zooms into the county you type into the search box. It seems to work much faster than USAFacts.org.


jjbenes commented 4 years ago

@markmi Yes, for reasons I don't understand, the data for the five boroughs of NYC are included in NY County without any breakout. (I think NYT does that, too.) But USAFacts has the breakout. I updated the code to take both Johns Hopkins and USAFacts data: https://github.com/jjbenes/covid19.
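As a rough sketch of how a borough-level breakout can be reconciled with a source that lumps everything into New York County, the snippet below sums a per-county metric over the five borough FIPS codes. This is not the code from the linked repository; `nyc_total` and both dictionaries are hypothetical, and all counts are placeholders.

```python
# County FIPS codes for the five NYC boroughs.
NYC_BOROUGH_FIPS = {
    "36005": "Bronx",
    "36047": "Kings (Brooklyn)",
    "36061": "New York (Manhattan)",
    "36081": "Queens",
    "36085": "Richmond (Staten Island)",
}


def nyc_total(counts_by_fips: dict) -> int:
    """Sum a per-county metric over the five boroughs; missing boroughs count as 0."""
    return sum(counts_by_fips.get(fips, 0) for fips in NYC_BOROUGH_FIPS)


# Placeholder counts, for illustration only (not real data).
# A source with a borough breakout; 36059 (Nassau County) is outside NYC and is ignored.
borough_breakout = {"36005": 10, "36047": 20, "36061": 30, "36081": 40, "36085": 5, "36059": 99}
# A source that reports the whole city under New York County.
aggregated = {"36005": 0, "36047": 0, "36061": 105, "36081": 0, "36085": 0}

print(nyc_total(borough_breakout))
print(nyc_total(aggregated))
```

Comparing the two totals is a quick consistency check: if the aggregated source really folds all five boroughs into FIPS 36061, both sums should agree for the same date.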

See cell 17 for NYC in the two Jupyter notebooks here: