CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

A question about missing FIPS in US #2287

Open cpyic opened 4 years ago

cpyic commented 4 years ago

I want to ask whether the four entries without FIPS codes are mutually exclusive with their corresponding FIPS-coded counties.

In "time_series_covid19_confirmed_US.csv" there are four entries without a FIPS code:

  1. Dukes and Nantucket: [Dukes County (25007) and Nantucket County (25019)]
  2. Kansas City: [composed of Jackson County (29095), Clay (29047), Platte (29165), Cass (29037)]
  3. FCI Milan in Michigan: [address in Washtenaw County (26161)]
  4. MDOC (Michigan Department of Corrections): [address in Jackson County (26075)]
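
For anyone who wants to reproduce this list, here is a minimal sketch that picks out the rows whose FIPS field is empty. It assumes the file has `FIPS`, `Admin2`, and `Province_State` columns plus one column per date (e.g. `4/18/20`). The inline sample is only a cut-down stand-in built from the 4/18 counts mentioned in this thread, and `rows_without_fips` is a hypothetical helper name.

```python
import csv
import io

# Illustrative sample only: a cut-down stand-in for
# time_series_covid19_confirmed_US.csv, with counts taken from this thread.
SAMPLE = """FIPS,Admin2,Province_State,4/18/20
25007,Dukes,Massachusetts,0
25019,Nantucket,Massachusetts,0
,Dukes and Nantucket,Massachusetts,23
,Kansas City,Missouri,412
"""

def rows_without_fips(csv_text):
    """Return the rows whose FIPS field is missing or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if not (row.get("FIPS") or "").strip()]

missing = rows_without_fips(SAMPLE)
for row in missing:
    print(row["Admin2"], row["Province_State"], row["4/18/20"])
```

To run it against the real file, read the file's text and pass that in place of `SAMPLE`.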

For 1., numbers for the separate Dukes and Nantucket counties were 0 in the 4/18 update, but "Dukes and Nantucket" reported 23. This seemed fine as they were mutually exclusive and would not be counted twice when summed.

For 2., according to Wikipedia, Kansas City spans the four counties listed above. But each of the four counties also reported its own numbers (268, 53, 27, 56), which add up to 404, very close to the 412 reported for Kansas City on 4/18. Does that mean Kansas City is counted twice when summing up Missouri? Or are these mutually exclusive?
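
The arithmetic behind that comparison, as a quick sketch using only the 4/18 figures quoted above (the variable names are just for illustration):

```python
# 4/18 counts quoted above for the four counties (order as listed in the post).
county_counts = [268, 53, 27, 56]
kansas_city = 412  # the separate "Kansas City" entry on 4/18

county_sum = sum(county_counts)
print(county_sum)                 # 404
print(kansas_city - county_sum)   # 8

# If the city row and the county rows both go into a state sum, and the city
# cases are also inside the county rows, the overlap is counted twice.
with_city = county_sum + kansas_city
print(with_city)                  # 816
```

The near match (404 vs. 412) is what makes double counting look plausible, but on its own it does not show whether the city row overlaps the county rows.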

Are 3. and 4. handled the same way as 1., so that the FCI and MDOC cases are not added to the counties where the facilities are located?

Many thanks!


jjbenes commented 4 years ago

@cpyic If you want U.S. county data consistently identified by FIPS, consider using data from USAFacts.org. They get data directly from county governments. See the data for Nantucket and Dukes in this screenshot.

Screen Shot 2020-04-19 at 11 53 48 PM

Aside from Kansas City, as you pointed out, there's at least one other difference, and that's NYC. In the JHU database, four of the five NYC borough FIPS entries show zeros, and New York County (Manhattan) carries the sum of all five boroughs, for reasons that I don't quite understand.

Screen Shot 2020-04-20 at 12 03 09 AM
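
A rough way to spot this kind of roll-up in any dataset is to check whether exactly one county in a known group carries a nonzero count. The sketch below is only an illustration: `rolled_up_into` is a hypothetical helper, and the counts are placeholders that mimic the pattern described above, not real figures.

```python
# County FIPS codes for the five NYC boroughs.
NYC_FIPS = {
    "36005": "Bronx",
    "36047": "Kings (Brooklyn)",
    "36061": "New York (Manhattan)",
    "36081": "Queens",
    "36085": "Richmond (Staten Island)",
}

def rolled_up_into(counts, group):
    """If exactly one FIPS in `group` is nonzero, return it; otherwise None.

    A single nonzero member suggests (but does not prove) that the whole
    group is being reported under that one code.
    """
    nonzero = [fips for fips in group if counts.get(fips, 0) > 0]
    return nonzero[0] if len(nonzero) == 1 else None

# Placeholder values, NOT real case counts: they only mimic the pattern
# described above (four zeros, one total).
example_counts = {"36005": 0, "36047": 0, "36061": 1000, "36081": 0, "36085": 0}
carrier = rolled_up_into(example_counts, NYC_FIPS)
print(carrier, NYC_FIPS.get(carrier))
```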

Hope this bit of info helps.

kevinp2 commented 4 years ago

> Want to ask if the four entry without FIPS code were exclusive from their representative FIPS counterparts.

I have run into the same issue when trying to build county-level maps with Qlik and Tableau BI tools. These tools simply do not like having cities and counties in the same map, especially when a city sprawls across multiple counties. I cannot get Kansas City, Missouri to render at all.

If CSSE would confirm that the Kansas City numbers are already contained in its four counties, then I can discard Kansas City as an entity and just plot its counties instead. The virus doesn't pay attention to the lines on the map anyway!

kevinp2 commented 4 years ago

> If CSSE would confirm that the Kansas City numbers are already contained in its four counties, then I can discard Kansas City as an entity and just plot its counties instead. The virus doesn't pay attention to the lines on the map anyway!

So I did the math:

The four counties of Jackson, Clay, Cass and Platte have a total of 710 cases as of May 19.

The entity "Kansas City, Missouri" has a total of 902 cases despite being smaller than the four counties together!

So I have no idea which one is correct. I am inclined to filter out "Kansas City, Missouri", since many commercial mapping tools cannot map it.
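
One way to read those two totals, sketched with only the figures from this thread: if the county rows already contained every Kansas City case, the city total could not exceed the four-county sum. The function name is hypothetical, and the check can only rule containment out; it cannot confirm it.

```python
def city_could_be_contained(city_total, county_totals):
    """True if the city total does not exceed the sum of its counties.

    False means the city row cannot be a pure subset of the county rows as
    reported; True is only consistent with containment, not proof of it.
    """
    return city_total <= sum(county_totals)

# May 19 figures from this thread: 710 is already the four-county sum.
print(city_could_be_contained(902, [710]))   # False
print(902 - 710)                             # 192

# 4/18 figures from earlier in the thread.
print(city_could_be_contained(412, [268, 53, 27, 56]))  # False (412 > 404)
```

Assuming the data are internally consistent, the city row is not simply contained in the county rows on either date, so dropping it may drop cases rather than remove duplicates.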

jjbenes commented 4 years ago

@kevinp2 Do you have to use JHU data? I didn't want to have to filter data, so I used USAFacts.org. I got 50 states and DC. Then I added some counties together to get metro areas. See this page: http://first-principles.ai:5100/compare_states.
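
Adding counties together into metro areas could look something like the sketch below. The metro definition (the four Kansas City counties named earlier in this thread) and the helper name `metro_totals` are assumptions for illustration, not how the linked page is actually built. The counts are the 4/18 figures from the first post, assigned to counties in the order they were listed.

```python
from collections import defaultdict

# Hypothetical metro definition: county FIPS -> metro label.
METRO_BY_FIPS = {
    "29095": "Kansas City area",  # Jackson
    "29047": "Kansas City area",  # Clay
    "29165": "Kansas City area",  # Platte
    "29037": "Kansas City area",  # Cass
}

def metro_totals(county_counts, metro_by_fips):
    """Sum county counts per metro; counties without a metro are skipped."""
    totals = defaultdict(int)
    for fips, count in county_counts.items():
        metro = metro_by_fips.get(fips)
        if metro is not None:
            totals[metro] += count
    return dict(totals)

# 4/18 counts from the first post (county order assumed to match the listing);
# the Dukes County row (25007, 0 cases) shows an unmapped county being skipped.
counts = {"29095": 268, "29047": 53, "29165": 27, "29037": 56, "25007": 0}
print(metro_totals(counts, METRO_BY_FIPS))  # {'Kansas City area': 404}
```

Keying everything by county FIPS keeps the aggregation independent of how any one source names cities or combined entities.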