nytimes / covid-19-data

A repository of data on coronavirus cases and deaths in the U.S.
https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html

Enhancement: per capita should use metro/micro statistical areas, not people per square mileage of county #364

Closed rseymour closed 4 years ago

rseymour commented 4 years ago

Describe the feature you would like to see

Another issue that was opened had tabular data per MSA, but your recent maps have done the cut by density per county. Many counties have one or two cities (or even micropolitan statistical areas) without many people living outside of them. This can easily be seen by diffing your maps with those that the Census Bureau puts out: https://www.census.gov/geographies/reference-maps/2018/geo/cbsa.html

Describe alternatives you've considered

I tweeted about it, but I figured opening an issue is the best way to get your attention. :) Thanks for your work.

albertsun commented 4 years ago

I'm not sure what you mean here. When we show per-100,000 case counts, they are based on the county population. There is no calculation based on density, other than to selectively not shade low-population areas.
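To make the calculation described above concrete, here is a minimal sketch. It is not the NYT's code: the function names and the `MIN_AREA_POPULATION` cutoff are hypothetical, and the thread never states the real threshold or how a sub-area's population is measured. It only illustrates that the rate comes from the whole county's population, while the low-population test decides whether a given area is shaded at all.

```python
from typing import Optional

# Hypothetical cutoff: the thread does not give the real threshold or its units.
MIN_AREA_POPULATION = 1_000


def cases_per_100k(cases: int, county_population: int) -> float:
    """Cases per 100,000 residents, using the whole county's population."""
    if county_population <= 0:
        raise ValueError("county_population must be positive")
    return cases * 100_000 / county_population


def shade_value(cases: int, county_population: int,
                area_population: int) -> Optional[float]:
    """Return the county-wide rate for a populated sub-area.

    Returns None for a sparsely populated sub-area, meaning it is left
    unshaded regardless of how high or low the county's rate is.
    """
    if area_population < MIN_AREA_POPULATION:
        return None
    return cases_per_100k(cases, county_population)
```

Under these assumptions, `cases_per_100k(250, 50_000)` gives `500.0`, and every shaded sub-area of that county carries the same value; an unshaded area returns `None`, which is distinct from a rate of zero.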

rseymour commented 4 years ago

Confirmation bias on my part (my home county, the westernmost in NYS, has been good enough to stay unshaded) led me to completely misread "parts of a county" as "county". My bad. I still think it wouldn't hurt to call the MSAs MSAs instead of "parts of a county", but it's all good. I realized as I looked further west on the map that you were doing something more complex than just dropping counties.

Sorry!

rseymour commented 4 years ago

But this does speak to the issue that coding 'empty space' and 'doing well' the same way can be problematic.

armsp commented 4 years ago

@albertsun @rseymour I believe what we were discussing here about "parts of a county" is this (image: california):

The area within the black polygon is the county. Inside it we have some shaded regions.

I understand that the data is only at the county level, but the shaded region is based on something else: the population density of parts of a county.

What I am having a hard time understanding is what those shaded regions are called on a physical level, i.e. what exactly is the name for "parts of a county"? Are they "Block Groups" or "Census Tracts"? I don't think they are MSAs, because there are very few MSAs compared to counties in the USA. Since it's a map, there must have been a shapefile for it. So the question is: what exactly is the shapefile behind those shaded regions?

armsp commented 4 years ago

@albertsun and other maintainers, I know this has been closed, but it would be really helpful if you could look into it. It should be pretty straightforward, I suppose. I don't think those regions are ZIP Code areas, or are they? If not, then what exactly are those areas?

armsp commented 4 years ago

@albertsun I was really hoping to get some clarity on what that map shows as a physical area. Any hint would be really great. Please do look into it. I don't think it will take long.

It looks like they might be Census Tracts, is that right?

albertsun commented 4 years ago

@armsp I'm curious, what do you need that information for?

It's not ZIP codes or MSAs. The maps are derived from various census geographies based on population, merged into a custom-generated shapefile.