CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

All data for New York City shows in New York County - not distributed across other NYC counties #2212

Open johncdavis200 opened 4 years ago

johncdavis200 commented 4 years ago

Looking at the data in time_series_covid19_confirmed_US.csv and time_series_covid19_deaths_US.csv, I see that all of the data for New York City is being reported as being from New York County, which is the borough of Manhattan. No data is being reported for Richmond County (Staten Island), Kings County (Brooklyn), Queens County (Queens), or Bronx County (Bronx). This is a problem for those of us who want to study the situation in NYC.

I have been getting county-level data from https://usafacts.org/visualizations/coronavirus-covid-19-spread-map/, which initially did the same thing but for many weeks now has been breaking the data out by the NYC counties.

It would be most helpful if you all could do this as well - then I could rely on one source of data rather than two as I am having to continue to do now. Thanks!

trimeta commented 4 years ago

Some more details about this issue:

The county-level daily reports (as found in csse_covid_19_data/csse_covid_19_daily_reports) omit the New York City counties other than "New York" (i.e., Manhattan). However, the time series data (as found in csse_covid_19_data/csse_covid_19_time_series) does have the other four counties, but their numbers are all zeros.
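For anyone who wants to confirm the all-zeros rows in their own copy of the time series, here is a minimal sketch using only the standard library. The column names (FIPS, Admin2, Province_State, then one column per date) are my assumption about the file layout, the counts in SAMPLE are made up, and zero_only_nyc_counties is a hypothetical helper, not something from the repo.

```python
import csv
import io

# FIPS codes for the five NYC counties (boroughs).
NYC_FIPS = {
    "36005": "Bronx",
    "36047": "Kings",
    "36061": "New York",
    "36081": "Queens",
    "36085": "Richmond",
}

# Tiny stand-in for time_series_covid19_confirmed_US.csv.
# Assumption: column names mirror the real file; the counts are invented.
SAMPLE = """FIPS,Admin2,Province_State,4/1/20,4/2/20
36005,Bronx,New York,0,0
36047,Kings,New York,0,0
36061,New York,New York,100,120
36081,Queens,New York,0,0
36085,Richmond,New York,0,0
36001,Albany,New York,5,7
"""


def zero_only_nyc_counties(csv_text):
    """Return the NYC county names whose every date column is zero."""
    reader = csv.DictReader(io.StringIO(csv_text))
    id_cols = {"FIPS", "Admin2", "Province_State"}
    date_cols = [c for c in reader.fieldnames if c not in id_cols]
    empty = []
    for row in reader:
        # Tolerate FIPS stored as a float-like string such as "36005.0".
        fips = row["FIPS"].split(".")[0]
        if fips not in NYC_FIPS:
            continue
        if all(float(row[c]) == 0 for c in date_cols):
            empty.append(NYC_FIPS[fips])
    return empty


print(zero_only_nyc_counties(SAMPLE))
```

With the real file you would read it from disk and widen id_cols to cover whatever non-date columns it actually has.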

What's even more confusing is that the recently-deployed interactive US map (https://coronavirus.jhu.edu/us-map) does have information for all five counties of New York City, separate and distinct. It's not clear where this data is coming from.

mcroebuck commented 4 years ago

Thanks. I also just found this in the data, and luckily found this post. I guess the best thing to do is manually adjust these data to match what the US map dashboard reflects. It is indeed very odd that raw data are obviously not driving the live dashboard.

maps-apps-n commented 4 years ago

Have there been any updates to this issue? I see from the thread above that Brooklyn and Manhattan may be reflected in New York, but the Bronx is still showing as standalone with 0 cases.

LangeJM commented 4 years ago

I was running into the same issue and eventually found a by-county dataset from USAFacts that makes this distinction. Edit: According to this article it seems to be reliable.

kevinp2 commented 4 years ago

Just bumping this to the top. It's very hard to create a map that includes New York City and other counties at the same time because of this issue. Is it possible to break out the time series by the five boroughs (counties)?

dcleblanc commented 4 years ago

@kevinp2 - what they're doing is aggregating New York, Kings, Queens, Richmond, and Bronx. You can get population figures here - https://worldpopulationreview.com/us-counties/ny/.

Where this does screw things up is if you, say, put the state data into a pivot table and aggregate by state: Kings, Queens, and Bronx each have a population that is also aggregated into New York, so you'll double-count a significant number of people.
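A rough sketch of one way to avoid that double count when rolling up by state: skip the population of the four boroughs that is already folded into the New York County row. This assumes the New York row really does carry the citywide population, as described above; the row layout, the population figures, and the state_population helper are all made up for illustration.

```python
from collections import defaultdict

# FIPS codes of the four boroughs assumed to be already counted
# inside the New York County (36061) row.
ABSORBED_INTO_NEW_YORK = {"36005", "36047", "36081", "36085"}


def state_population(rows):
    """Sum Population by state, skipping rows already counted elsewhere."""
    totals = defaultdict(int)
    for row in rows:
        if row["FIPS"] in ABSORBED_INTO_NEW_YORK:
            continue  # already included in the New York County figure
        totals[row["Province_State"]] += row["Population"]
    return dict(totals)


# Invented numbers: New York County holds the citywide total (8000),
# while Kings still lists its own population (2500) as well.
rows = [
    {"FIPS": "36061", "Admin2": "New York", "Province_State": "New York", "Population": 8000},
    {"FIPS": "36047", "Admin2": "Kings", "Province_State": "New York", "Population": 2500},
    {"FIPS": "36001", "Admin2": "Albany", "Province_State": "New York", "Population": 300},
]

naive = sum(r["Population"] for r in rows)   # counts Kings twice
deduped = state_population(rows)["New York"]
print(naive, deduped)
```

The same filter can be applied before building a pivot table, as long as you check first which rows in your extract actually hold the aggregate.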

kevinp2 commented 4 years ago

@dcleblanc, I was able to work around the population problem. But I am building a data extract for import into the BI tools Qlik and Tableau and, by golly, those tools simply do not like mixing cities and counties in the same map.

I think I have a hack workaround: naming all the NYC data "New York", so that it masquerades as New York County, i.e. Manhattan. That way I can at least get the NYC numbers onto the map. But of course, the other 4 boroughs show up as Covid-free.