nytimes / covid-19-data

A repository of data on coronavirus cases and deaths in the U.S.
https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html

Data Issue: #660

dwreck closed this issue 2 years ago

dwreck commented 2 years ago

Describe the issue:

Fuller details

The data being reported for Washtenaw County in Michigan has been way off lately. It used to track rather closely. The "Cases and Deaths by County by Date of Onset of Symptoms and Date of Death" for Michigan can be pulled from here: https://www.michigan.gov/documents/coronavirus/Cases_and_Deaths_by_County_and_by_Date_of_Symptom_Onset_or_by_Date_of_Death2022-03-04_749370_7.xlsx

While that data is "by date of onset", it is not difficult to see that the numbers are way off when you compare batches (weeks or two-week periods).

Over the past week (today is 3/4/22), your data shows increases in cases of 345, 348, 159, 141 for Washtenaw County. The State of Michigan data shows (over the same period) daily increases on the order of roughly 40. If you sum all the increases, your data shows a change of 2664 from 2/20/22 through 3/4/22, while the State of Michigan data shows a cumulative change of only 1304 (including probable cases).

Please have a look. There are a lot of institutions using your data here in Washtenaw County to make policy decisions. It would be a shame if those decisions were made from incorrect data.

lwaananenjones commented 2 years ago

Thank you for writing to us about this. We use data from state and county sources for Washtenaw County, and typically the county is ahead of the state in total cumulative cases. Our measures of "new" cases are calculated as the difference in the cumulative total from day to day.
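To make the calculation concrete, here is a minimal Python sketch of deriving "new" cases as the day-to-day difference in a cumulative total. The function name `daily_new_cases` and the starting total of 1000 are assumptions chosen for illustration; only the four daily increases are taken from the figures quoted in this issue. It is not the code used to produce the published data.

```python
def daily_new_cases(cumulative):
    """Return day-over-day differences of a cumulative case series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Hypothetical cumulative totals for five consecutive days.
# The baseline of 1000 is made up; the steps match the increases cited above.
cumulative_totals = [1000, 1345, 1693, 1852, 1993]

new_cases = daily_new_cases(cumulative_totals)
print(new_cases)  # [345, 348, 159, 141]
```

Because each value is a difference of totals, any case added to the total on a given day counts as "new" on that day, regardless of when the person became ill.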

As you've noted, the "date of onset" is the difference, and it's a more significant difference right now because new cases have decreased so rapidly in recent weeks. Our data includes cases on the date they're added to the state or county total. If health officials receive a report this week about a positive test from two weeks ago, they have the information to backdate it to an earlier date, but it would appear in our data today. The county is showing 22,353 confirmed cases total for 2022 so far, by illness onset date, and our data has 22,174 confirmed cases for 2022, so the counts still align, though they may differ more from week to week right now because of the large drop in cases.
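The effect of counting by report date rather than onset date can be sketched in Python as follows. The case records below are entirely hypothetical and only illustrate the mechanism described above: the two tallies agree in total but place late-reported cases on different days.

```python
from collections import Counter
from datetime import date

# Hypothetical case records as (onset_date, report_date) pairs. Not real data.
cases = [
    (date(2022, 2, 14), date(2022, 2, 16)),
    (date(2022, 2, 15), date(2022, 3, 1)),  # reported about two weeks late
    (date(2022, 2, 16), date(2022, 3, 1)),  # reported about two weeks late
    (date(2022, 2, 28), date(2022, 3, 1)),
]

# Tally the same cases two ways: by illness onset and by date reported.
by_onset = Counter(onset for onset, _ in cases)
by_report = Counter(report for _, report in cases)

print(by_onset[date(2022, 3, 1)])   # 0
print(by_report[date(2022, 3, 1)])  # 3
print(sum(by_onset.values()) == sum(by_report.values()))  # True
```

When daily counts are falling quickly, a backlog of older cases landing on one report date stands out much more against the onset-date series, even though the cumulative totals stay close.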

dwreck commented 2 years ago

Thank you for this in-depth and thoughtful explanation! And thank you for all you are doing!!!