COVID-19 Data V2

This is a major change to the schema of the files as well as the content.

In an effort to provide the most accurate and up-to-date information, we are making the following backwards-incompatible changes as a way to set a new baseline. Users of the data can expect future changes to be incremental and not of the same scope.

Schema changes

Removing unreliable / low coverage fields

numTests - has very low coverage among regions
numActiveCases - is not reported reliably by most regions

New fields

Adding daily new counts for deaths and confirmed (tested positive) cases; daily new counts for recovered will follow shortly
Adding diff counts to show the difference in daily new counts between consecutive days
Adding a 1-week rolling average of the daily new counts, useful for plotting an epi curve
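
As a rough illustration of how the three new kinds of fields relate to each other, here is a minimal Python sketch. It assumes the input is a cumulative count per day, that the first day's new count equals its cumulative total, and that the rolling average is a trailing window that averages whatever days are available early in the series. None of these details are confirmed by this PR, and the function and variable names are hypothetical rather than actual field names in the dataset.

```python
def daily_new(cumulative):
    """Daily new counts from a cumulative series.

    Assumption: the first day's new count equals its cumulative total.
    """
    previous = [0] + cumulative[:-1]
    return [cur - prev for prev, cur in zip(previous, cumulative)]


def diff(daily):
    """Change in daily new counts between consecutive days.

    The first day has no predecessor, so its diff is None.
    """
    return [None] + [cur - prev for prev, cur in zip(daily, daily[1:])]


def rolling_average(daily, window=7):
    """Trailing mean over up to `window` days.

    Assumption: early days average only the days available so far.
    """
    averages = []
    for i in range(len(daily)):
        chunk = daily[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical cumulative confirmed counts for eight consecutive days.
cumulative_confirmed = [1, 3, 6, 10, 15, 21, 28, 36]

new_cases = daily_new(cumulative_confirmed)
new_cases_diff = diff(new_cases)
new_cases_avg = rolling_average(new_cases)
```

The rolling average smooths out day-of-week reporting artifacts, which is why it tends to be the more useful series for an epi curve than the raw daily new counts.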

Simplifying Geographic Hierarchy

Removing the use of some in-dispute fields such as countyLocation, stateLocation and countryLocation and replacing them with a parentId field. In this new schema a region is also allowed to have multiple parents, which helps handle shared zipcodes and areas in contention.
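
To show how a consumer might walk a hierarchy in which a region can have more than one parent, here is a minimal Python sketch. It assumes parentId can be read as a list of parent IDs. The region IDs and the in-memory dictionary layout are hypothetical and are not taken from the actual files.

```python
from collections import deque

# Hypothetical records keyed by plain-text ID; parentId is assumed to be a list.
regions = {
    "us": {"parentId": []},
    "us-ny": {"parentId": ["us"]},
    "county-a": {"parentId": ["us-ny"]},
    "county-b": {"parentId": ["us-ny"]},
    # A zipcode shared by two counties has two parents.
    "zip-shared": {"parentId": ["county-a", "county-b"]},
}


def ancestors(region_id, regions):
    """Return every ancestor of region_id, nearest first, without duplicates."""
    found = []
    queue = deque(regions[region_id]["parentId"])
    while queue:
        parent = queue.popleft()
        if parent in found:
            continue  # already reached through another parent
        found.append(parent)
        queue.extend(regions.get(parent, {}).get("parentId", []))
    return found


shared_zip_ancestors = ancestors("zip-shared", regions)
```

Because a region may be reachable through several parents, the traversal tracks which ancestors it has already visited; a consumer that rolls counts up the hierarchy would need similar care to avoid double counting.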

Replacing hex IDs with plain-text versions

Making the dataset more readable through the use of plain text unique IDs

Increased coverage and support for smaller subregions

Added zipcode-level stats for several US cities
Added borough-level stats for the UK and several other countries
Added Russian Oblasts
