Table of contents:
- About, Used by
- Visualizations
- Datasets: JHU, NY Times, ECDC, example
- Attribution, Licensing
This repository contains various datasets related to COVID-19 (JHU CSSE, NY Times, ECDC):
- an md5 file that can be used as an index is available at https://data.volution.ro/ciprian/f8ae5c63a7cccce956f5a634a79a293e/imports/.md5;
- an md5 file that can be used as an index is available at https://data.volution.ro/ciprian/f8ae5c63a7cccce956f5a634a79a293e/exports/.md5;

Some visualizations based on the derived datasets are also available; an md5 file that can be used as an index is available at https://data.volution.ro/ciprian/f8ae5c63a7cccce956f5a634a79a293e/exports/.md5.

None of these datasets were collected by me; however, I have re-processed, re-formatted and augmented them for easier manipulation.
As with anything on the Internet these days, I take no responsibility for anything. :)
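
The md5 index files mentioned above can be used to check downloaded files. The following is a minimal sketch, assuming the index follows the common `md5sum` output format (hex digest, whitespace, then a relative path); the actual layout of the `.md5` files is not described in this README, and `parse_md5_index` and `verify` are hypothetical helper names.

```python
import hashlib


def parse_md5_index(text):
    """Parse md5sum-style lines into a {path: digest} mapping.

    Assumption: each line looks like "<32 hex chars>  <path>".
    """
    index = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, _, path = line.partition(" ")
        # md5sum marks binary mode with a leading "*" before the path.
        path = path.strip().lstrip("*")
        if len(digest) == 32 and path:
            index[path] = digest.lower()
    return index


def verify(data, expected_digest):
    """Return True when the MD5 of `data` (bytes) matches the indexed digest."""
    return hashlib.md5(data).hexdigest() == expected_digest.lower()
```

With an index parsed this way, a downloaded file can be read as bytes and passed to `verify` together with the digest stored under its path.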
I have created a few groups of countries / regions, based on the derived datasets, and for each one I've plotted all the available metrics:
- global -- JHU -- first 25 world-wide countries ordered by confirmed cases;
- global-major -- JHU -- world-wide countries with more than 4M confirmed cases;
- global-medium -- JHU -- world-wide countries with more than 1M confirmed cases, but less than 4M;
- global-minor -- JHU -- world-wide countries with more than 500K confirmed cases, but less than 1M, limited to 20 countries;
- europe -- ECDC or JHU -- first 25 European countries ordered by confirmed cases;
- europe-major -- ECDC or JHU -- European countries with more than 1.5M confirmed cases;
- europe-medium -- ECDC or JHU -- European countries with more than 500K confirmed cases, but less than 1.5M;
- europe-minor -- ECDC or JHU -- European countries with more than 100K confirmed cases, but less than 500K, limited to 20 countries;
- us -- NY Times -- first 25 US states ordered by confirmed cases;
- us-major -- NY Times -- US states with more than 1M confirmed cases;
- us-medium -- NY Times -- US states with more than 500K confirmed cases, but less than 1M;
- us-minor -- NY Times -- US states with more than 100K confirmed cases, but less than 500K, limited to 20 states;
- world -- JHU -- overall aggregated values;
- continents -- JHU -- aggregated countries grouped by continents;
- subcontinents -- JHU -- aggregated countries grouped by sub-continents;
- romania -- JHU or ECDC -- Romania, Hungary, Bulgaria and a few other countries for comparison;
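
The major / medium / minor groups above amount to bucketing locations by their confirmed cases. Below is a rough sketch of that bucketing using the global thresholds; the function name, the input shape (a plain mapping from location to confirmed cases) and the handling of values exactly equal to a threshold are assumptions made for illustration, not the code behind the actual plots.

```python
def group_by_confirmed(confirmed, major=4_000_000, medium=1_000_000,
                       minor=500_000, minor_limit=20):
    """Bucket locations by confirmed cases (defaults follow the global-* groups).

    `confirmed` maps a location name to its confirmed-case count.
    Assumption: a value exactly equal to a threshold falls into the lower bucket.
    """
    # Largest counts first, so each bucket is already ordered by confirmed cases.
    ordered = sorted(confirmed.items(), key=lambda item: item[1], reverse=True)
    groups = {"major": [], "medium": [], "minor": []}
    for name, cases in ordered:
        if cases > major:
            groups["major"].append(name)
        elif cases > medium:
            groups["medium"].append(name)
        elif cases > minor:
            groups["minor"].append(name)
    # The minor group is capped (20 locations in the list above).
    groups["minor"] = groups["minor"][:minor_limit]
    return groups
```

Passing `major=1_500_000, medium=500_000, minor=100_000` would mirror the europe-* thresholds, and `major=1_000_000, medium=500_000, minor=100_000` the us-* ones.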
JHU datasets:
- daily dataset (includes world countries and US counties, plus higher level aggregates);
- series dataset (includes world countries and US states, plus higher level aggregates);
- the files are available compressed (.zst extension), or with gzip compression (just replace .zst with .gz);
- day_index_* means how many days have passed for that country since there were at least that many confirmed cases;
- absolute_pop100k means the absolute metric per 100k people in that country / region;
- relative_* means the percentage of that metric relative to the number of confirmed cases for that same day;
- delta_* means the delta of that metric compared to the same metric for the previous day;
- *_infected means the number of "active" cases (i.e. infected := confirmed - recovered - deaths);

NY Times datasets:
- us-counties dataset (includes only US counties, plus higher level aggregates);
- us-states dataset (includes only US states, plus higher level aggregates);
- the files are available compressed (.zst extension), or with gzip compression (just replace .zst with .gz);

ECDC datasets:
- europe dataset (includes EU/EEA countries) (this dataset is currently maintained by ECDC, and contains data from 2021-03-01);
- worldwide dataset (includes world countries, plus higher level aggregates) (this dataset is no longer maintained by ECDC, and contains data until 2020-12-14);
- the files are available compressed (.zst extension), or with gzip compression (just replace .zst with .gz);
- the europe dataset is available at ecdc.europa.eu;
- the worldwide dataset is available at ecdc.europa.eu;

Example:
- the values.json file;
- the status.json file;

values.json example extract:

    [
        ...
        {
            "dataset": "jhu/daily",
            "location": {
                "key": "fb583ceb1834efe5f595d1d7ac84a7f1",
                "type": "total-country",
                "label": "Italy",
                "country": "Italy",
                "country_code": "IT",
                "country_latlong": [
                    42.83333333,
                    12.83333333
                ],
                "province": null,
                "region": "Europe",
                "subregion": "Southern Europe",
                "administrative": null,
                "latlong": [
                    42.83333333,
                    12.83333333
                ]
            },
            "date": {
                "year": 2020,
                "month": 4,
                "day": 1,
                "date": "2020-04-01",
                "timestamp": 1585702800,
                "index": 71
            },
            "values": {
                "absolute": {
                    "confirmed": 110574,
                    "deaths": 13155,
                    "recovered": 16847,
                    "infected": 80572
                },
                "delta": {
                    "confirmed": 4782,
                    "recovered": 1118,
                    "deaths": 727,
                    "infected": 2937
                },
                "delta_pct": {
                    "confirmed": 4.52019056261343,
                    "recovered": 7.107889884925933,
                    "infected": 3.7830875249565272,
                    "deaths": 5.8496942388155775
                },
                "peak_pct": {
                    "confirmed": 80.68979481641469,
                    "recovered": 88.23993685872139,
                    "deaths": 88.9405431857108,
                    "infected": 67.14677640603567
                },
                "relative": {
                    "deaths": 11.897010147050842,
                    "recovered": 15.23595058512851,
                    "infected": 72.86703926782064
                },
                "absolute_pop1k": {
                    "confirmed": 1.771943724385206,
                    "recovered": 0.26997247024361576,
                    "deaths": 0.2108083246901386,
                    "infected": 1.2911629294514517
                },
                "absolute_pop10k": {
                    "confirmed": 17.71943724385206,
                    "recovered": 2.6997247024361575,
                    "deaths": 2.108083246901386,
                    "infected": 12.911629294514517
                },
                "absolute_pop100k": {
                    "confirmed": 177.19437243852062,
                    "recovered": 26.997247024361577,
                    "deaths": 21.08083246901386,
                    "infected": 129.11629294514518
                }
            },
            "factbook": {
                "population": 62402659,
                "median_age": 46.5,
                "death_rate": 10.7,
                "area": 301340
            },
            "data_key": "fc397cfe886db71b40d2baf78a4827c5",
            "day_index_1": 62,
            "day_index_10": 41,
            "day_index_100": 39,
            "day_index_1k": 33,
            "day_index_10k": 23,
            "day_index_peak_confirmed": 8,
            "day_index_peak_deaths": 5,
            "day_index_peak": 6
        }
        ...
    ]
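
The derived values in the extract above can be reproduced from the absolute counts and the population. The sketch below follows the definitions given for `*_infected`, `relative_*`, `absolute_pop100k` and `delta_*`; the `delta_pct` formula (delta divided by the previous day's value) is inferred from the numbers in the extract rather than stated in this README, and the function name and return keys are made up for illustration.

```python
def derive_metrics(confirmed, recovered, deaths, population, previous_confirmed):
    """Recompute a few derived metrics for one location and one day."""
    # infected := confirmed - recovered - deaths ("active" cases)
    infected = confirmed - recovered - deaths
    delta_confirmed = confirmed - previous_confirmed
    return {
        "infected": infected,
        # relative_*: percentage of the metric relative to confirmed cases
        "relative_deaths": 100.0 * deaths / confirmed,
        "relative_recovered": 100.0 * recovered / confirmed,
        "relative_infected": 100.0 * infected / confirmed,
        # absolute_pop100k: absolute metric per 100k people
        "confirmed_pop100k": 100_000.0 * confirmed / population,
        # delta_*: change compared to the previous day
        "delta_confirmed": delta_confirmed,
        # Assumption: delta_pct is the delta relative to the previous day's value.
        "delta_pct_confirmed": 100.0 * delta_confirmed / previous_confirmed,
    }


# Italy on 2020-04-01, using the numbers from the extract; the previous day's
# confirmed count is the absolute value minus the delta (110574 - 4782).
italy = derive_metrics(
    confirmed=110574, recovered=16847, deaths=13155,
    population=62402659, previous_confirmed=110574 - 4782,
)
```

The results can be compared against the `values` object of the extract, allowing for small floating-point differences.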
status.json example extract:

    {
        ...
        "countries": {
            ...
            "Italy": {
                "dataset": "jhu/daily",
                "location": {
                    "label": "Italy",
                    "type": "total-country",
                    "country_code": "IT",
                    "country": "Italy",
                    "province": null,
                    "administrative": null,
                    "latlong": [
                        42.83333333,
                        12.83333333
                    ]
                },
                "date": "2020-04-01",
                "day_index": {
                    "confirmed_1": 62,
                    "confirmed_10": 41,
                    "confirmed_100": 39,
                    "confirmed_1k": 33,
                    "confirmed_10k": 23,
                    "peak": 6,
                    "peak_confirmed": 8,
                    "peak_deaths": 5
                },
                "values": {
                    "absolute": {
                        "confirmed": 110574,
                        "deaths": 13155,
                        "recovered": 16847,
                        "infected": 80572
                    },
                    "absolute_pop100k": {
                        "confirmed": 177.19437243852062,
                        "recovered": 26.997247024361577,
                        "deaths": 21.08083246901386,
                        "infected": 129.11629294514518
                    },
                    "delta": {
                        "confirmed": 4782,
                        "recovered": 1118,
                        "deaths": 727,
                        "infected": 2937
                    },
                    "relative": {
                        "deaths": 11.897010147050842,
                        "recovered": 15.23595058512851,
                        "infected": 72.86703926782064
                    },
                    "peak_pct": {
                        "confirmed": 80.68979481641469,
                        "recovered": 88.23993685872139,
                        "deaths": 88.9405431857108,
                        "infected": 67.14677640603567
                    }
                },
                "factbook": {
                    "population": 62402659,
                    "median_age": 46.5,
                    "death_rate": 10.7,
                    "area": 301340
                }
            }
            ...
        }
        ...
    }
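
The day index entries in the extract above count the days since a location first reached a given number of confirmed cases. Here is a small sketch of that computation over a cumulative series; the function name, the input shape and the sample numbers are hypothetical, and counting from 1 on the first day the threshold is reached is an assumption, since the README does not state whether the count starts at 0 or 1.

```python
from datetime import date


def day_index(series, threshold, on_day):
    """Days since `threshold` confirmed cases were first reached, as of `on_day`.

    `series` is a list of (date, cumulative_confirmed) pairs sorted by date.
    Returns None when the threshold was not reached by `on_day`.
    Assumption: the first day at or above the threshold has index 1.
    """
    first_day = None
    for day, confirmed in series:
        if confirmed >= threshold:
            first_day = day
            break
    if first_day is None or on_day < first_day:
        return None
    return (on_day - first_day).days + 1


# Made-up cumulative counts, for illustration only.
sample = [
    (date(2020, 3, 1), 5),
    (date(2020, 3, 2), 12),
    (date(2020, 3, 3), 150),
]
index_10 = day_index(sample, 10, date(2020, 3, 3))
```

Here `index_10` is 2, because the sample series first reaches 10 confirmed cases on 2020-03-02.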
If you use any of these derived datasets, please attribute both the original dataset and my derived dataset.
Choose (and adapt if necessary) one (or more) of the following snippets depending on which derived dataset you are using:

    based on original data from JHU CSSE (https://github.com/CSSEGISandData/COVID-19),
    as processed and augmented at https://github.com/cipriancraciun/covid19-datasets

    based on original data from ECDC (https://www.ecdc.europa.eu/),
    as processed and augmented at https://github.com/cipriancraciun/covid19-datasets

    based on original data from "The New York Times" (https://github.com/nytimes/covid-19-data),
    as processed and augmented at https://github.com/cipriancraciun/covid19-datasets