cipriancraciun / covid19-datasets

COVID-19 derived and augmented datasets (based on JHU, NY Times, ECDC) exported as JSON, TSV, SQL, SQLite DB (plus visualizations)
https://scratchpad.volution.ro/ciprian/eedf5eb117ec363ca4f88492b48dbcd3/

COVID-19 derived datasets (JHU, NY Times, ECDC)

About

This repository contains various datasets related to COVID-19 (JHU CSSE, NY Times, ECDC):

Also some visualizations based on the derived datasets are available at:

I did not collect any of these datasets myself; however, I have re-processed, re-formatted, and augmented them for easier manipulation.

Used by

Disclaimer

As with anything on the Internet these days, I take no responsibility for anything. :)

Visualizations

I have created a few groups of countries / regions, based on the derived datasets, and for each one I've plotted all the available metrics:

[Plots: absolute_pop100k--confirmed, delta--confirmed, delta--deaths, peak--confirmed, peak--deaths]

Dataset sources

JHU CSSE COVID-19 dataset

NY Times COVID-19 dataset

ECDC COVID-19 dataset

Dataset example

values.json example extract

[
  ...

  {
    "dataset": "jhu/daily",
    "location": {
      "key": "fb583ceb1834efe5f595d1d7ac84a7f1",
      "type": "total-country",
      "label": "Italy",
      "country": "Italy",
      "country_code": "IT",
      "country_latlong": [
        42.83333333,
        12.83333333
      ],
      "province": null,
      "region": "Europe",
      "subregion": "Southern Europe",
      "administrative": null,
      "latlong": [
        42.83333333,
        12.83333333
      ]
    },
    "date": {
      "year": 2020,
      "month": 4,
      "day": 1,
      "date": "2020-04-01",
      "timestamp": 1585702800,
      "index": 71
    },
    "values": {
      "absolute": {
        "confirmed": 110574,
        "deaths": 13155,
        "recovered": 16847,
        "infected": 80572
      },
      "delta": {
        "confirmed": 4782,
        "recovered": 1118,
        "deaths": 727,
        "infected": 2937
      },
      "delta_pct": {
        "confirmed": 4.52019056261343,
        "recovered": 7.107889884925933,
        "infected": 3.7830875249565272,
        "deaths": 5.8496942388155775
      },
      "peak_pct": {
        "confirmed": 80.68979481641469,
        "recovered": 88.23993685872139,
        "deaths": 88.9405431857108,
        "infected": 67.14677640603567
      },
      "relative": {
        "deaths": 11.897010147050842,
        "recovered": 15.23595058512851,
        "infected": 72.86703926782064
      },
      "absolute_pop1k": {
        "confirmed": 1.771943724385206,
        "recovered": 0.26997247024361576,
        "deaths": 0.2108083246901386,
        "infected": 1.2911629294514517
      },
      "absolute_pop10k": {
        "confirmed": 17.71943724385206,
        "recovered": 2.6997247024361575,
        "deaths": 2.108083246901386,
        "infected": 12.911629294514517
      },
      "absolute_pop100k": {
        "confirmed": 177.19437243852062,
        "recovered": 26.997247024361577,
        "deaths": 21.08083246901386,
        "infected": 129.11629294514518
      }
    },
    "factbook": {
      "population": 62402659,
      "median_age": 46.5,
      "death_rate": 10.7,
      "area": 301340
    },
    "data_key": "fc397cfe886db71b40d2baf78a4827c5",
    "day_index_1": 62,
    "day_index_10": 41,
    "day_index_100": 39,
    "day_index_1k": 33,
    "day_index_10k": 23,
    "day_index_peak_confirmed": 8,
    "day_index_peak_deaths": 5,
    "day_index_peak": 6
  }

  ...
]
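
The derived fields in this record appear to follow simple arithmetic on the absolute counts and the factbook population. The Python sketch below recomputes a few of them from the Italy values in the extract. The formulas are inferred from the numbers shown, not taken from the processing scripts, and the function name `derive` is hypothetical.

```python
# Values copied from the values.json extract (Italy, 2020-04-01).
absolute = {"confirmed": 110574, "deaths": 13155, "recovered": 16847}
delta_confirmed = 4782   # values.delta.confirmed
population = 62402659    # factbook.population


def derive(absolute, delta_confirmed, population):
    """Recompute a few derived fields (hypothetical helper; formulas are inferred)."""
    confirmed = absolute["confirmed"]
    # "infected" matches confirmed minus deaths minus recovered.
    infected = confirmed - absolute["deaths"] - absolute["recovered"]
    return {
        "infected": infected,
        # "relative" values match a percentage of the confirmed count.
        "relative_deaths": 100.0 * absolute["deaths"] / confirmed,
        # "delta_pct" matches the delta as a percentage of the previous day's count.
        "delta_pct_confirmed": 100.0 * delta_confirmed / (confirmed - delta_confirmed),
        # "absolute_pop100k" matches the count per 100,000 inhabitants.
        "absolute_pop100k_confirmed": 100_000.0 * confirmed / population,
    }


derived = derive(absolute, delta_confirmed, population)
for name, value in derived.items():
    print(name, value)
```

The results agree with the `infected`, `relative.deaths`, `delta_pct.confirmed`, and `absolute_pop100k.confirmed` values in the extract, which suggests (but does not prove) that the other per-population and percentage fields are computed the same way.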

status.json example extract

{
  ...
  "countries": {
    ...

    "Italy": {
      "dataset": "jhu/daily",
      "location": {
        "label": "Italy",
        "type": "total-country",
        "country_code": "IT",
        "country": "Italy",
        "province": null,
        "administrative": null,
        "latlong": [
          42.83333333,
          12.83333333
        ]
      },
      "date": "2020-04-01",
      "day_index": {
        "confirmed_1": 62,
        "confirmed_10": 41,
        "confirmed_100": 39,
        "confirmed_1k": 33,
        "confirmed_10k": 23,
        "peak": 6,
        "peak_confirmed": 8,
        "peak_deaths": 5
      },
      "values": {
        "absolute": {
          "confirmed": 110574,
          "deaths": 13155,
          "recovered": 16847,
          "infected": 80572
        },
        "absolute_pop100k": {
          "confirmed": 177.19437243852062,
          "recovered": 26.997247024361577,
          "deaths": 21.08083246901386,
          "infected": 129.11629294514518
        },
        "delta": {
          "confirmed": 4782,
          "recovered": 1118,
          "deaths": 727,
          "infected": 2937
        },
        "relative": {
          "deaths": 11.897010147050842,
          "recovered": 15.23595058512851,
          "infected": 72.86703926782064
        },
        "peak_pct": {
          "confirmed": 80.68979481641469,
          "recovered": 88.23993685872139,
          "deaths": 88.9405431857108,
          "infected": 67.14677640603567
        }
      },
      "factbook": {
        "population": 62402659,
        "median_age": 46.5,
        "death_rate": 10.7,
        "area": 301340
      }
    }

    ...
  }
  ...
}
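
Because status.json keys the latest record of each country by name, a per-country lookup needs no scanning. The Python sketch below shows one way to read it, using a trimmed inline stand-in that contains only keys visible in the extract. The helper name `country_summary` and the output key names are hypothetical; with the real file, the string would be replaced by the file's contents.

```python
import json

# Trimmed stand-in for status.json, limited to keys shown in the extract.
sample = """
{
  "countries": {
    "Italy": {
      "dataset": "jhu/daily",
      "date": "2020-04-01",
      "day_index": {"confirmed_100": 39, "peak": 6},
      "values": {
        "absolute": {"confirmed": 110574, "deaths": 13155},
        "absolute_pop100k": {"deaths": 21.08083246901386}
      }
    }
  }
}
"""


def country_summary(status, country):
    """Pick a few headline numbers for one country (hypothetical helper)."""
    entry = status["countries"][country]
    values = entry["values"]
    return {
        "date": entry["date"],
        "confirmed": values["absolute"]["confirmed"],
        "deaths_per_100k": values["absolute_pop100k"]["deaths"],
        "day_index_confirmed_100": entry["day_index"]["confirmed_100"],
    }


status = json.loads(sample)
summary = country_summary(status, "Italy")
print(summary)
```

A country that is absent from the `countries` object raises `KeyError` here; callers that expect missing countries may prefer `status["countries"].get(country)` and an explicit check.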

Attribution

If you use any of these derived datasets, please attribute both the original dataset and my derived dataset.

Choose (and adapt if necessary) one (or more) of the following snippets depending on which derived dataset you are using:

based on original data from JHU CSSE (https://github.com/CSSEGISandData/COVID-19),
as processed and augmented at https://github.com/cipriancraciun/covid19-datasets

based on original data from ECDC (https://www.ecdc.europa.eu/),
as processed and augmented at https://github.com/cipriancraciun/covid19-datasets

based on original data from "The New York Times" (https://github.com/nytimes/covid-19-data),
as processed and augmented at https://github.com/cipriancraciun/covid19-datasets

Licensing