dathere / covid19-time-series-utilities

several utilities to help wrangle COVID-19 data into a time-series format
Creative Commons Attribution Share Alike 4.0 International
covid-19 opendata opensource time-series

COVID-19 - time-series utilities

This repo contains several utilities for wrangling COVID-19 data from the Johns Hopkins University (JHU) COVID-19 repository.

NOTE: The utilities currently do not work because of the new file formats. They will be updated shortly to work with the revised formats.

Requirements

Cloning

A note on cloning this repo, since the COVID19 directory is a git submodule:
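
One common way to handle a submodule at clone time is sketched below. It is not taken from the original instructions: `clone_with_submodules` is a hypothetical helper name, and the repository URL is left as an argument rather than assumed.

```shell
# Hypothetical helper: clone a repository and fetch its submodules in one step.
#   $1: repository URL (required)
#   $2: target directory (optional)
clone_with_submodules() {
  git clone --recurse-submodules "$1" ${2:+"$2"}
}

# If the repo was already cloned without --recurse-submodules,
# running this from the repository root fetches the COVID19 submodule:
#   git submodule update --init
```

Calling `clone_with_submodules <repo-url>` should leave the COVID19 directory populated instead of empty.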

Content

The files in this directory and how they're used:

Using the Timescale covid19-ingest script

  1. Create a TimescaleDB instance: download it or sign up
  2. Create a database named covid_19 and an application user named covid19_user

    psql
    create database covid_19;
    create user covid19_user WITH PASSWORD 'your-password-here';
    alter database covid_19 OWNER TO covid19_user;
    \quit

  3. Run schema.sql as covid19_user, because VACUUM and ANALYZE require owner privileges

    psql -U covid19_user -h <the.server.hostname> -f schema.sql covid_19

  4. Install csvkit

    • Ubuntu: sudo apt-get install csvkit
    • macOS: using Homebrew, run brew install csvkit
  5. Using a text editor, set the values of the PGHOST, PGUSER and PGPASSWORD environment variables in covid-19_ingest.sh

  6. Run the script

    bash covid-19_ingest.sh

  7. (OPTIONAL) Add the shell script to crontab to run daily

  8. Slice and dice the data using the full power of PostgreSQL along with Timescale's time-series capabilities!
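
For the optional daily run, a crontab entry along the following lines would work. This is only a sketch: the install path `/opt/covid19-time-series-utilities`, the 06:00 schedule, and the log file name are assumptions to adapt, and the install command is left commented out so nothing changes until you opt in.

```shell
# Assumed location of the cloned repo; change to wherever covid-19_ingest.sh lives.
REPO_DIR="/opt/covid19-time-series-utilities"

# Run the ingest script every day at 06:00 (assumed schedule) and append its output to a log.
CRON_LINE="0 6 * * * cd $REPO_DIR && bash covid-19_ingest.sh >> $REPO_DIR/ingest.log 2>&1"

# Print the entry for review.
echo "$CRON_LINE"

# To install it, append the entry to the current user's crontab:
#   (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
```

The `cd` keeps any relative paths inside the script working, since cron starts jobs from the user's home directory.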

Using COVIDrefine

NOTE: Due to the changing file format of JHU's daily report data, covid-refine is recommended over covid-19_ingest.sh. COVIDrefine has the added benefit of producing fully normalized, non-sparse, geo-enriched data.

See the detailed README.

If you just want to download the COVIDrefine data, the latest version can be found here.

Using docker-compose

  1. Remember to initialize the submodule: run git submodule init, followed by git submodule update to fetch its contents
  2. Run docker-compose build
  3. Run docker-compose up
  4. That's all. You can go to Swagger or PostgREST
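
The steps above can be chained as in the sketch below. `start_stack` is a hypothetical helper name, and the function assumes it is called from the repository root where the compose file lives. It uses `git submodule update --init`, which registers the submodule and fetches its contents in one command.

```shell
# Hypothetical helper wrapping the docker-compose steps; call it from the repository root.
start_stack() {
  # Register the COVID19 submodule and download its contents.
  git submodule update --init || return 1
  # Build the images defined in the compose file.
  docker-compose build || return 1
  # Start the services in the foreground; add -d to run them in the background.
  docker-compose up
}
```

Each step stops the sequence on failure, so a missing submodule or a failed build does not lead to a half-started stack.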

NOTES

TODO


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
