Open jqnatividad opened 4 years ago
If this goes in I'll close my feature request. Seems to nail what I was looking for.
Hi @jqnatividad, I've built some analysis using data from the NYC dataset, but now that the tables have changed it no longer seems to be updated. Is a fix in the works, or should I look for another source? Thank you very much from Italy!!!
First off, thanks JHU for exposing the data behind the dashboard. As an open data advocate, JHU's example should be encouraged and celebrated!
However, the data needs a little wrangling to be more useful for time-series analysis.
Since this is open data and open source, I decided to scratch an itch and pull together these utilities: :)
https://github.com/dathere/covid19-time-series-utilities
Currently, there are two utilities.
covid-19_ingest.sh
: script that converts the JHU COVID-19 daily-report data to a time-series database using TimescaleDB.

covid-refine
: OpenRefine automation script that converts JHU COVID-19 time-series data into a normalized, enriched format and uploads it to TimescaleDB.

Here are some examples of the processed data:
A non-sparse, time-series version of JHU's time-series data with daily counts, not just running totals. https://data.beta.nyc/dataset/covid-19-time-series/resource/3d4caf81-7ec0-4112-9700-62ca7364d6bf
A location lookup table that has been geocoded to add continent and, for the US, locality, county and state, by reverse geocoding the lat/long in the original feed. https://github.com/dathere/covid19-time-series-utilities/blob/master/covid19-refine/workdir/location-lookup/location-lookup.csv
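To make "non-sparse" and "daily counts, not just running totals" concrete, here is a minimal Python sketch of that transformation. It is only an illustration, not the code in the utilities above (those use a shell script, OpenRefine and TimescaleDB). The function name `daily_counts`, the input shape, and the sample numbers are all hypothetical, and it assumes that a missing day means the running total did not change.

```python
from datetime import date, timedelta


def daily_counts(cumulative):
    """Convert {location: {date: running_total}} into long-format rows.

    Every calendar day between a location's first and last report gets a row
    (non-sparse), and each row carries the daily change as well as the total.
    """
    rows = []
    for location, totals in sorted(cumulative.items()):
        if not totals:
            continue
        day, last_day = min(totals), max(totals)
        previous = 0  # assumption: the count before the first report is zero
        while day <= last_day:
            # Assumption: a missing day keeps the previous running total.
            total = totals.get(day, previous)
            rows.append({
                "location": location,
                "date": day,
                "total": total,
                "daily": total - previous,
            })
            previous = total
            day += timedelta(days=1)
    return rows


# Hypothetical sample: one location, with 2020-03-03 missing from the feed.
sample = {
    "Example County": {
        date(2020, 3, 1): 1,
        date(2020, 3, 2): 4,
        date(2020, 3, 4): 6,
    }
}

for row in daily_counts(sample):
    print(row["date"].isoformat(), row["total"], row["daily"])
```

Rows in this one-row-per-location-per-day shape are what a time-series database expects, since each row can be inserted directly with its timestamp.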
Finally, here's a blog post on the benefits of normalizing the data and feeding it to a true time-series database.
https://blog.timescale.com/blog/charting-the-spread-of-covid-19-using-timescale/