simonw / covid-19-datasette

Deploys a Datasette instance of COVID-19 data from Johns Hopkins CSSE and the New York Times
https://covid-19.datasettes.com/

Datasource was moved to: time_series_covid19_deaths_global.csv #3

Closed by mullender 4 years ago

mullender commented 4 years ago

The source project rearranged their CSV files and stopped updating the old ones :(

For the country data we need to source time_series_covid19_deaths_global.csv

Not sure where the US state data went.

Thanks a ton for providing this data through datasette!

simonw commented 4 years ago

Thanks for the report! Looking at that now.

simonw commented 4 years ago

Thankfully I'm not affected by changes to that file - I use the data in https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports

BUT... everything is broken right now! It looks like they changed some column headers.

simonw commented 4 years ago

I pushed the fix for #4: https://covid-19.datasettes.com/covid/daily_reports

mullender commented 4 years ago

@simonw thank you very much for updating this so quickly! The datasette reports have been most helpful in tracking the changes over time in different geographies, and in estimating the effects of implementing physical distancing policies.

mullender commented 4 years ago

@simonw it looks like the auto-update of the DB failed to publish? https://github.com/simonw/covid-19-datasette/runs/531356266?check_suite_focus=true

simonw commented 4 years ago

The Cloud Run publish task only runs if the database that was built differs from the database that is currently deployed. Since the task now runs four times a day, sometimes it won't deploy because the data in https://github.com/CSSEGISandData/COVID-19/commits/master/csse_covid_19_data/csse_covid_19_daily_reports hasn't changed in the past six hours.

Here's the code that compares the SHA hash of the deployed version to the SHA hash of the newly built database:

https://github.com/simonw/covid-19-datasette/blob/1aa0a5a4741b143cc4c0a6eaa5e01b81eea58c88/.github/workflows/scheduled.yml#L37-L43

I wrote a bit about how that works here: https://simonwillison.net/2020/Jan/21/github-actions-cloud-run/
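
For anyone who wants the idea without opening the workflow file, here is a minimal Python sketch of the comparison described above. It is not the code from scheduled.yml. The function names `sha256_of_file` and `should_deploy` are hypothetical, and it assumes SHA-256 is the hash in use and that the deployed version's hash has already been fetched as a hex string.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks.

    Hypothetical helper: stands in for however the build step
    obtains the hash of the freshly built database file.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so a large database is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_deploy(built_hash, deployed_hash):
    """Deploy only when the built database differs from the deployed one.

    A missing deployed hash (None) is treated as "nothing deployed yet",
    which is an assumption of this sketch.
    """
    return deployed_hash is None or built_hash != deployed_hash
```

With helpers like these, a scheduled job could call `should_deploy(sha256_of_file("covid.db"), deployed_hash)` and skip the publish step when it returns `False`, which matches the behaviour seen when the upstream data has not changed. The filename `covid.db` is also just an illustration.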

mullender commented 4 years ago

That is interesting syntax; thanks for writing a blog post about it. In this case it turns out the source data itself is incorrect, which is not something you could do anything about.