covidatlas / coronadatascraper

COVID-19 Coronavirus data scraped from government and curated data sources.
https://coronadatascraper.com
BSD 2-Clause "Simplified" License

Add scraper for San Francisco #1011

Closed: 1ec5 closed this issue 4 years ago

1ec5 commented 4 years ago

Location name

San Francisco County, CA, USA

Source URL

https://data.sfgov.org/COVID-19/COVID-19-Cases-Summarized-by-Date-Transmission-and/tvq9-ec9w/
https://data.sfgov.org/COVID-19/COVID-19-Hospitalizations/nxjg-bhem

Notes/comments

Like the Santa Clara County Public Health Department (#965), the San Francisco Department of Public Health is now reporting a full time series by date of sample collection rather than by reporting date. This means the counts for many past days can be revised retroactively on any given day. They’re reporting historical data for all the counts, not just confirmed cases, though the hospitalization counts seem to go back only a certain number of weeks.
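To illustrate what retroactive updates imply for a scraper, here is a minimal sketch in Python (used only for illustration; it is not the project's code). The function name `merge_series` and all the numbers are hypothetical. The idea is to overwrite stored per-date counts with the freshly downloaded series on every run, instead of appending a single new value for today, while keeping older dates that the source no longer returns.

```python
from typing import Dict


def merge_series(stored: Dict[str, int], fresh: Dict[str, int]) -> Dict[str, int]:
    """Merge a freshly downloaded per-date series into the stored one.

    Dates present in `fresh` overwrite the stored values (retroactive
    corrections); dates missing from `fresh` keep their stored values
    (for example, when the source only goes back a limited number of weeks).
    """
    merged = dict(stored)
    merged.update(fresh)
    # Return the dates in chronological order (ISO dates sort lexically).
    return dict(sorted(merged.items()))


# Hypothetical numbers, not real San Francisco data.
stored = {"2020-03-31": 7, "2020-04-01": 10, "2020-04-02": 12}
fresh = {"2020-04-01": 11, "2020-04-02": 15, "2020-04-03": 4}

merged = merge_series(stored, fresh)
```

In this sketch, `2020-03-31` survives even though the fresh download no longer includes it, and the two revised days pick up their new values.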

There’s already a scraper for San Francisco, but it scrapes the main dashboard in a way that’s more fragile and doesn’t reflect these retroactive updates.

Unlike Santa Clara County’s source, this one is quite easy to work with: it offers a CSV download and a REST API that returns the table as JSON. As a point of reference, this Bash script updates a table at Wikimedia Commons (used on this Wikipedia article) with the latest data using the API. I’m not suggesting relying on Wikimedia Commons or Wikipedia, but perhaps some of the jq invocations could be ported to JavaScript.
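As a rough idea of what consuming the JSON API could look like, here is a sketch in Python (again only for illustration). It rests on several assumptions: the endpoint follows the usual Socrata `/resource/<dataset id>.json` pattern with the dataset id from the source URL above, and the rows carry fields named `date`, `case_disposition` (with values `Confirmed` and `Death`) and `case_count`. Those field names and values should be checked against the live dataset before relying on them. The helper names `fetch_rows` and `summarize` and the sample rows are hypothetical.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

# Assumed Socrata endpoint; $limit raises the default page size of the API.
CASES_URL = "https://data.sfgov.org/resource/tvq9-ec9w.json?$limit=50000"


def fetch_rows(url=CASES_URL):
    """Download the table as a list of JSON objects (not called below)."""
    with urlopen(url) as response:
        return json.load(response)


def summarize(rows):
    """Turn per-date, per-category rows into cumulative cases and deaths."""
    daily = defaultdict(lambda: defaultdict(int))
    for row in rows:
        day = row["date"][:10]  # keep only YYYY-MM-DD from the timestamp
        daily[day][row["case_disposition"]] += int(row["case_count"])

    running = defaultdict(int)
    result = []
    for day in sorted(daily):
        for disposition, count in daily[day].items():
            running[disposition] += count
        result.append(
            {"date": day, "cases": running["Confirmed"], "deaths": running["Death"]}
        )
    return result


# Hypothetical sample rows shaped like the assumed API response.
sample_rows = [
    {"date": "2020-04-02T00:00:00.000", "case_disposition": "Confirmed", "case_count": "5"},
    {"date": "2020-04-01T00:00:00.000", "case_disposition": "Confirmed", "case_count": "3"},
    {"date": "2020-04-01T00:00:00.000", "case_disposition": "Confirmed", "case_count": "2"},
    {"date": "2020-04-02T00:00:00.000", "case_disposition": "Death", "case_count": "1"},
]

summary = summarize(sample_rows)
```

Because the whole series is rebuilt from the API response on every run, retroactive corrections are picked up automatically; calling `summarize(fetch_rows())` would do the same against the live endpoint, assuming the field names hold.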

lazd commented 4 years ago

Oh this is great. Do you think you can pick up this issue @1ec5?

jzohrab commented 4 years ago

Done in #1022 and #1044.