sfbrigade / data-covid19-sfbayarea

Manual and automated processes of sourcing data for the stop-covid19-sfbayarea project
MIT License
8 stars 10 forks

Figure out how to model vaccination data #175

Open Mr0grog opened 3 years ago

Mr0grog commented 3 years ago

Three Bay Area counties now have vaccination dashboards:

We need to figure out how to best represent this data in our scraper output. What’s common between these dashboards? What’s different? What’s most important?


Updated 2021-01-22: Added Marin County
Updated 2021-02-05: Added Alameda, San Mateo, Napa from @kengo-sony

Mr0grog commented 3 years ago

Marin is also now including vaccination data at the bottom of their main dashboard page: https://coronavirus.marinhhs.org/surveillance#vaccines

kengo-sony commented 3 years ago

Found three more counties that have vaccination dashboards

Mr0grog commented 3 years ago

Interesting discovery: the state is currently publishing some good data on their vaccinations page at: https://covid19.ca.gov/vaccines/. For each county, they’ve got:

Some counties are definitely not publishing these stats (and for some counties we haven’t identified a dashboard or dataset at all), and the state data also gives us a standard set of categories. Using this state data might be the place to start for now.

Downsides:

Since we need to build the timeseries ourselves and because the state data covers a broader set of counties than we do here, I went ahead and started the ball rolling at https://github.com/Mr0grog/ca-covid-vaccination-stats. If that works out well, we should see about integrating the code (or just the data) here.
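
As a rough illustration of what building the timeseries ourselves could look like, here is a minimal Python sketch. It assumes each scrape yields a per-county dictionary of figures for one day. The function names, the `doses_administered` field, and the numbers are all hypothetical; they are not taken from the state page or from the repo linked above.

```python
from collections import defaultdict


def append_snapshot(timeseries, snapshot_date, county_stats):
    """Add one day's scraped figures to a per-county time series.

    timeseries: dict mapping county name -> {ISO date -> stats dict}
    snapshot_date: ISO date string, e.g. "2021-02-05"
    county_stats: dict mapping county name -> stats dict for that day
    """
    for county, stats in county_stats.items():
        # Re-scraping the same day overwrites the earlier entry instead of
        # creating a duplicate row.
        timeseries[county][snapshot_date] = dict(stats)
    return timeseries


def as_rows(timeseries, county):
    """Return one county's series as a list of rows sorted by date."""
    # ISO date strings sort correctly as plain strings.
    return [
        {"date": day, **stats}
        for day, stats in sorted(timeseries[county].items())
    ]


# Made-up values, for illustration only.
series = defaultdict(dict)
append_snapshot(series, "2021-02-05", {"Marin": {"doses_administered": 120}})
append_snapshot(series, "2021-02-04", {"Marin": {"doses_administered": 100}})
append_snapshot(series, "2021-02-05", {"Marin": {"doses_administered": 125}})
rows = as_rows(series, "Marin")
```

Keying each county's entries by date keeps the series idempotent: running the scraper twice on the same day updates that day's entry rather than adding a second one.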

Lynguyen237 commented 3 years ago

@Mr0grog Rob, how did your attempt to build time series data from the state website go? I took a stab at documenting the metrics provided by each county here (not finished) and I think if the state already has the data and common metrics shared by all counties, that would be our best bet: https://docs.google.com/spreadsheets/d/1fhdF587nhBlychWvmMwPs2PwGy9ffqiH06ZZhEitVfo/edit#gid=262116661

Mr0grog commented 3 years ago

The state is now publishing timeseries data for all the same info I was scraping at: https://data.ca.gov/dataset/covid-19-vaccine-progress-dashboard-data

I think you should probably just use the new state dataset. From a quick glance over your spreadsheet, both the state data feeds and my scraper cover all the stats you’ve listed (except neighborhood), and they do so for all counties.