Open ckingbailey opened 4 years ago
I realized there's some extra complexity here because we need two data sources: one for the US and one for the world. So we'll need two scheduled functions, one to fetch each of those data sources.
We'll probably want two buckets as well, one for each data set, US and world. In that case we'll also need two transform functions.
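To make the fetch step concrete, here's a rough sketch of how both scheduled functions could share one fetch routine driven by a small config. The URLs, bucket names, and the `store` callback are all placeholders I made up, not anything we've set up yet.

```python
from urllib.request import urlopen

# Hypothetical source URLs and bucket names, for illustration only.
SOURCES = {
    "world": {"url": "https://example.com/world.csv", "bucket": "raw-data-world"},
    "us": {"url": "https://example.com/us.csv", "bucket": "raw-data-us"},
}


def http_get(url):
    """Download a URL and return the response body as bytes."""
    with urlopen(url) as response:
        return response.read()


def fetch_source(name, fetch=http_get, store=None):
    """Fetch one named source and hand the raw bytes to a storage callback.

    `store(bucket, key, body)` is a hypothetical hook standing in for
    whatever writes to the raw-data bucket.
    """
    source = SOURCES[name]
    body = fetch(source["url"])
    if store is not None:
        store(source["bucket"], f"{name}.csv", body)
    return body
```

Each scheduled function would then just call `fetch_source("us", ...)` or `fetch_source("world", ...)`. Passing `fetch` and `store` in as arguments keeps the routine testable without network access.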
Once the data is transformed into the shape we want, it can all go into one bucket: the processed-data bucket we've already created.
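As a sketch of what "the shape we want" could mean, here are two transform functions that map differently shaped CSVs onto one common record. The column names (`Date`/`Country`/`Confirmed` for world, `date`/`county`/`state`/`cases` for US) and the output fields are assumptions for illustration; the real sources may differ.

```python
import csv
import io
from typing import Dict, List


def transform_world(raw_csv: str) -> List[Dict[str, object]]:
    """Map world rows (assumed columns: Date, Country, Confirmed) to the common shape."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [
        {
            "date": row["Date"],
            "region": row["Country"],
            "scope": "world",
            "confirmed": int(row["Confirmed"]),
        }
        for row in reader
    ]


def transform_us(raw_csv: str) -> List[Dict[str, object]]:
    """Map US rows (assumed columns: date, county, state, cases) to the common shape."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [
        {
            "date": row["date"],
            "region": f'{row["county"]}, {row["state"]}',
            "scope": "us",
            "confirmed": int(row["cases"]),
        }
        for row in reader
    ]
```

Because both functions emit the same keys, whatever reads the processed-data bucket would not need to know which source a record came from.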
How should we fire the last function: fire it on a timer, or fire it on bucket PUT?
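For comparison, here's roughly what the bucket-PUT option might look like, assuming we're on AWS Lambda with S3 event notifications (that's my assumption; adjust if we end up elsewhere). The handler just pulls the bucket and key out of each record. The `handler` body is a placeholder, not real processing.

```python
from urllib.parse import unquote_plus


def objects_from_put_event(event):
    """Return (bucket, key) pairs from an S3-style event notification.

    Object keys arrive URL-encoded in the event payload, so decode them.
    """
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context):
    """Placeholder entry point: a real version would read and process each object."""
    for bucket, key in objects_from_put_event(event):
        print(f"would process s3://{bucket}/{key}")
```

Firing on PUT means the function runs only when new data actually lands, whereas a timer has to be scheduled late enough that the upstream step has finished.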
Here's another data source: https://coronadatascraper.com/timeseries.csv
What data sources do we want to use?
There are many out there. Two are
https://github.com/datasets/covid-19
and
https://github.com/CSSEGISandData/COVID-19 (from Johns Hopkins Uni)
Also, the NY Times maintains a GitHub repo of US-only data.
There's this from the Atlantic
This tweet has a few more
Which data sources we choose depends on what we're interested in.
I want a world total.
I want a US total.
I want some US county- or region-level data, such as Bay Area and New York. I may want other US regions later, such as less populous states that may soon see infection rates rising.
I want certain countries. I was interested in Italy. Now I'm more interested in Spain. I'd like to see South Korea, Japan, and maybe China. I may want to keep tabs on India and Mexico in the future.
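Here's a minimal sketch of how those wants could turn into a filter over the data. It assumes rows are dicts with `date`, `region`, and `confirmed` keys, which is a shape I'm making up for illustration. Country naming also differs between sources, so the set below would need adjusting to whichever source we pick.

```python
from collections import defaultdict

# Countries mentioned above; spellings are assumed and may not match a given source.
COUNTRIES_OF_INTEREST = {"Italy", "Spain", "South Korea", "Japan", "China"}


def summarize(rows, countries=COUNTRIES_OF_INTEREST):
    """Return (world total per date, per-country series for selected countries)."""
    world_total = defaultdict(int)
    by_country = defaultdict(dict)
    for row in rows:
        # Every row counts toward the world total for its date.
        world_total[row["date"]] += row["confirmed"]
        # Only keep a separate series for the countries we care about.
        if row["region"] in countries:
            by_country[row["region"]][row["date"]] = row["confirmed"]
    return dict(world_total), dict(by_country)
```

Adding India or Mexico later would just mean adding them to the set, and the same approach would work for a US total and selected counties or states.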