GoogleCloudPlatform / covid-19-open-data

Datasets of daily time-series data related to COVID-19 for over 20,000 distinct locations around the world.
Apache License 2.0

Calculate US State Aggregations of Static Covariates #498

Open gserapio opened 3 years ago

gserapio commented 3 years ago

This is a stretch of an enhancement request, but I thought I'd ask!

The United States is rich with COVID-19 data that is segmented by state, but many basic static covariates provided in the Demographics, Economy, and Health tables are unavailable at the US state level. Having these variables on hand could significantly enhance future research looking more closely at COVID-19 in the US. The following variables by US state (aggregation_level == 1) seem derivable from data already included in this repo or available from an external source:

Demographics Table

Economy Table

Health Table

One might also be able to further estimate a subset of the population and health variables for US counties, though that would be a greater undertaking.
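As a rough illustration of the "derivable from data already included in this repo" idea, here is a minimal sketch that sums additive count variables from county rows into state rows. It assumes location keys shaped like `US_CA_06001` for counties and `US_CA` for states, and column names such as `population` and `population_male`; the helper `aggregate_to_state` and all numbers are made up for the example. Only count-type variables can be summed this way; rates and averages would need a population-weighted approach instead.

```python
from collections import defaultdict


def aggregate_to_state(county_rows, columns):
    """Sum additive count columns from US county rows into per-state totals.

    Hypothetical helper: assumes each row is a dict with a "key" such as
    "US_CA_06001" (country_state_county). Rows with any other key shape
    (for example existing state rows like "US_CA") are skipped.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for row in county_rows:
        parts = row["key"].split("_")
        if len(parts) != 3 or parts[0] != "US":
            continue  # not a US county row
        state_key = "_".join(parts[:2])
        for col in columns:
            value = row.get(col)
            # Missing values are skipped, so the total may undercount.
            if value is not None:
                totals[state_key][col] += value
    return {key: dict(cols) for key, cols in totals.items()}


# Made-up numbers, for illustration only.
county_rows = [
    {"key": "US_CA_06001", "population": 100, "population_male": 49},
    {"key": "US_CA_06003", "population": 50, "population_male": None},
    {"key": "US_NY_36001", "population": 70, "population_male": 30},
    {"key": "US_CA", "population": 999},  # state row, ignored
]
state_totals = aggregate_to_state(county_rows, ["population", "population_male"])
print(state_totals)
```

Because counties with missing values are silently skipped, a real pipeline would probably want to track coverage per state and only publish a total when every county contributed.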

owahltinez commented 3 years ago

Thanks for the suggestion. This is certainly feasible, and we already try to source as many of these variables as we can for subnational locations, mainly using Wikidata.org as the source.

We can certainly add the globaldatalab.org data source for the state-level HDI variable. For the other variables, do you have a specific data source in mind?

gserapio commented 2 years ago

Great!

Demographics Table

I'm not sure how your team has defined population_rural vs population_urban across different countries, but I think state-level data should be available from the US Census. I haven't been able to find a 2019 table on their website, though; it can probably be extracted from raw American Community Survey (ACS) data.

Economy Table

I was able to find GDP at the state level on Wikidata.org, which cites this Wikipedia page, which in turn cites the Bureau of Economic Analysis.

Health Table

Adult cigarette use by US state: https://www.cdc.gov/statesystem/cigaretteuseadult.html

Countyhealthrankings.org from the University of Wisconsin seems to have most of the Health Table variables available by US state for 2019.