Open lossyrob opened 4 years ago
Already used in this project:
- US County dataset: https://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_050_00_20m.json
- Census data by county: https://www2.census.gov/programs-surveys/popest/datasets/2010-2018/counties/asrh/cc-est2018-alldata.csv
Data that contribute to generating these numbers (taken from the Flu Surge 2.0 Model) are high priority:
There are a number of datasets that could be helpful with this effort. To keep track of them effectively, we should establish a documentation method. This could be a Google Doc or Sheet, a Markdown document, a GitHub issue: whatever makes entries easy to add and gives a quick, high-level understanding of what each dataset is, what it could possibly be useful for, and whether it has already been reviewed or used.
Also, if another effort already serves this purpose, this issue could be satisfied by telling the project how best to use that source.
The goal of this issue is to establish a data documentation method and communicate it to the project so that we can stay organized around the slew of data out there that may or may not be critical to the analysis.
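To make the proposal concrete, here is a minimal sketch of what one catalog entry could capture, assuming we go the Markdown-document route. The `DatasetEntry` fields, the `Status` values, and the `to_markdown_table` helper are all hypothetical names for illustration, not an agreed format; the two entries are the datasets listed above as already used, with their summaries left as placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    """Hypothetical review states; adjust to whatever the project agrees on."""
    UNREVIEWED = "unreviewed"
    REVIEWED = "reviewed"
    USED = "used"


@dataclass
class DatasetEntry:
    """One row of the catalog (field names are assumptions, not a decided schema)."""
    name: str
    url: str
    summary: str = "TODO"                      # what the dataset is, at a high level
    possible_uses: List[str] = field(default_factory=list)
    status: Status = Status.UNREVIEWED


def to_markdown_table(entries: List[DatasetEntry]) -> str:
    """Render the entries as a Markdown table that can be pasted into a doc or issue."""
    lines = [
        "| Dataset | Summary | Possible uses | Status |",
        "| --- | --- | --- | --- |",
    ]
    for e in entries:
        uses = "; ".join(e.possible_uses) or "TODO"
        lines.append(f"| [{e.name}]({e.url}) | {e.summary} | {uses} | {e.status.value} |")
    return "\n".join(lines)


catalog = [
    DatasetEntry(
        name="US County dataset",
        url="https://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_050_00_20m.json",
        status=Status.USED,
    ),
    DatasetEntry(
        name="Census data by county",
        url="https://www2.census.gov/programs-surveys/popest/datasets/2010-2018/counties/asrh/cc-est2018-alldata.csv",
        status=Status.USED,
    ),
]

print(to_markdown_table(catalog))
```

The same columns would work equally well as headers in a Google Sheet; the point is only that every entry answers the same few questions.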
Current data links to be catalogued:
2.1: tracking cases/testing:
2.2: epi modeling:
Healthcare facilities, beds, care utilization, provider data from national, state, county data sources:
NY:
CA:
NJ:
MA:
From Gitter:
Other data cataloging effort: https://coronavirustechhandbook.com/data