Open krivard opened 3 years ago
Do we have a good sense of how many signals we'll want and what sorts of operations we'll need to generate them (e.g. summing two columns)? I know the exact signals are still uncertain; I'm mainly trying to work out how to design the pipeline if we end up with signals that require operations on varying columns.
Until we can hear from Roni, probably best to use an agile approach so we can get out the minimum (ground truth) ASAP. Then as we hear more about what is needed, we can build onto that and refactor if necessary.
We are expediting creation of a new COVIDcast source+signals based on the following dataset: https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-facility
The raw data is expected to be served from its own Epidata endpoint shortly.
Data details
This data encodes hospital admissions on a per-facility basis, making it straightforward to aggregate to all COVIDcast geo levels.
Data is aggregated on a weekly basis, Friday through Thursday. Note that this differs from the standard epiweek definition, which runs Sunday through Saturday.
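To make the week-boundary difference concrete, here is a minimal sketch that maps a date to the start of its Friday-Thursday week and, for comparison, to the start of its Sunday-Saturday epiweek. The function names `collection_week_start` and `epiweek_start` are hypothetical and not part of any existing pipeline.

```python
from datetime import date, timedelta


def collection_week_start(d: date) -> date:
    """Return the Friday that begins the Friday-Thursday week containing d."""
    # date.weekday() numbers days Monday=0 ... Friday=4 ... Sunday=6.
    days_since_friday = (d.weekday() - 4) % 7
    return d - timedelta(days=days_since_friday)


def epiweek_start(d: date) -> date:
    """Return the Sunday that begins the Sunday-Saturday epiweek containing d."""
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


if __name__ == "__main__":
    thursday = date(2020, 12, 10)
    # The same Thursday falls in weeks that start on different days.
    print(collection_week_start(thursday))
    print(epiweek_start(thursday))
```

Because the two week definitions start on different days, weekly values from this dataset cannot be relabeled as epiweeks without shifting or re-aggregating them.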
The signals we want to compute for COVIDcast are:
previous_day_admission_adult_covid_confirmed_7_day_sum + previous_day_admission_pediatric_covid_confirmed_7_day_sum
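As a rough illustration of the "summing two columns" operation, the sketch below adds the adult and pediatric columns for each facility row and rolls the result up to a geo level. The row layout (a `state` key and a `collection_week` key), the function name `confirmed_admissions_by_geo`, and the sample values are assumptions made for the example; they are not taken from the dataset schema or from any existing indicator code, and handling of missing or suppressed values is left out.

```python
from collections import defaultdict

# Column names as given in the issue.
ADULT = "previous_day_admission_adult_covid_confirmed_7_day_sum"
PEDIATRIC = "previous_day_admission_pediatric_covid_confirmed_7_day_sum"


def confirmed_admissions_by_geo(rows, geo_key="state"):
    """Sum adult + pediatric confirmed admissions per (geo, week).

    `rows` is an iterable of dicts, one per facility per week.
    `geo_key` and "collection_week" are assumed field names.
    """
    totals = defaultdict(int)
    for row in rows:
        key = (row[geo_key], row["collection_week"])
        totals[key] += row[ADULT] + row[PEDIATRIC]
    return dict(totals)


if __name__ == "__main__":
    # Made-up facility rows, for illustration only.
    sample = [
        {"state": "PA", "collection_week": "2020-12-04", ADULT: 10, PEDIATRIC: 1},
        {"state": "PA", "collection_week": "2020-12-04", ADULT: 5, PEDIATRIC: 0},
        {"state": "OH", "collection_week": "2020-12-04", ADULT: 7, PEDIATRIC: 2},
    ]
    print(confirmed_admissions_by_geo(sample))
```

Keeping the column names and the geo key as parameters or constants would make it easier to add signals that combine other columns later, which is the design concern raised in the comments.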