kaitejohnson opened 1 month ago
> Could then consider eventually excluding the lab-sites with very low correlations historically in the fit used to make the forecast, since this is likely going to degrade the signal rather than improve it.
I think this might be the easiest approach if the goal is optimizing forecast skill, which is implicitly the approach at the moment.
However, it's worth considering other options where we stay generative for the whole dataset, e.g. having a bias parameter, etc.
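For concreteness, a minimal sketch of what a bias-parameter formulation might look like, assuming a per-lab-site additive bias on the log-concentration scale (all symbols here are illustrative, not the model's actual parameterization):

$$\log c^{\text{obs}}_{i,t} = \log c^{\text{true}}_{t} + \beta_i + \varepsilon_{i,t}, \qquad \varepsilon_{i,t} \sim \mathcal{N}(0, \sigma^2_i)$$

where $\beta_i$ is a lab-site-level bias (which could get a hierarchical prior), so poorly correlated sites stay in the likelihood rather than being dropped outright.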
Thoughts @dylanhmorris ?
Btw, it might turn out that different site:lab combinations have different lag times. We should watch out for that.
Yeah, I would think it makes the most sense to apply the filter at the same lag for all sites, since the model can only learn one lag?
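To check whether the lags really do differ across site:lab combinations, something like the following could scan candidate lags and report the best one per series. This is a rough sketch, assuming daily growth rates as pandas Series indexed by date; `best_lag` and the 14-day window are placeholders, not anything in the repo:

```python
import pandas as pd

def best_lag(ww_growth: pd.Series, hosp_growth: pd.Series, max_lag: int = 14):
    """Scan lags 0..max_lag (wastewater leading admissions) and return the
    lag maximizing the Pearson correlation between the two growth-rate
    series, along with the full lag -> correlation mapping."""
    corr_by_lag = {}
    for lag in range(max_lag + 1):
        # Shift admissions back by `lag` days so wastewater at time t is
        # paired with admissions at time t + lag.
        pairs = pd.concat(
            [ww_growth, hosp_growth.shift(-lag)], axis=1, keys=["ww", "hosp"]
        ).dropna()
        if len(pairs) > 2:
            corr_by_lag[lag] = pairs["ww"].corr(pairs["hosp"])
    return max(corr_by_lag, key=corr_by_lag.get), corr_by_lag
```

If the best lag varies a lot across lab-sites, that would support the concern above; if it is roughly constant, using a single shared lag as suggested seems safe.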
Hey @kaitejohnson !
Is this analysis blocked by anything (other than lots of other jobs)? I might be able to help.
It isn't blocked, no, I just haven't had a chance to do it! All I've done so far is compute them all and make a bunch of figures, which I'm hoping we can discuss at today's meeting!
Goal
We currently have plots of the daily or weekly growth rates in the admissions and wastewater data, with the goal of identifying wastewater sites where there is a very poor mapping/correlation between trends in wastewater concentration and trends in hospital admissions. This is currently applied to the raw data with no lags, and for now we are just visualizing them. We could additionally:
1. Compute an R-squared for the correlation between the growth rates for each lab-site and the hospital admissions.
2. Compute an R-squared for the correlation with various lag shifts in the data.
We could then consider eventually excluding the lab-sites with very low correlations historically from the fit used to make the forecast, since including them is likely to degrade the signal rather than improve it (a rough sketch of such a filter is below).
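One way that filter could look, assuming a DataFrame of per-lab-site daily growth rates and a single shared lag (per the discussion above); `flag_low_r2_sites`, the 7-day lag, and the 0.1 threshold are all hypothetical placeholders:

```python
import pandas as pd

def flag_low_r2_sites(ww_growth: pd.DataFrame, hosp_growth: pd.Series,
                      lag: int = 7, threshold: float = 0.1):
    """Compute R^2 between each lab-site growth-rate column and
    hospital-admissions growth at one shared lag, and list the sites
    whose R^2 falls below `threshold` (candidates for exclusion)."""
    hosp_shifted = hosp_growth.shift(-lag)  # wastewater leads admissions
    r2 = ww_growth.apply(lambda s: s.corr(hosp_shifted) ** 2)
    return r2, r2[r2 < threshold].index.tolist()
```

Any threshold would need to be sanity-checked against the figures before actually dropping sites from the fit.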
See https://github.com/CDCgov/wastewater-informed-covid-forecasting/pull/167#issuecomment-2368741471 for context