CDCgov / wastewater-informed-covid-forecasting

Wastewater-informed COVID-19 forecasting models submitted to the COVID-19 Forecast Hub
https://cdcgov.github.io/wastewater-informed-covid-forecasting/
Apache License 2.0

Tweak handling of shedding distribution in code #69

Open kaitejohnson opened 3 months ago

kaitejohnson commented 3 months ago

Problem

We treat each day's fecal shedding as a single point contribution rather than integrating the smoothed, continuous shedding function over the day. This is probably an OK approximation, but it could be handled differently.

Context:

from @dylanhmorris: We'll want to make at least a verbal argument about accumulation over time since deposition. Right now we're discretizing age of infection and thus shedding on day $\tau$. That almost certainly works just fine given the coarseness of our analysis, but we'll want to explain why:

  1. We don't think we need to model RNA decay between deposition and sampling explicitly
  2. We assume there's no accumulation of RNA in the sewers between sampling times. That is, there's minimal chance that a genome is shed at t=2, is not sampled at t=2, but then is sampled at t=3. Instead, approximately everything not sampled on the day it is shed washes through or decays.
  3. We don't compute the 24 hour cumulative shedding via an integral over the kinetics function (i.e. instead of $g_k(t-\tau)f(\tau)$, we'd compute $g_k(t-\tau) \int_{\tau - 1}^{\tau} f(x) \mathrm{d}x$, assuming $\tau$ is measured in days).

Related: we should be very clear about how we index time of sample versus time of deposition. I've been a bit implicit above for ease of model understanding, but this would be a bad place to make an off-by-one error.
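To make the indexing concrete, here is a minimal sketch of the discretized convolution with the day-of-sample index `t` and the age-of-infection index `tau` written out explicitly. It is not the package's code. It assumes, for illustration only, that $g_k(t-\tau)$ is the number of new infections on day $t-\tau$ and $f(\tau)$ is the shedding weight at infection age $\tau$, and that (per point 2) a sample on day `t` only sees what was deposited on day `t`. The function name `expected_shedding` and the 0-indexing convention are hypothetical.

```python
def expected_shedding(infections, weights):
    """Expected shedding deposited on each calendar day t.

    infections[s] : new infections on calendar day s (0-indexed), assumed
                    to play the role of g_k in the issue text
    weights[a]    : shedding weight at infection age a days, assumed to
                    play the role of f; weights[0] is the day of infection
    """
    n_days = len(infections)
    out = [0.0] * n_days
    for t in range(n_days):
        # tau runs over infection ages that are both within the observed
        # history (tau <= t) and within the kinetics support.
        max_age = min(t, len(weights) - 1)
        for tau in range(max_age + 1):
            out[t] += infections[t - tau] * weights[tau]
    return out


# One infection on day 0; shedding starts one day after infection.
infections = [1, 0, 0, 0, 0]
weights = [0.0, 0.5, 0.3, 0.2]
shed = expected_shedding(infections, weights)
```

Whether `weights[0]` refers to age 0 or age 1 is exactly the kind of convention that needs to be stated once and used everywhere; shifting it by one day shifts the whole predicted wastewater signal by one day.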

(3) we actually could do and I'm now debating with myself whether I think it'll make any difference in practice.
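A quick way to get a feel for whether (3) matters is to compare the two sets of daily weights directly. The sketch below does that for a made-up, gamma-shaped shedding curve; the curve, its `shape` and `scale` values, and the helper names are all assumptions for illustration and are not the kinetics used in the model.

```python
import math


def shedding_kinetics(x, shape=3.0, scale=2.0):
    """Hypothetical gamma-shaped shedding curve f(x); x is days since infection."""
    if x <= 0:
        return 0.0
    norm = math.gamma(shape) * scale ** shape
    return x ** (shape - 1) * math.exp(-x / scale) / norm


def daily_integral(f, tau, n_steps=200):
    """Trapezoid-rule approximation of the integral of f over [tau - 1, tau]."""
    a = tau - 1.0
    h = 1.0 / n_steps
    total = 0.5 * (f(a) + f(tau))
    for i in range(1, n_steps):
        total += f(a + i * h)
    return total * h


def compare_weights(f, max_age=30):
    """Return (point weights f(tau), integrated weights) for tau = 1..max_age."""
    ages = range(1, max_age + 1)
    point = [f(tau) for tau in ages]
    integrated = [daily_integral(f, tau) for tau in ages]
    return point, integrated


point, integrated = compare_weights(shedding_kinetics)
for tau in (1, 2, 5, 10):
    print(tau, round(point[tau - 1], 4), round(integrated[tau - 1], 4))
```

For a smooth curve like this one, the two weight vectors have nearly the same total, so the main difference is a sub-day shift in timing: while the curve is rising, the integral over $[\tau - 1, \tau]$ is smaller than the point value at $\tau$. Whether that shift is distinguishable given the sampling frequency and noise in the data is the open question.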