An in-development R package and a Bayesian hierarchical model jointly fitting multiple "local" wastewater data streams and "global" case count data to produce nowcasts and forecasts of both observations
We want to update the generate_simulated_data.R function in the R folder to implement a spatially correlated unadjusted reproduction number into it. We will also add necessary functions to allow different correlation structures to be used.
Context
The current model used in the generate_simulated_data.R function is listed in the model definitions in the repo. The change we are making will be at the unadjusted reproduction level. We will add the spatial correlations here with the following model :
Let $k$ represent a site, from sites $1$ to $n_k$. Let $R_k^u(t)$ be the unadjusted reproduction number at time $t$ at site $k$. So lets look at the combined "state" $R^u(t)$ that contains an AR(1) correlation structure on the first differences that is currently being used. So
Where $\vec{\delta}(t) = \left(\delta1(t),...,\delta{nk}(t)\right)^{T}$. We also have that $\vec{\delta}(t) = \varphi{R(t)}\vec{\delta}(t - 1) + \vec{\epsilon}$, where $\vec{\epsilon}$ $\sim$ $\text{Normal}_{nk}(0,\mathbf{\Sigma\{\epsilon}}).$
This adds a spatial correlation and also the current AR(1) temporal correlation to this $\vec{\delta}(\cdot)$ random variable. Like before, we have that $0 < \varphi_{R(t)} < 1$.
For $\mathbf{\Sigma{\epsilon}}$ $=$ $\sigma{\epsilon}^2\mathbf{\Omega}$, where $\Omega_{ij} \in \mathbf{\Omega}$ are build from the correlation functions. These include an exponential decay, Matern, spherical, rational quadratic, or, to return to the current model framework, independence correlation functions.
For simplicity, we will first assume an adjacency matrix is provided with the data or locations of sites where we can attain an adjacency matrix.
We need to make sure to take into account is the individuals that are not included in a site. So, currently, we will provide them with their own process. This separate process is as follows :
Lets denote the unadjusted reproduction number at time $t$ for individuals who are not represented by the actual sites as $R_{\text{aux}}^u(t)$.
This process is very similar with the previous way site level Rt was calculated, but here we include the ScalingFactor to scale the variation of the spatial process. Also, if we have an independent structure, we will not have this ScalingFactor and will instead use the $n_k + 1$ approach currently being implemented.
Requirements
[x] Add function, named spatial_rt_process.R, to deal with the new spatial unadjusted reproduction number for a general correlation function.
[x] Change generate_simulated_data.R for implementation of spatial unadjusted reproduction number, but will default to the independent unadjusted reproduction number.
[x] Add R functions that create correlation matrices for the spatial_rt_process.R function.
Goal
We want to update the
generate_simulated_data.R
function in the R folder to implement a spatially correlated unadjusted reproduction number into it. We will also add necessary functions to allow different correlation structures to be used.Context
The current model used in the
generate_simulated_data.R
function is listed in the model definitions in the repo. The change we are making will be at the unadjusted reproduction level. We will add the spatial correlations here with the following model :Let $k$ represent a site, from sites $1$ to $n_k$. Let $R_k^u(t)$ be the unadjusted reproduction number at time $t$ at site $k$. So lets look at the combined "state" $R^u(t)$ that contains an AR(1) correlation structure on the first differences that is currently being used. So
$$ \log R^u(t) \sim \text{Normal}(\log R^u(t - 1) + \beta(\log R^u(t - 1) - \log R^u(t - 2)), \sigma_r), $$
or
$$ \log R^u(t) = \log R^u(t - 1) + \beta(\log R^u(t - 1) - \log R^u(t - 2)) + \sigma_r\text{Normal}(0,1). $$
Now let $\delta_k(t)$ be the time-varying subpopulation effect on $\log R^u(t)$. So we can look at the vector of all $\log R_k^u(t)$'s, so
$$ \begin{pmatrix} \log R1^u(t)\ \vdots\ \log R{n_{k}}^u(t)\ \end{pmatrix} = \log R^u(t) \begin{pmatrix} 1\ \vdots\ 1\ \end{pmatrix} + \vec{\delta}(t). $$
Where $\vec{\delta}(t) = \left(\delta1(t),...,\delta{nk}(t)\right)^{T}$. We also have that $\vec{\delta}(t) = \varphi{R(t)}\vec{\delta}(t - 1) + \vec{\epsilon}$, where $\vec{\epsilon}$ $\sim$ $\text{Normal}_{nk}(0,\mathbf{\Sigma\{\epsilon}}).$
This adds a spatial correlation and also the current AR(1) temporal correlation to this $\vec{\delta}(\cdot)$ random variable. Like before, we have that $0 < \varphi_{R(t)} < 1$.
For $\mathbf{\Sigma{\epsilon}}$ $=$ $\sigma{\epsilon}^2\mathbf{\Omega}$, where $\Omega_{ij} \in \mathbf{\Omega}$ are build from the correlation functions. These include an exponential decay, Matern, spherical, rational quadratic, or, to return to the current model framework, independence correlation functions.
For simplicity, we will first assume an adjacency matrix is provided with the data or locations of sites where we can attain an adjacency matrix.
We need to make sure to take into account is the individuals that are not included in a site. So, currently, we will provide them with their own process. This separate process is as follows :
Lets denote the unadjusted reproduction number at time $t$ for individuals who are not represented by the actual sites as $R_{\text{aux}}^u(t)$.
$\log R_{\text{aux}}^u(t) = \log R^u(t) + \delta_{\text{aux}}(t),$
$\delta_{\text{aux}}(t) = \phi_{R(t)}\delta_{\text{aux}}(t - 1) + \epsilon_{\text{aux}},$
$\epsilon_{\text{aux}} \sim \text{Normal}(0,\text{ScalingFactor}\times\sigma_{\epsilon}),$
$\text{ScalingFactor} \sim \text{Log-Normal}(\mu_{\text{ScalingFactor}},\sigma_{\text{ScalingFactor}}^2).$
This process is very similar with the previous way site level Rt was calculated, but here we include the ScalingFactor to scale the variation of the spatial process. Also, if we have an independent structure, we will not have this ScalingFactor and will instead use the $n_k + 1$ approach currently being implemented.
Requirements
spatial_rt_process.R
, to deal with the new spatial unadjusted reproduction number for a general correlation function.generate_simulated_data.R
for implementation of spatial unadjusted reproduction number, but will default to the independent unadjusted reproduction number.spatial_rt_process.R
function.