CDCgov / ww-inference-model

An in-development R package and a Bayesian hierarchical model jointly fitting multiple "local" wastewater data streams and "global" case count data to produce nowcasts and forecasts of both observations
https://cdcgov.github.io/ww-inference-model/
Apache License 2.0

Making a simulation study script for spatial ww model #157

Open cbernalz opened 2 months ago

cbernalz commented 2 months ago

Goal

We want to make a simulation study script that can be easily parallelized.

Context

For this simulation study we want to vary the number of central Rt curves to cover different situations. For each central Rt curve, we simulate data under each of the 3 correlation structure types: iid, exponential decay, and random correlation matrix. We then fit each simulated time series with each of the 3 inference models we have, which use the following correlation structures: iid, exponential decay based on distances, and LKJ.

Here is the pseudocode for this:

# Prepare hard coded simulation presets, distance matrix, correlation function parameters, etc.

# Create list of simulation model x inference model x Rt curve combinations:
# 3 simulation structures * 3 inference models = 9 * (number of Rt curves) items in this list
### Each list item has the following form:
### corr_function | corr_func_params | corr_structure_switch | central Rt curve
sim_inf_rt_comb_list <- list(...)

# Make empty list for results :
results <- list()

for (item in sim_inf_rt_comb_list) { # we can parallelize here
    # Simulate data

    # Fit data

    # Evaluation metrics

    # Save results in list form:
    ### Name list item based on corr_function, corr_structure_switch, central Rt curve
    ### List item: simulation data | fit results | evaluation metrics
}

# Save results as .rda object
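
The pseudocode above could be sketched in base R roughly as follows. This is only a minimal sketch: the correlation structure labels, the Rt curve names, and the `run_one()` helper are hypothetical placeholders, and the simulate / fit / evaluate steps are left as `NULL` stubs because they depend on the package's actual functions.

```r
# Hypothetical labels; the real script may name these differently.
sim_corr_structures <- c("iid", "exp_decay", "random_corr")
inf_corr_structures <- c("iid", "exp_decay", "lkj")
rt_curve_ids <- c("rt_curve_1", "rt_curve_2") # placeholder Rt curve names

# One row per simulation model x inference model x Rt curve combination
comb_grid <- expand.grid(
  sim_corr = sim_corr_structures,
  inf_corr = inf_corr_structures,
  rt_curve = rt_curve_ids,
  stringsAsFactors = FALSE
)
sim_inf_rt_comb_list <- split(comb_grid, seq_len(nrow(comb_grid)))

# Hypothetical worker: handles a single combination independently of the
# others, which is what makes the loop easy to parallelize.
run_one <- function(item) {
  list(
    name = paste(item$sim_corr, item$inf_corr, item$rt_curve, sep = "__"),
    sim_data = NULL,     # stub: simulate data here
    fit = NULL,          # stub: fit the inference model here
    eval_metrics = NULL  # stub: compute evaluation metrics here
  )
}

# Sequential version; lapply() can be swapped for a parallel apply
results <- lapply(sim_inf_rt_comb_list, run_one)
names(results) <- vapply(results, function(r) r$name, character(1))

# Save results as an .rda object (written to a temp dir in this sketch)
save(results, file = file.path(tempdir(), "sim_study_results.rda"))
```

Because each call to `run_one()` depends only on its own item, replacing `lapply()` with something like `parallel::mclapply()` should be enough to parallelize over combinations on platforms that support forking.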

Requirements

kaitejohnson commented 2 months ago

Would just suggest also saving some other metadata in your results list, e.g., number of sites, population coverage, generalized variance.
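
One way such metadata might be attached to each results item is sketched below. The distances, the length scale, and the field names are made up for illustration, and "generalized variance" is taken here to mean the determinant of the site-level correlation matrix, which may not be exactly the quantity intended.

```r
# Assumed example: 4 sites with an exponential-decay correlation matrix
dist_matrix <- as.matrix(dist(c(0, 1, 2, 4))) # made-up site distances
corr_matrix <- exp(-dist_matrix / 2)          # hypothetical length scale of 2

metadata <- list(
  n_sites = nrow(corr_matrix),
  pop_coverage = NA_real_, # placeholder: fill in from the simulation presets
  generalized_variance = det(corr_matrix) # determinant of the correlation matrix
)

# Results item with the metadata stored alongside the other outputs
result_item <- list(
  sim_data = NULL,     # stub
  fit = NULL,          # stub
  eval_metrics = NULL, # stub
  metadata = metadata
)
```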