Scenario 3: Causal Analysis with Interventions
Estimated % of time: Baseline 30%; Workbench 20%
In this scenario, we are interested in determining the effects of masking and social distancing on Covid-19 infections using simulated data. The simulations use contact matrices and populations subdivided into three age groups. The data are generated from an SEIR model.
In these questions, we provide the contact matrices and population data as well as the outputs of the simulated SEIR model. We ask you to calibrate a model, compute $\beta$ over different time intervals, and estimate the causal effects of interventions.
The model can be described with the diagram shown in Figure 2 and the following set of ordinary differential equations:
Figure 2. Model structure for Scenario 3: Causal Analysis with Interventions
$\frac{dS_i}{dt} = -\beta \frac{S_i}{N}\,(1 - m_{ew} m_{cw}) \sum_{j=1}^{3} M_{ijw} I_j$
$\frac{dE_i}{dt} = \beta \frac{S_i}{N}\,(1 - m_{ew} m_{cw}) \sum_{j=1}^{3} M_{ijw} I_j - r_{E\rightarrow I} E_i$
$\frac{dI_i}{dt} = r_{E\rightarrow I} E_i - r_{I\rightarrow R} I_i$
$\frac{dR_i}{dt} = r_{I\rightarrow R} I_i$
The above equations include the following constant parameters:
- $r_{E\rightarrow I}$, the rate of transition from compartment E to I = 0.08/day
- $r_{I\rightarrow R}$, the rate of transition from compartment I to R = 0.06/day
- $\beta$, which we ask you to estimate
And three parameters which change over time:
- $m_{cw}$ is mask compliance over interval w
- $m_{ew}$ is mask efficacy over interval w
- $M_{ijw}$ is the value of the contact matrix for row i and column j (from age group i to age group j) during time interval w
Use the following initial conditions (all units are number of people):
| S1 | S2 | S3 | E1, E2, and E3 | I1, I2, and I3 | R1, R2, and R3 |
| --- | --- | --- | --- | --- | --- |
| 10305660 | 15281905 | 12154442 | 50 | 50 | 0 |
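For baseline modelers, a minimal sketch of what an executable version of these equations and initial conditions might look like is given below (Python/SciPy). The helper names (`seir_rhs`, `simulate`) and the dummy $\beta$ are our own, $N$ is taken here as the total population, and the masking and contact-scaling arguments default to "no intervention."

```python
# A minimal, illustrative implementation of the stratified SEIR model above.
# All function and variable names are our own; beta = 0.12 is a dummy value
# for the Q1 test simulation, and N is taken as the total population.
import numpy as np
from scipy.integrate import solve_ivp

M = np.array([[38.62, 20.56,  6.12],
              [20.56, 28.22, 11.60],
              [ 6.12, 11.60, 20.01]])            # contacts/day (Table 1)
r_EI, r_IR = 0.08, 0.06                          # transition rates, 1/day
S0 = np.array([10305660.0, 15281905.0, 12154442.0])
E0 = np.full(3, 50.0)                            # 50 exposed per age group
I0 = np.full(3, 50.0)                            # 50 infectious per age group
R0 = np.zeros(3)
N = (S0 + E0 + I0 + R0).sum()

def seir_rhs(t, y, beta, m_c, m_e, contact_scale):
    S, E, I, R = y[:3], y[3:6], y[6:9], y[9:]
    # force of infection per age group: beta*(1 - m_e*m_c) * sum_j M_ij I_j / N
    lam = beta * (1.0 - m_e * m_c) * (contact_scale * M) @ I / N
    dS = -S * lam
    dE = S * lam - r_EI * E
    dI = r_EI * E - r_IR * I
    dR = r_IR * I
    return np.concatenate([dS, dE, dI, dR])

def simulate(beta, t_span=(0.0, 150.0), m_c=0.0, m_e=0.0, contact_scale=1.0):
    y0 = np.concatenate([S0, E0, I0, R0])
    t_eval = np.arange(t_span[0], t_span[1] + 1.0)
    sol = solve_ivp(seir_rhs, t_span, y0, t_eval=t_eval,
                    args=(beta, m_c, m_e, contact_scale))
    return sol.t, sol.y

t, y = simulate(beta=0.12)          # dummy beta; no interventions
print("I1, I2, I3 at t = 150:", y[6:9, -1])
```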
Supplementary files population.csv and ContactMatrix.csv contain data on the population counts and the contact matrix for each of the three age groups. The output data provided has counts for S, E, I, and R for each of the three age strata. The output files are called S3SimulationRuns.csv and S3SimulationRuns.RDS. These have the same information but in a slightly different format.
The contact matrix, M, is provided in the supplementary file, but is also written below in Table 1.
Table 1. Contact matrix for Scenario 3: Causal Analysis with Interventions. Units are average number of contacts per day.
|  | Age Group 1 | Age Group 2 | Age Group 3 |
| --- | --- | --- | --- |
| Age Group 1 | 38.62 | 20.56 | 6.12 |
| Age Group 2 | 20.56 | 28.22 | 11.60 |
| Age Group 3 | 6.12 | 11.60 | 20.01 |
In the simulation, two interventions happen simultaneously:
Masking
· From t = 0 to t = 50 days, no masking occurs.
· From t = 50 to t = 100 days, some masking happens ($m_{cw}=0.5, m_{ew} = 0.6$) and spread of Covid decreases.
· From t = 100 to t = 150 days, masking still happens ($m_{cw}=0.4, m_{ew}=0.2$) but with less intensity.
Social distancing
· From t = 0 to t = 20 days, no social distancing occurs.
· From t = 20 to t = 80 days, social distancing happens, reducing contact rates to 30% of their original values across the board.
· From t = 80 to t = 150 days, social distancing happens, reducing contact rates to 80% of their original values across the board.
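One way to encode this schedule is as step functions of time that supply the masking and contact-scaling arguments of a simulator like the sketch above; the helper names below are our own, and the functions simply restate the schedule.

```python
# Illustrative step-function encoding of the intervention schedule above.
def masking(t):
    """Return (mask compliance m_c, mask efficacy m_e) at day t."""
    if t < 50:
        return 0.0, 0.0        # no masking
    elif t < 100:
        return 0.5, 0.6        # m_cw = 0.5, m_ew = 0.6
    else:
        return 0.4, 0.2        # less intense masking

def contact_scale(t):
    """Return the multiplier applied to the contact matrix M at day t."""
    if t < 20:
        return 1.0             # no social distancing
    elif t < 80:
        return 0.3             # contacts reduced to 30% of original
    else:
        return 0.8             # contacts reduced to 80% of original
```

Inside the ODE right-hand side, these would be evaluated at the current time t rather than passed as constant arguments.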
This simulation is deterministic, but we draw $\beta$ randomly from a distribution and run the simulation 25 times with slightly different values of $\beta$. This is intended to allow us to ask questions about uncertainty.
Figure 3. All twenty-five runs of the simulation stratified by age group
1. Model Extraction (see S1Q1 for definition of model extraction): Extract the model and set default parameters and initial conditions. For now, use a dummy value for $\beta$. Note the time to extract the model and get it into an executable state that can run a simple test simulation and get sensible results. For workbench modelers, model extraction time may include human-in-the-loop curation, and for baseline modelers, this time may include debugging code. Provide simulation results from your test simulation.
2. Model Calibration:
· Calibrate the model to estimate $\beta$ in all 25 runs of data provided. Since each run was generated using a different value of $\beta$, each estimated value of $\beta$ should be a little different. Save the values of $\beta$ for use in Q5 and Q6.
· Average all 25 runs together and calibrate a model to estimate $\beta$ using the averaged data. Use this calibrated model and averaged data for Q3 and Q4.
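As a concrete illustration of Q2, per-run $\beta$ values could be estimated by least squares against each run's infection counts. The sketch below assumes a `simulate(beta)` helper like the one above (with the intervention schedule included) and long-format column names (`run`, `t`, `I1`-`I3`) in S3SimulationRuns.csv; the actual file layout may differ.

```python
# Sketch of Q2: least-squares calibration of beta per run, then on the average.
# Column names and the simulate() helper are assumptions, not the actual schema.
import numpy as np
import pandas as pd
from scipy.optimize import minimize_scalar

runs = pd.read_csv("S3SimulationRuns.csv")     # assumed columns: run, t, I1, I2, I3

def loss(beta, observed_I):
    _, y = simulate(beta)                      # must apply the intervention schedule
    model_I = y[6:9].sum(axis=0)               # total infections on the same grid
    return np.mean((model_I - observed_I) ** 2)

betas = []
for run_id, df in runs.groupby("run"):
    observed_I = (df["I1"] + df["I2"] + df["I3"]).to_numpy()
    fit = minimize_scalar(loss, bounds=(0.01, 0.5), args=(observed_I,),
                          method="bounded")
    betas.append(fit.x)                        # saved for Q5 and Q6

# Q2b: calibrate once more against the average of all 25 runs (used in Q3/Q4).
avg_I = runs.groupby("t")[["I1", "I2", "I3"]].mean().sum(axis=1).to_numpy()
beta_avg = minimize_scalar(loss, bounds=(0.01, 0.5), args=(avg_I,),
                           method="bounded").x
```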
3. Causal Effects: Estimate the average treatment effects on infections, for each of the last four unique intervals, using the following approach:
· To estimate the average treatment effect (ATE) for the nth time interval, use your calibrated model from Q2b, parameterized only for the (n-1)th time interval, and generate a forecast of the model over the nth interval, where no change in interventions takes place (you assume the interventions in place in the (n-1)th interval continue uninterrupted). Compute the root mean squared error (RMSE) between the model forecast of infections in the nth interval and the average of the supplementary data for the nth interval. This is the ATE for the nth interval.
For example, calibrate a model using data from time interval (0, 20), and simulate the model over the interval (20, 50). Compare the simulated output over the interval (20, 50) to the average of the provided data for the interval (20, 50) to estimate the ATE of the set of interventions in the interval (20, 50) (as defined originally in the scenario background), on infections.
· For each interval you calculate ATE for, generate plots comparing the actual data (all compartments) to the forecasted output had there been no change in interventions.
· Include uncertainty in the estimated effects.
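Numerically, the ATE defined above is just an RMSE computed on a common daily grid; a minimal sketch (placeholder arrays, our own function name):

```python
# ATE for the nth interval = RMSE between the frozen-intervention forecast and
# the run-averaged data over that interval (arrays here are placeholders).
import numpy as np

def ate_rmse(forecast_I, observed_I_avg):
    forecast_I = np.asarray(forecast_I)
    observed_I_avg = np.asarray(observed_I_avg)
    return float(np.sqrt(np.mean((forecast_I - observed_I_avg) ** 2)))

# Example, interval (20, 50): forecast made with the (0, 20) settings carried
# forward (no social distancing), compared with run-averaged infections.
# ate_20_50 = ate_rmse(forecast_I_days_20_to_50, averaged_I_days_20_to_50)
```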
4. Interventions: Use your fitted model from Q2b to conduct an approximation of a sensitivity analysis.
· Change the original reduction in the contact matrix at t = 20 days so that contact rates are reduced to 40% of their original values (rather than 30%). How does that affect infections at t = 50 days? Calculate the ATE for this change in reduction.
· Repeat Q4a, but change the reduction in the contact matrix to 20% of the original values. Show the change using plots and changes in the calculated ATE.
· (Optional) Change the reduction in contact matrix in other ways (e.g., instead of changing from a 30% decrease to 40% decrease, change the 30% decrease to 50% decrease), or by changing which age groups have a reduction in contact rate, to demonstrate how various types and levels of contact reduction can affect outcomes.
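Because the social-distancing intervention enters the model only as a multiplier on the contact matrix, Q4 reduces to re-running the Q2b model with a different multiplier from day 20 onward and recomputing the ATE; a sketch, reusing the step-function idea above (helper name is our own):

```python
# Sketch of Q4a/Q4b: change the day-20 contact reduction to 40% (or 20%) of the
# original values and re-run the counterfactual comparison.
def contact_scale_modified(t, level=0.4):      # 0.4 for Q4a, 0.2 for Q4b
    if t < 20:
        return 1.0
    elif t < 80:
        return level                           # modified reduction
    else:
        return 0.8                             # unchanged after day 80
# Feed this schedule into the same forecasting and ATE code used for Q3.
```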
5. Intervention Optimization: In this question, we will ask you to find the minimum level of mask efficacy needed to ensure that the maximum number of infections in the most populous age group (I2) is below 5,000,000 people, with 90% confidence.
Without knowledge of the exact distribution from which $\beta$ is drawn, but with the simulated data provided, one way to approach this is with the following steps:
a. Examine the values of $\beta$ from Q2 and fit a distribution to these values. Use this approximate distribution to calculate a $\beta$ you can use to represent the appropriate quantile for the confidence level.
b. Using the value of $\beta$ you calculated in Q5a, determine the minimum level of mask efficacy needed to ensure that the maximum number of infections in I2 remains below 5,000,000 people. Demonstrate this with a plot of simulation outcomes.
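A sketch of this two-step recipe is below. It assumes a normal fit to the Q2 $\beta$ values, treats the 0.90 quantile as the pessimistic (high-$\beta$) case, and relies on a hypothetical `peak_I2(beta, mask_efficacy)` helper that runs the full simulation and returns the maximum of I2; all of these are modeling choices, not given in the scenario.

```python
# Sketch of Q5: pick a pessimistic quantile of beta, then search for the
# smallest mask efficacy keeping peak I2 below 5,000,000.
# peak_I2() is a hypothetical wrapper around the simulator; the normal fit and
# the 0.90 quantile are assumptions.
import numpy as np
from scipy import stats

beta_samples = np.asarray(betas)               # per-run betas from Q2
mu, sigma = stats.norm.fit(beta_samples)
beta_90 = stats.norm.ppf(0.90, loc=mu, scale=sigma)

def min_mask_efficacy(threshold=5_000_000):
    for eff in np.linspace(0.0, 1.0, 101):     # coarse grid search
        if peak_I2(beta_90, mask_efficacy=eff) < threshold:
            return eff
    return None                                # threshold not attainable
```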
6. Intervention Optimization: What is the latest time the first masking intervention (currently at t = 50 days) can start to keep total infections below 11,000,000 people at any point in time in the simulation, with 95% confidence? Assume nothing else in the original simulation specification changes. You can apply a similar procedure to the one in Q5 (find the right $\beta$ from a fitted distribution, and then optimize over the parameter of interest) to solve this question. Demonstrate your answer with a plot of simulation outcomes.
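The same pattern carries over to Q6: fit the $\beta$ distribution, take the 0.95 quantile, and scan candidate start times for the first masking intervention. `max_total_infections(beta, mask_start)` below is again a hypothetical wrapper around the simulator.

```python
# Sketch of Q6: latest masking start time keeping total infections below 11M,
# evaluated at a 0.95-quantile beta (max_total_infections() is hypothetical).
import numpy as np

def latest_mask_start(beta_95, threshold=11_000_000):
    latest = None
    for start in np.arange(0, 151):            # candidate start days
        if max_total_infections(beta_95, mask_start=start) < threshold:
            latest = start                     # latest acceptable start so far
    return latest
```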
7. (Optional) In this question, we use the original SEIR model defined in the scenario introduction, but with no masking or social distancing interventions. Instead, we provide data generated from an SEIR model where $\beta$ varies at every time step over the course of the simulation. The data are in the file [`ChangingBeta.csv`](https://raw.githubusercontent.com/DARPA-ASKEM/program-milestones/main/18-month-milestone/evaluation/Epi%20Use%20Case/Scenario%203%20Supplementary/ChangingBeta.csv).
a. Using the original social distancing matrix, configure the SEIR model in 3 different ways using the following values of $\beta$: 0.10, 0.13, and 0.16. Keep all other parameters the same (aside from intervention parameters, which are set to 0). Calibrate an ensemble model using the 3 model configurations and the provided data. Compute RMSE between your calibrated ensemble model (infections variable output) and the true infections output in the data provided.
b. Similarly, calibrate a single SEIR model to the simulated data and compute RMSE. This model should have one constant value of $\beta$. Compare the calibrated ensemble output from Q7a to the single model calibrated output. Plot both against the true data to demonstrate goodness-of-fit of the calibration.
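For Q7a, one simple ensemble is a convex combination of the three fixed-$\beta$ trajectories with weights fit to the provided data. The sketch below assumes the `simulate()` helper from earlier, that its output grid matches ChangingBeta.csv, and that the data's infections column is named `I`; all of these are assumptions.

```python
# Sketch of Q7a: non-negative, normalized ensemble weights over the beta = 0.10,
# 0.13, 0.16 configurations, fit by least squares to the provided data.
import numpy as np
import pandas as pd
from scipy.optimize import nnls

data = pd.read_csv("ChangingBeta.csv")         # infections column name assumed
true_I = data["I"].to_numpy()

# Each column: total-infections trajectory of one configuration, on the same
# time grid as the data (interventions off, original contact matrix).
configs = np.column_stack([simulate(b)[1][6:9].sum(axis=0)
                           for b in (0.10, 0.13, 0.16)])

weights, _ = nnls(configs, true_I)             # non-negative least squares
weights = weights / weights.sum()              # convex combination
ensemble_I = configs @ weights
rmse = float(np.sqrt(np.mean((ensemble_I - true_I) ** 2)))
print("weights:", weights, "RMSE:", rmse)
```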
Scenario 3 Summary Table
| Question | Inputs | Tasks | Outputs |
| --- | --- | --- | --- |
| Q1 | Model description | Extract equations; extract parameter values; iterate/curate extraction and execute the model until a test simulation gives reasonable results | Extracted models grounded with all variables and parameters defined, and with units; test simulation plot; time to do model extraction; time to execute extracted model and plot results |
| Q2 | Simulated data | Calibrate model to data | Calibrated $\beta$ values, and a single calibrated model using averaged data |
| Q3 | Calibrated model from Q2b | Estimate average treatment effects of interventions, with uncertainty; plot data against counterfactuals | Estimated ATE values with uncertainty; plots showing counterfactual scenarios |
| Q4 | Calibrated model from Q2b | Implement changes in contact matrix | Plots showing how changing the contact matrix affects the output; values for average treatment effect |
| Q5 | Simulated data | Conduct optimization for minimum mask efficacy | A plot showing that infections can be kept below 5 million on any given day for a particular value of mask efficacy |
| Q6 | Simulated data | Conduct optimization for the time of the first masking intervention | A plot showing when interventions need to start to keep total infections below 11 million on any given day |
| Q7 | Simulated data using a changing value of beta | Create an ensemble model with 3 configurations of the same model; calibrate ensemble model to the provided data; compute accuracy (RMSE) of the single model and the ensemble as compared to the true simulated data | Plots and RMSE calculations showing how well the calibrated single model and the ensemble models fit the simulated data |
Decision-maker Panel Questions
1. What is your confidence in understanding model results and tradeoffs between potential interventions? Select a score on a 7-point scale.
1. Very Low
2. Low
3. Somewhat Low
4. Neutral
5. Somewhat High
6. High
7. Very High
Explanation: Determine your confidence in being able to assess the effectiveness of all interventions considered in the scenario and understand how uncertainty factors into the results.
The decision-maker confidence score should be supported by the answers to the following questions:
· Do you understand the effects of interventions on trajectories? Was the effectiveness of interventions communicated?
· Is it clear how to interpret uncertainty in the results? Do you understand the key drivers of uncertainty in the results?
· Did models help you to understand what would have happened had a different course of action been taken in the past? How confident are you that the counterfactual analysis correctly explained what would have happened had a different course of action been taken?
· How confident are you that the analysis correctly identified and attributed responsibility to causal drivers in the scenario?