Following the suggestion from @cpfiffer from this issue, here I'm posting a self-contained script with a complex model whose calibration with Turing I may not be able optimize properly.
Unfortunately I am not a domain expert in MCMC or ADVI, so the best I could do was to try all the techniques and gridsearching all the arguments in the tutorials. I also implemented the suggestions received in these issues:
but I think that the amount of domain expertise required to properly tune the samplers and the ADVI is probably worth a question. This issue may of course be used as an example on how to tune Turing to calibrate models with over 30 parameters and around 15 time series to reproduce.
I would then like to present here a proxy to our model, and a calibration pipeline with ( I hope!) as many details as possible already set up , so that if you think it's appropriate you may help us optimize its calibration as much as Turing allows, exploring Turing's parameters and functionalities surely better and more efficiently than what I managed to.
Our model is an age-stratified S_E_I_H_ICU_R_D epidemiological model, with 32 parameters in total. We have 5 age classes, and the data we would like to calibrate the parameters against are 5 time series of hospitalizations ( one per age class), 5 time series of ICU occupancies and 5 time series of cumulated deaths.
Here is the code:
using DifferentialEquations
using Memoization, Turing, ReverseDiff, Zygote
using DiffEqParamEstim , Optim
using Distributions
import Statistics
using Bijectors # to use the more advanced interfaces of ADVI
using Bijectors: Scale, Shift
using Plots
pyplot()
using BenchmarkTools
# # I leave these commented for now, but I noticed that performance varies largely by playing with them.
# Turing.setredcache(false)
# Turing.setadbackend(:forwarddiff)
# define the epidemiological model as a DifferentialEquations.jl. the model is wrapped by a closure that allows to specify the last 32-12 = 20 parameters values. In this way, the 32-parameters model effectively becomes a 12-parameters model. The first 20 may be initialized with a previou ssuccessful calibration. This may let one reduce calibration times when testing.
function SEIH_ICU_RD!(; calibrated_parameters::Vector{Float64})
function parameterized_SEIH_ICU_RD!(du,u,p,t)
# passed parameters
if length(calibrated_parameters) == 0
β, # the transmissibility
ϵ, # incubation period
λ_IR_1, λ_IR_2, λ_IR_3, λ_IR_4, λ_IR_5, # age-stratified delays from infected to recovered ( without going through hospitlazation)
λ_IH_1, λ_IH_2, λ_IH_3, λ_IH_4, λ_IH_5, # age-stratified delays from infected to hospitalized
λ_HICU_1, λ_HICU_2, λ_HICU_3,λ_HICU_4, λ_HICU_5, # age-stratified delays from hopsitalized to ICU
λ_HR_1, λ_HR_2, λ_HR_3, λ_HR_4, λ_HR_5, # age-stratified delays from hospitalized to recovery
λ_ICUD_1, λ_ICUD_2, λ_ICUD_3, λ_ICUD_4, λ_ICUD_5, # age-stratified delays from ICU to death
λ_ICUR_1, λ_ICUR_2, λ_ICUR_3, λ_ICUR_4, λ_ICUR_5 = p # age-stratified delays from ICU to recovery
else
β,
ϵ,
λ_IR_1, λ_IR_2, λ_IR_3, λ_IR_4, λ_IR_5,
λ_IH_1, λ_IH_2, λ_IH_3, λ_IH_4, λ_IH_5 = p
λ_HICU_1, λ_HICU_2, λ_HICU_3,λ_HICU_4, λ_HICU_5,
λ_HR_1, λ_HR_2, λ_HR_3, λ_HR_4, λ_HR_5,
λ_ICUD_1, λ_ICUD_2, λ_ICUD_3, λ_ICUD_4, λ_ICUD_5,
λ_ICUR_1, λ_ICUR_2, λ_ICUR_3, λ_ICUR_4, λ_ICUR_5 = calibrated_parameters
end
# group parameters by age classes
λ_IR = [λ_IR_1, λ_IR_2, λ_IR_3, λ_IR_4, λ_IR_5]
λ_IH = [λ_IH_1, λ_IH_2, λ_IH_3, λ_IH_4, λ_IH_5]
λ_HICU = [λ_HICU_1, λ_HICU_2, λ_HICU_3, λ_HICU_4, λ_HICU_5]
λ_HR = [λ_HR_1, λ_HR_2, λ_HR_3, λ_HR_4, λ_HR_5]
λ_ICUD = [λ_ICUD_1, λ_ICUD_2, λ_ICUD_3, λ_ICUD_4, λ_ICUD_5]
λ_ICUR = [λ_ICUR_1, λ_ICUR_2, λ_ICUR_3, λ_ICUR_4, λ_ICUR_5]
# State variables
# Susceptibles
S = @view u[5*0+1:5*1]
# Exposed
E = @view u[5*1+1:5*2]
# Infected
I = @view u[5*2+1:5*3]
# Hospitalized
H = @view u[5*3+1:5*4]
# ICU occupancy
ICU = @view u[5*4+1:5*5]
# Cumulated Recovered
R = @view u[5*5+1:5*6]
# Daily Deaths ( age disaggregated)
D = @view u[5*6+1:5*7]
# Daily Hospistal admissions ( all those who get transfered to ICU are first admitted to the hospital )
AdH = @view u[5*7+1:5*8]
# Daily ICU admissions
AdICU = @view u[5*8+1:5*9]
# State variables differentials
dS = @view du[5*0+1:5*1]
dE = @view du[5*1+1:5*2]
dI = @view du[5*2+1:5*3]
dH = @view du[5*3+1:5*4]
dICU = @view du[5*4+1:5*5]
dR = @view du[5*5+1:5*6]
dD = @view du[5*6+1:5*7]
dAdH = @view du[5*7+1:5*8]
dAdICU = @view du[5*8+1:5*9]
# Force of infection
Λ = β * μ .* [sum([C_5[i,j]*(I[j])/N_5[j] for j in 1:size(C_5)[1]]) for i in 1:size(C_5)[2]] #β
# System of equations
@. dS = - Λ*S
@. dE = Λ*S - ϵ*E
@. dI = ϵ * E - ((1 - η )*λ_IR + η* λ_IH)*I
@. dH = η* λ_IH * I - (χ * λ_HICU + ( 1- χ ) * λ_HR ) * H
@. dICU = λ_HICU * χ * H - (δ_ICU * λ_ICUD + (1 - δ_ICU) * λ_ICUR) * ICU
@. dR = (1 - η ) * λ_IR * I + (1-χ)*λ_HR*H + (1-δ_ICU)*λ_ICUR*ICU
@. dD = δ_ICU*λ_ICUD*ICU
@. dAdH = η * λ_IH * I
@. dAdICU = λ_HICU * χ * H
end
end;
Model initial conditions, contact matrix and population:
# Initial conditions.
# contact matrix
const C_5 = [2.87954 3.07131 2.21196 0.343359 0.256795;
2.93036 4.6993 5.49855 1.01668 0.233958;
1.53385 3.99626 2.88881 1.00161 0.631737;
0.597362 1.85386 2.51295 1.41536 0.943012;
0.402476 0.38432 1.42786 0.849536 7.48054e-5 ]
# age-stratified population
const N_5 = [922868, 967257, 1330871, 530458, 588825]
# number of age classes
const num_age_classes = length(N_5)
# initial infected
const I₀ = repeat([5],num_age_classes)
# initial susceptibles
const S₀ = N_5 .- I₀
# initial exposed
const E₀ = [0 for n in 1:num_age_classes]
# initial hospitalized
const H₀ = [0 for n in 1:num_age_classes]
# initial ICUs
const ICU₀ = [0 for n in 1:num_age_classes]
# initial recovered
const R₀ = [0 for n in 1:num_age_classes]
# initial deaths
const D₀ = [0 for n in 1:num_age_classes]
# initial hopsitalizal admissons
const AdH₀ = [0 for n in 1:num_age_classes]
# initial ICU admissions
const AdICU₀ = [0 for n in 1:num_age_classes]
# collect initial condition in an array for the SEIH_ICU_RD! model
const B = vcat([S₀, E₀, I₀, H₀, ICU₀, R₀, D₀, AdH₀, AdICU₀ ]...);
Model initial parameter values:
# Model intial parameter values.
# The following values have been derived from literature and consitute the initial condition for the parameters ( epidemiological delays) to be calibrated
const P = [ 0.15,
0.1724137931034483,
0.5, 0.5, 0.5, 0.5, 0.5,
0.11834005925895003, 0.11834005925895003, 0.11834005925895003, 0.12104523689190094, 0.12104523689190094,
0.3846153745721452, 0.3846153745721452, 0.3846153745721452, 0.3846153745721452, 0.3846153745721452,
0.1111111111111111, 0.1111111111111111, 0.1111111111111111, 0.1111111111111111, 0.1111111111111111,
0.08522883763258698, 0.08522883763258698, 0.08522883763258698, 0.08522883763258698, 0.08522883763258698,
0.04947445517352199, 0.04947445517352199, 0.04947445517352199, 0.04947445517352199, 0.04947445517352199 ]
#The following values have been derived from literature and consitute the initial condition for the parameters ( epidemiological delays) that must remain constant and thus they are not to be calibrated
# susceptibility
const μ = [ 0.4759758925436791 ,0.8266397761918497 ,0.8280047127031845 ,0.8108310931308417 ,0.7399999999999999]
# hospitalization rate
const η = [0.0027585284135976107 ,0.032802784575350706 ,0.10232295466653041 ,0.20404289877803708 ,0.261949855298025]
# ICU fraction
const χ = [ 0.05 ,0.053934808432505525 ,0.1401166566857344 ,0.35206205203805013 ,0.6069703305850978]
# ICU conditioned fatality rate
const δ_ICU = [0.007842851966738704, 0.018512721107622265, 0.06495525917707529, 0.1850599579504777, 0.45689350535412815];
You may check that the model compiles and runs correctly
# you may check it works
problem = ODEProblem(SEIH_ICU_RD!(calibrated_parameters = Float64[]) , B, (1.0, 120.0), P)
solution = solve(problem, Tsit5(), saveat = 1:120);
plot(solution.t, [slice[6] for slice in solution.u ])
# and you can toolktize it to get 5x better solve performance
fast_problem = ODEProblem(modelingtoolkitize(problem), B, (1.0, 120.0), P)
fast_solution = solve(fast_problem, Tsit5(), saveat = 1:120);
plot(fast_solution.t, [slice[6] for slice in fast_solution.u ])
Now to the calibration. We will show that the model is "calibratable" with these data by using Optim via DiffeentialEquations.jl. It may also be useful to get starting values for other optimization algorithms.
# prepare data ( and convert them to Int)
const optim_data = convert(Array,convert(Array, VectorOfArray([Array{Int64,1}(floor.(data)) for data in vcat(H_calibration_data, ICU_calibration_data, D_calibration_data) ]))')
# define indexes w.r.t. compare the data
const save_idxs = vcat(5*3+1:5*4, 5*4+1:5*5 ,5*6+1:5*7);
# set bounds for FminBox
const lower = [uniform.a for uniform in priors];
const upper = [uniform.b for uniform in priors];
problem = ODEProblem(SEIH_ICU_RD!(calibrated_parameters = Float64[]) , B, T, P)
fast_problem = ODEProblem(modelingtoolkitize(problem), B, T, P; jac = true, sparse = true)
cost_function = build_loss_objective(fast_problem,Tsit5(),L2Loss(t,optim_data), maxiters=5*10^7,verbose=true, save_idxs=save_idxs)
result_Optim = @time Optim.optimize( cost_function,lower,upper, P, Optim.Fminbox(BFGS()) ,Optim.Options(show_trace = false , iterations = 200,time_limit = 180)) # will take approx 1 minute and a half.
One may plot the simuated trajectories using the calibrated parameters togheter with the data used to calibrate
# get calibrated parameters
const P_optim = result_Optim.minimizer;
# get the soluton with calibrated parameters
const optim_calibrated_solution = solve(fast_problem , Tsit5(); p = P_optim , saveat = 1:30 );
# plot solution using calibrated parameters togheter with calibration points
# check that calibration was somewhat successful: please keep in mind that the data are fake, and the time series that are poorly reproduced by the model tend to have a maximum value of <10, so they are probably ignored by the loss during calibration. Anyway there are techniques to account for it, but we'll omit them here for the sake of simplicity.
plots =[]
for (i, idx) in enumerate(save_idxs)
p = plot(optim_calibrated_solution.t , [optim_calibrated_solution.u[j][idx] for j in t], size = (1400, 3000), label = "simulated")
plot!(optim_calibrated_solution.t , optim_data[i,:], seriestype = :scatter , label = "data")
push!(plots, p)
end
plot(plots..., layout = (length(plots), 1))
now let's try to calibrate using NUTS. This is the Turing model we are using:
# now let's write down the turing model we are currently using
@model function turing_model(data::Array{Int64,2} , problem::ODEProblem, priors , save_idxs::Array{Int64,1} , 𝒯::Tuple{Float64,Float64} , ::Type{T} = Float64 ) where {T}
σ ~ InverseGamma(2, 3)
p ~ arraydist(priors)
predicted::Vector{Vector{T}} = solve(problem, Tsit5(), p = p, saveat=𝒯[1]:𝒯[2], save_idxs = save_idxs).u
# when the solve doesn'y converge, predicted has length 1. We detect it, and return -Inf loglikelihood to reject the sample.
if length(predicted) == size(data, 2)
for i = 1:length(predicted)
data[:,i] ~ MvNormal(predicted[i], σ)
end
else
Turing.@addlogprob! -Inf
return
end
end
Let's calibrate the model:
# instamtiate turing model
tur_mod = turing_model(optim_data, fast_problem, priors, save_idxs, T )
# sample from posterior. It takes approx 5 minutes
chain = @time sample(tur_mod, NUTS(), 1000) # using MCMCThreads() over multilple chains seriously slows it down. Should one use MCMCDistributed() preferably?
Get the parameters and check it correctly calibrated the time series:
# get the parameters
P_turing = vcat([Statistics.mean(chain["p[$i]"]) for i in 1:32])
# evaluate the solution with such parameters
const turing_calibrated_solution = solve(fast_problem , Tsit5(); p = P_turing , saveat = 1:30 );
# plot
plots =[]
for (i, idx) in enumerate(save_idxs)
p = plot(turing_calibrated_solution.t , [turing_calibrated_solution.u[j][idx] for j in t], size = (1400, 3000), label = "simulated")
plot!(turing_calibrated_solution.t , optim_data[i,:], seriestype = :scatter , label = "data")
push!(plots, p)
end
plot(plots..., layout = (length(plots), 1))
Sample from the multivariate posteriors and plot the simulations togheter with the data
# sample from the posterior
extractions_advi = [rand(q) for i in 1:10000] ;
#evaluate the parameter values
P_advi = [Statistics.mean([extraction[i] for extraction in extractions_advi]) for i in 2:33]
# get the advi calibrated solution
const advi_calibrated_solution = solve(fast_problem, Tsit5(); p = P_advi , saveat = 1:30 );
# plot advi results
plots =[]
for (i, idx) in enumerate(save_idxs)
p = plot(advi_calibrated_solution.t , [advi_calibrated_solution.u[j][idx] for j in t], size = (1400, 3000), label = "simulated")
plot!(advi_calibrated_solution.t , optim_data[i,:], seriestype = :scatter , label = "data")
push!(plots, p)
end
plot(plots..., layout = (length(plots), 1))
So trying also other models it seems that NUTS performs generally better than ADVI, but we are far from sure.
How would you calibrate this 32-parameter model? What exact setup would you use? Could you please present your practical reasoning, so that non-domain experts may try to reproduce it in other scenarios?
Thanks in advance
EDIT1: as suggested in this comment I moved save_idxs as a keyword argument to the solve inside the turing_model.
Hello,
Following the suggestion from @cpfiffer from this issue, here I'm posting a self-contained script with a complex model whose calibration with Turing I may not be able optimize properly.
Unfortunately I am not a domain expert in MCMC or ADVI, so the best I could do was to try all the techniques and gridsearching all the arguments in the tutorials. I also implemented the suggestions received in these issues:
but I think that the amount of domain expertise required to properly tune the samplers and the ADVI is probably worth a question. This issue may of course be used as an example on how to tune Turing to calibrate models with over 30 parameters and around 15 time series to reproduce.
I would then like to present here a proxy to our model, and a calibration pipeline with ( I hope!) as many details as possible already set up , so that if you think it's appropriate you may help us optimize its calibration as much as Turing allows, exploring Turing's parameters and functionalities surely better and more efficiently than what I managed to.
Our model is an age-stratified S_E_I_H_ICU_R_D epidemiological model, with 32 parameters in total. We have 5 age classes, and the data we would like to calibrate the parameters against are 5 time series of hospitalizations ( one per age class), 5 time series of ICU occupancies and 5 time series of cumulated deaths. Here is the code:
Model initial conditions, contact matrix and population:
Model initial parameter values:
Model parameter priors
Time span and time interval we are going to calibrate and solve within:
Hard coded calibration data. Each timeseries coresponds to an age class.
You may check that the model compiles and runs correctly
And we can also optimize it via ModelingToolkit.jl:
Now to the calibration. We will show that the model is "calibratable" with these data by using Optim via DiffeentialEquations.jl. It may also be useful to get starting values for other optimization algorithms.
One may plot the simuated trajectories using the calibrated parameters togheter with the data used to calibrate
now let's try to calibrate using NUTS. This is the Turing model we are using:
Let's calibrate the model:
Get the parameters and check it correctly calibrated the time series:
We can also try ADVI
Sample from the multivariate posteriors and plot the simulations togheter with the data
So trying also other models it seems that NUTS performs generally better than ADVI, but we are far from sure.
How would you calibrate this 32-parameter model? What exact setup would you use? Could you please present your practical reasoning, so that non-domain experts may try to reproduce it in other scenarios?
Thanks in advance
EDIT1: as suggested in this comment I moved
save_idxs
as a keyword argument to thesolve
inside theturing_model
.