elfi-dev / elfi

ELFI - Engine for Likelihood-Free Inference
http://elfi.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Basic rejection method of inferring parameters of an SIR disease model #457

Closed isa-a closed 1 year ago

isa-a commented 1 year ago

Summary:

I have a basic SIR model I'm trying to infer parameters for. I am using ELFI rejection sampling here, and for observed data I have a 1-dimensional array of values output from the ODE system.

Description:

The values my programme infers are not sensible, and using large sample sizes uses more memory than my machine can provide. For example, where the parameters should be close to 8 and 0.4 respectively, the inferences are way off. I should note this model is deterministic, not stochastic.

Reproducible Steps:

Here is code that can be reproduced easily. Up until the derivative function, I am just creating a simple function that computes the SIR model ODEs and solving them. For observed data, I used the output of I, which is one of the trajectories in my model. In the derivative function, I am telling it I want to infer beta and gamma, and resultz computes the ODE solution again. I then set priors for beta and gamma, declare the ELFI distance as the sum of squared errors, and run the rejection method. I also tried using narrower ranges for the priors, but that doesn't seem to change anything.

import elfi
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

seed = 20170530  # this will be separately given to ELFI
np.random.seed(seed)

# Total population, N.
N = 1
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 0.001, 0
# Everyone else, S0, is susceptible to infection initially.
U0 = N - I0 - R0
J0 = I0
Lf0, Ls0 = 0, 0
# Contact rate, beta, and mean recovery rate, gamma, (in 1/days).
beta, gamma = 8, 0.4
mu, muTB, sigma, rho = 1/80, 1/6, 1/6, 0.03
u, v, w = 0.88, 0.083, 0.0006
t = np.linspace(0, 500, 500+1)

# The SIR model differential equations.
def deriv(y, t, N, beta, gamma, mu, muTB, sigma, rho, u, v, w):
    U, Lf, Ls, I, R, cInc = y
    b = (mu * (U + Lf + Ls + R)) + (muTB * I)
    lamda = beta * I
    clamda = 0.2 * lamda
    dU = b - ((lamda + mu) * U)
    dLf = (lamda*U) + ((clamda)*(Ls + R)) - ((u + v + mu) * Lf)
    dLs = (u * Lf) - ((w + clamda + mu) * Ls)
    dI = w*Ls + v*Lf - ((gamma + muTB + sigma) * I) + (rho * R)
    dR = ((gamma + sigma) * I) - ((rho + clamda + mu) * R)
    cI = w*Ls + v*Lf + (rho * R)
    return dU, dLf, dLs, dI, dR, cI

# Integrate the SIR equations over the time grid, t.
solve = odeint(deriv, (U0, Lf0, Ls0, I0, R0, J0), t, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))
U, Lf, Ls, I, R, cInc = solve.T

#try using I as observed data
y_obs_I = np.array(I)
y_obs_I = y_obs_I.flatten()

def derivative(beta, gamma, batch_size = 1, random_state = None):

    y0 = [U0, Lf0, Ls0, I0, R0, J0]

    times = np.linspace(0, 500, 500+1)

    resultz = odeint(deriv, y0, times, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))

    return resultz.flatten()

beta_prior = elfi.Prior('uniform', 6, 9, name = 'beta')
gamma_prior = elfi.Prior('uniform', 0, 1, name = 'gamma')

Y_lv = elfi.Simulator(derivative, beta_prior, gamma_prior, observed = y_obs_I)

# Define sum of squared errors as the distance function
def SSE(x, y):
    return np.atleast_1d(np.sum((x - y)**2))

d_lv = elfi.Distance(SSE, Y_lv)

rej = elfi.Rejection(d_lv, batch_size = 1, seed = seed)
result = rej.sample(100, quantile = 0.01)

fig, ax = plt.subplots()
rej.plot_state(ax = ax)
ax.set_xlim([0, 11])
ax.set_ylim([0, 1])
plt.savefig("parameter_space.png")
plt.close()

Current Output:

The plot shows the inferences. The original values are beta = 8 and gamma = 0.4, so these estimates are quite inaccurate. The code output also shows the extracted results as statistics:

Method: Rejection
Number of samples: 100
Number of simulations: 10000
Threshold: 1.78e+05
Parameter                Mean               2.5%              97.5%
beta:                   6.412              6.012              6.948
gamma:                  0.736              0.604              0.854

[Figure: parameter_space plot showing the estimates]

Expected Output:

Values closer to the true values.

ELFI Version:

0.8.4

Python Version:

Python 3.8.8

Operating System:

Windows 10

hpesonen commented 1 year ago

Hi! The problem here would be the dimensionality of the output of the simulator. The six time series for the state variables [U0, Lf0, Ls0, I0, R0, J0] are concatenated into one vector of length 3006. To get elfi.Rejection working, you'd need to reduce the dimensionality of the problem with summary statistics.
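As a quick illustration (a sketch reusing the derivative function and y_obs_I from the report above), you can see the mismatch directly:

sim_out = derivative(8, 0.4)   # six trajectories flattened into one vector
print(sim_out.shape)           # (3006,) = 501 time points * 6 state variables
print(y_obs_I.shape)           # (501,)  = the single observed I trajectory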

In the following code, the simulator is vectorized, a summary node picks out the I trajectory, and a second summary (the autocovariance) compresses it further before the distance is computed. Hope this helps!

import elfi
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

seed = 20170530  # this will be separately given to ELFI
np.random.seed(seed)

# Total population, N.
N = 1
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 0.001, 0
# Everyone else, S0, is susceptible to infection initially.
U0 = N - I0 - R0
J0 = I0
Lf0, Ls0 = 0, 0
# Contact rate, beta, and mean recovery rate, gamma, (in 1/days).
beta0, gamma0 = 8, 0.4
mu, muTB, sigma, rho = 1/80, 1/6, 1/6, 0.03
u, v, w = 0.88, 0.083, 0.0006
t = np.linspace(0, 500, 500+1)

# The SIR model differential equations.
def deriv(y, t, N, beta, gamma, mu, muTB, sigma, rho, u, v, w):
    U, Lf, Ls, I, R, cInc = y
    b = (mu * (U + Lf + Ls + R)) + (muTB * I)
    lamda = beta * I
    clamda = 0.2 * lamda
    dU = b - ((lamda + mu) * U)
    dLf = (lamda*U) + ((clamda)*(Ls + R)) - ((u + v + mu) * Lf)
    dLs = (u * Lf) - ((w + clamda + mu) * Ls)
    dI = w*Ls + v*Lf - ((gamma + muTB + sigma) * I) + (rho * R)
    dR = ((gamma + sigma) * I) - ((rho + clamda + mu) * R)
    cI = w*Ls + v*Lf + (rho * R)
    return dU, dLf, dLs, dI, dR, cI

def derivative(beta, gamma, batch_size = 1, random_state = None):

    y0 = [U0, Lf0, Ls0, I0, R0, J0]

    times = np.linspace(0, 500, 500+1)

    resultz = odeint(deriv, y0, times, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))

    return resultz

def SSE(x, y):
    return np.atleast_1d(np.sum((x-y) ** 2, axis=1))

def select_var(X, variable=3):
    """
    variable name
       0       U0
       1      Lf0
       2      Ls0
       3       I0
       4       R0
       5       J0
    """
    return X[:,:,variable]

def autocov(x, lag=1):
    x = np.atleast_2d(x)
    # In R this is normalized with x.shape[1]
    C = np.mean(x[:, lag:] * x[:, :-lag], axis=1)
    return C

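# elfi.tools.vectorize wraps the scalar simulator: it runs derivative once per
# (beta, gamma) draw in a batch and stacks the (501, 6) solutions into a
# single array of shape (batch_size, 501, 6).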
vectorized_derivative = elfi.tools.vectorize(derivative)

y_obs_I = vectorized_derivative(beta0, gamma0)

model = elfi.new_model()

elfi.Prior('uniform', 6, 9, name = 'beta', model=model)
elfi.Prior('uniform', 0, 1, name = 'gamma', model=model)

elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')

elfi.Summary(select_var, model['Y_lv'], 3, name='I')  # keep only the I trajectory

elfi.Summary(autocov, model['I'], 1, name='S1')  # lag-1 autocovariance of I

elfi.Distance(SSE, model['S1'], name='d_lv')  # SSE between simulated and observed S1

rej = elfi.Rejection(model['d_lv'], batch_size = 10, seed = seed)
result = rej.sample(100, quantile = 0.01)

isa-a commented 1 year ago

Hello, thanks for your time on this issue. It makes sense that the dimensions were incorrect on mine; since I had 6 trajectories, it did occur to me to wonder how this would fit in. From your amended code, I found that the select_var summary node works best and produced the most accurate inference. I also scrapped the SSE distance and used Euclidean distance instead. A couple of things I was confused about:

In the code

def select_var(X, variable=3):
    """
    variable name
       0       U0
       1      Lf0
       2      Ls0
       3       I0
       4       R0
       5       J0
    """
    return X[:,:,variable]

What is the slicing with the colons in return X[:,:,variable] doing? Because later, when it is used in

elfi.Summary(select_var, model['Y_lv'], 3, name='I')

the arguments of the original function, X and variable=3, are not used explicitly.

Also, the simulator

elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')

generates an array with various different values within it. Are these toy simulations created by the derivative model?

hpesonen commented 1 year ago

> What is the slicing with the colons in return X[:,:,variable] doing? Because later, when it is used in elfi.Summary(select_var, model['Y_lv'], 3, name='I'), the arguments of the original function, X and variable=3, are not used explicitly.

In the definition of the elfi.Summary node here, the first input select_var is the function defining how the summary statistic is calculated, and the other two, model['Y_lv'] and 3, are the inputs (in this order) for select_var. The reason we're slicing X[:, :, variable] is that the output of model['Y_lv'] (the simulator in this case) has the dimensions [batch_index] [time] [U, Lf, Ls, I, R, J]. The name select_var is just for clarity and is not something that you need to define.
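As a rough illustration (the array here is just a stand-in, not part of the model), ELFI passes the parent nodes' outputs to select_var positionally, so a simulated batch becomes X and the constant 3 becomes variable:

batch_Y = np.random.rand(10, 501, 6)  # stand-in for one simulator batch
I_only = select_var(batch_Y, 3)       # what the Summary node computes
print(I_only.shape)                   # (10, 501): the I trajectory per batch item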

> Also, the simulator
>
> elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')
>
> generates an array with various different values within it. Are these toy simulations created by the derivative model?

This would be the simulator output of the form [batch_index] [time] [U, Lf, Ls, I, R, J], generated with the inputs defined by model['beta'] and model['gamma'], i.e. in this case the prior nodes.
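If you want to inspect those simulations yourself, one way (a sketch, assuming the model built above) is to sample the node directly:

sims = model['Y_lv'].generate(batch_size=3)
print(sims.shape)  # expected (3, 501, 6): [batch_index] [time] [U, Lf, Ls, I, R, J]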

isa-a commented 1 year ago

Ok, that clears things up. I found that the autocovariance wasn't as good a summary statistic as the other one, which yielded much better inferences. I know you said this framework is best suited to stochastic and complex models; I found ELFI through previous work with stochastic Lotka-Volterra systems. The reason I attempted this on a deterministic model is that the other MCMC libraries didn't seem to mesh well for me, but as I will be using a likelihood in the long run, I will have to go back and look for others. Nevertheless, thanks for the ticket.
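For reference, a minimal sketch of the setup described above (my reconstruction, not code from the thread): keep the select_var summary node and swap the SSE node for ELFI's built-in Euclidean distance.

# Reuses the model built earlier in the thread; 'd_euc' is a hypothetical name.
elfi.Distance('euclidean', model['I'], name='d_euc')
rej2 = elfi.Rejection(model['d_euc'], batch_size=10, seed=seed)
result2 = rej2.sample(100, quantile=0.01)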