elfi-dev / elfi

ELFI - Engine for Likelihood-Free Inference
http://elfi.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Basic rejection method of inferring parameters of an SIR disease model #457

Closed: isa-a closed this 1 year ago

isa-a commented 1 year ago

Summary:

I have a basic SIR model whose parameters I am trying to infer. I am using ELFI's rejection sampler here, and for observed data I have a 1-dimensional array of values output from the ODE system.

Description:

The values my programme infers are not sensible, and on top of that, using large sample sizes takes more memory than my machine can provide. For example, where the parameters should be close to 8 and 0.4 respectively, the inferences are way off. I should note this model is deterministic, not stochastic.

Reproducible Steps:

Here is code that reproduces the problem easily. Up to the deriv function, I am just creating a simple function that computes the SIR model ODEs and solves them. For observed data, I used the output of I, which is one of the trajectories in my model. In the derivative function, I then tell it I want to infer beta and gamma, and resultz recalculates the ODE solution. I then set priors for beta and gamma, declare the ELFI distance as the sum of squared errors, and run the rejection method. I also tried using narrower ranges for the priors, but that doesn't seem to do anything.

import elfi
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

seed = 20170530  # this will be separately given to ELFI
np.random.seed(seed)

# Total population, N.
N = 1
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 0.001, 0
# Everyone else, S0, is susceptible to infection initially.
U0 = N - I0 - R0
J0 = I0
Lf0, Ls0 = 0, 0
# Contact rate, beta, and mean recovery rate, gamma (in 1/days).
beta, gamma = 8, 0.4
mu, muTB, sigma, rho = 1/80, 1/6, 1/6, 0.03
u, v, w = 0.88, 0.083, 0.0006
t = np.linspace(0, 500, 500+1)

# The SIR model differential equations.
def deriv(y, t, N, beta, gamma, mu, muTB, sigma, rho, u, v, w):
    U, Lf, Ls, I, R, cInc = y
    b = (mu * (U + Lf + Ls + R)) + (muTB * I)
    lamda = beta * I
    clamda = 0.2 * lamda
    dU = b - ((lamda + mu) * U)
    dLf = (lamda*U) + ((clamda)*(Ls + R)) - ((u + v + mu) * Lf)
    dLs = (u * Lf) - ((w + clamda + mu) * Ls)
    dI = w*Ls + v*Lf - ((gamma + muTB + sigma) * I) + (rho * R)
    dR = ((gamma + sigma) * I) - ((rho + clamda + mu) * R)
    cI = w*Ls + v*Lf + (rho * R)
    return dU, dLf, dLs, dI, dR, cI

# Integrate the SIR equations over the time grid, t.
solve = odeint(deriv, (U0, Lf0, Ls0, I0, R0, J0), t, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))
U, Lf, Ls, I, R, cInc = solve.T

# Try using I as observed data
y_obs_I = np.array(I)
y_obs_I = y_obs_I.flatten()

def derivative(beta, gamma, batch_size = 1, random_state = None):
    # Initial state and time grid, same as above
    y0 = [U0, Lf0, Ls0, I0, R0, J0]
    times = np.linspace(0, 500, 500+1)
    # Solve the ODEs again for the candidate beta and gamma
    resultz = odeint(deriv, y0, times, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))
    # Flatten all six trajectories into a single vector
    return resultz.flatten()

beta_prior = elfi.Prior('uniform', 6, 9, name = 'beta')
gamma_prior = elfi.Prior('uniform', 0, 1, name = 'gamma')

Y_lv = elfi.Simulator(derivative, beta_prior, gamma_prior, observed = y_obs_I)

# Define sum of squared errors as the distance function
def SSE(x, y):
    return np.atleast_1d(np.sum((x - y) ** 2))

d_lv = elfi.Distance(SSE, Y_lv)

rej = elfi.Rejection(d_lv, batch_size = 1, seed = seed)
result = rej.sample(100, quantile = 0.01)

fig, ax = plt.subplots()
rej.plot_state(ax = ax)
ax.set_xlim([0, 11])
ax.set_ylim([0, 1])
plt.savefig("parameter_space.png")
plt.close()

Current Output:

The plot shows the inferred values. The original values were beta = 8 and gamma = 0.4, so these estimates are quite inaccurate. The code output also shows the extracted results as statistics:

Method: Rejection
Number of samples: 100
Number of simulations: 10000
Threshold: 1.78e+05
Parameter                Mean               2.5%              97.5%
beta:                   6.412              6.012              6.948
gamma:                  0.736              0.604              0.854

[parameter_space plot showing the estimates]

Expected Output:

Values closer to the true values.

ELFI Version:

0.8.4

Python Version:

Python 3.8.8

Operating System:

Windows 10

hpesonen commented 1 year ago

Hi! The problem here is the dimensionality of the simulator output: the six time series of the state variables (U, Lf, Ls, I, R, cInc) are concatenated into one vector of length 3006. To get elfi.Rejection working, you need to reduce the dimensionality of the problem with summary statistics.

The code below shows one way to do this: one summary node picks the I trajectory out of the simulator output, and a second one computes its lag-1 autocovariance before the distance is evaluated.

Hope this helps!

import elfi
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

seed = 20170530  # this will be separately given to ELFI
np.random.seed(seed)

# Total population, N.
N = 1
# Initial number of infected and recovered individuals, I0 and R0.
I0, R0 = 0.001, 0
# Everyone else, S0, is susceptible to infection initially.
U0 = N - I0 - R0
J0 = I0
Lf0, Ls0 = 0, 0
# Contact rate, beta, and mean recovery rate, gamma (in 1/days).
beta0, gamma0 = 8, 0.4
mu, muTB, sigma, rho = 1/80, 1/6, 1/6, 0.03
u, v, w = 0.88, 0.083, 0.0006
t = np.linspace(0, 500, 500+1)

# The SIR model differential equations.
def deriv(y, t, N, beta, gamma, mu, muTB, sigma, rho, u, v, w):
    U, Lf, Ls, I, R, cInc = y
    b = (mu * (U + Lf + Ls + R)) + (muTB * I)
    lamda = beta * I
    clamda = 0.2 * lamda
    dU = b - ((lamda + mu) * U)
    dLf = (lamda*U) + ((clamda)*(Ls + R)) - ((u + v + mu) * Lf)
    dLs = (u * Lf) - ((w + clamda + mu) * Ls)
    dI = w*Ls + v*Lf - ((gamma + muTB + sigma) * I) + (rho * R)
    dR = ((gamma + sigma) * I) - ((rho + clamda + mu) * R)
    cI = w*Ls + v*Lf + (rho * R)
    return dU, dLf, dLs, dI, dR, cI

def derivative(beta, gamma, batch_size = 1, random_state = None):
    y0 = [U0, Lf0, Ls0, I0, R0, J0]
    times = np.linspace(0, 500, 500+1)
    resultz = odeint(deriv, y0, times, args=(N, beta, gamma, mu, muTB, sigma, rho, u, v, w))
    # Return the full (time, variable) array; batching is handled by elfi.tools.vectorize
    return resultz

def SSE(x, y):
    # One squared-error value per batch member (sum over the summary axis)
    return np.atleast_1d(np.sum((x-y) ** 2, axis=1))

def select_var(X, variable=3):
    """
    variable name
       0       U0
       1      Lf0
       2      Ls0
       3       I0
       4       R0
       5       J0
    """
    return X[:,:,variable]

def autocov(x, lag=1):
    x = np.atleast_2d(x)
    # In R this is normalized with x.shape[1]
    C = np.mean(x[:, lag:] * x[:, :-lag], axis=1)
    return C

# Wrap the simulator so that it handles batches of parameter values
vectorized_derivative = elfi.tools.vectorize(derivative)

# Observed data: one simulation at the true parameter values
y_obs_I = vectorized_derivative(beta0, gamma0)

model = elfi.new_model()

elfi.Prior('uniform', 6, 9, name = 'beta', model=model)
elfi.Prior('uniform', 0, 1, name = 'gamma', model=model)

elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')

elfi.Summary(select_var, model['Y_lv'], 3, name='I')

elfi.Summary(autocov, model['I'], 1, name='S1')

elfi.Distance(SSE, model['S1'], name='d_lv')

rej = elfi.Rejection(model['d_lv'], batch_size = 10, seed = seed)
result = rej.sample(100, quantile = 0.01)
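
The resulting Sample object can then be inspected as in your original post; a minimal sketch using ELFI's built-in helpers:

# Print the mean and 95% credible interval of each parameter
result.summary()

# Plot the marginal posterior of each parameter
result.plot_marginals()
plt.savefig("marginals.png")
plt.close()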

isa-a commented 1 year ago

Hello, thanks for your time on this issue. It makes sense that the dimensions were wrong in mine; it had occurred to me to wonder how my 6 trajectories would fit in. From your amended code, I found that the select_var summary node works best and produced the most accurate inference. I also scrapped the SSE distance and used the Euclidean distance instead (a sketch of that swap is at the end of this comment). A couple of things I was confused about:

In the code

def select_var(X, variable=3):
    """
    variable name
       0       U0
       1      Lf0
       2      Ls0
       3       I0
       4       R0
       5       J0
    """
    return X[:,:,variable]

What is the slicing with the colons in return X[:,:,variable]? Because later, when it is used in

elfi.Summary(select_var, model['Y_lv'], 3, name='I'), the arguments of the original function, X and variable=3, are not passed explicitly.

Also, the simulator

elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')

generates an array with many different values in it. Are these toy simulations created by the derivative model?
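
As an aside, here is the distance swap I mentioned above, as a minimal sketch; 'euclidean' is one of the metric names that elfi.Distance accepts and forwards to scipy, and I compute it directly on the I summary node:

# Built-in Euclidean distance on the selected I trajectory
d_euc = elfi.Distance('euclidean', model['I'], name='d_euc')
rej = elfi.Rejection(model['d_euc'], batch_size = 10, seed = seed)
result = rej.sample(100, quantile = 0.01)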

hpesonen commented 1 year ago

In the code

def select_var(X, variable=3):
    """
    variable name
       0       U0
       1      Lf0
       2      Ls0
       3       I0
       4       R0
       5       J0
    """
    return X[:,:,variable]

What is the slicing with the colons in return X[:,:,variable]? Because later, when it is used in

elfi.Summary(select_var, model['Y_lv'], 3, name='I'), the arguments of the original function, X and variable=3, are not passed explicitly.

In the definition of the elfi.Summary node here, the first input, select_var, is the function that defines how the summary statistic is calculated; the other two, model['Y_lv'] and 3, are the inputs (in that order) passed to select_var. The reason we slice X[:, :, variable] is that the output of model['Y_lv'] (the simulator in this case) has the dimensions [batch_index] [time] [U, Lf, Ls, I, R, J]. The name select_var is just for clarity; it is not something special that you need to define.
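
To make the slicing concrete, here is a small standalone NumPy sketch (the array is random and purely illustrative of the [batch_index] [time] [variable] layout):

import numpy as np

# Illustrative simulator output: 10 batch members, 501 time points, 6 state variables
X = np.random.rand(10, 501, 6)

# select_var keeps every batch member and every time point,
# but only column 3, i.e. the I trajectory
I_trajectories = X[:, :, 3]
print(I_trajectories.shape)  # (10, 501)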

Also, the simulator

elfi.Simulator(vectorized_derivative, model['beta'], model['gamma'], observed = y_obs_I, name='Y_lv')

generates an array with many different values in it. Are these toy simulations created by the derivative model?

This is the simulator output, of the form [batch_index] [time] [U, Lf, Ls, I, R, J], generated with inputs drawn from model['beta'] and model['gamma'], i.e. in this case the prior nodes.
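
To see this concretely, you can generate data from any node of the model; a minimal sketch (each batch member uses its own draw from the priors):

# Draw three forward simulations straight from the simulator node
sims = model['Y_lv'].generate(batch_size=3)
print(sims.shape)  # (3, 501, 6): [batch_index] [time] [U, Lf, Ls, I, R, J]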

isa-a commented 1 year ago

Ok, that clears things up. I found that the autocovariance wasn't as good a summary statistic as the other one, which yielded much better inferences. I know you said this framework is best suited to stochastic and complex models; I found ELFI through previous work with stochastic Lotka-Volterra systems. The reason I attempted this on a deterministic model is that the other MCMC libraries didn't seem to mesh well for me, but as I will be using a likelihood in the long run, I will have to go back and look at others. Nevertheless, thanks for the ticket.