kingaa / pomp

R package for statistical inference using partially observed Markov processes
https://kingaa.github.io/pomp
GNU General Public License v3.0

MIF unable to fit to the data #167

Closed mehrdadfazli closed 2 years ago

mehrdadfazli commented 2 years ago

Hi,

I am working on fitting an SEIR model to my data. I have two observed variables (V and Y), and I add their log-likelihoods to get the total log-likelihood. The model runs smoothly: I can get reasonable simulations from some good guess parameters, and particle filtering works fine too. I ran a search over 2400 parameter sets (400 mif iterations each, with a decreasing cooling factor). However, the optimal parameters produce simulations that are far off from the data compared with my guess parameters, and yet they have much higher log-likelihoods. I expected parameters that generate simulations closer to the data to yield higher log-likelihoods, but that is not the case for my code. I am not sure where I am going wrong, and I appreciate your feedback on this.

I also tried keeping the standard deviation of V fixed, scaling it with sqrt(mu_V), or scaling it linearly with the mean, and none of those solved the issue.
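
For reference, here is a rough sketch of how each start of that search is handled (object names such as measSEIR and guess, and all numeric settings, are placeholders rather than my exact code):

# rough sketch of one start of the iterated-filtering search (placeholder settings)
library(pomp)

m1 <- mif2(
  measSEIR,                      # pomp object assembled from the snippets below
  params = guess,                # one of the 2400 starting parameter sets
  Np = 2000,                     # particles per filtering pass (placeholder)
  Nmif = 400,                    # 400 mif iterations
  cooling.fraction.50 = 0.5,     # geometric cooling (placeholder value)
  rw.sd = rw_sd(
    b0 = 0.02, phi = 0.02, omega = 0.02, alpha = 0.02,
    rho_V = 0.02, od_V = 0.02, sigmaSE = 0.02, br = 0.02,
    eta = ivp(0.02)              # initial-value parameter, perturbed at t0 only
  )
)

# log-likelihood at the end point, from replicate particle filters
ll <- logmeanexp(replicate(10, logLik(pfilter(m1, Np = 2000))), se = TRUE)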

Model Components

#########################################################################
#--------------------------|  rproc  |----------------------------------#
#########################################################################
rproc <- Csnippet("
                  double beta, foi, dw, births, mu_SE;

                  //we only consider those that participate in the epidemic:
                  double pop = S + E + I + R;

                  // transmission rate
                  beta = b0*(1+phi*sdmm);

                  // expected force of infection. iota: imported infections
                  // alpha mixing parameter, = 1:homogeneous mixing
                  foi = beta*pow(I+iota, alpha)/pop;

                  // white noise (extrademographic stochasticity)
                  dw = rgammawn(sigmaSE,dt);

                  mu_SE = foi*dw/dt;  // stochastic force of infection

                  // Poisson births: fraction of leak into S from N
                  births = rpois(br*dt*(1+omega*event));

                  // State Updates:
                  double dN_SE  = rbinom(S , 1-exp(-mu_SE  *dt));
                  double dN_EI  = rbinom(E , 1-exp(-mu_EI  *dt));
                  double dN_IR  = rbinom(I , 1-exp(-mu_IR  *dt));
                  S += births - dN_SE;
                  E += dN_SE  - dN_EI;
                  I += dN_EI  - dN_IR;
                  R += dN_IR;
                  W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
                  ")

#########################################################################
#--------------------------|  rinit  |----------------------------------#
#########################################################################
rinit <- Csnippet("
                  double m = eta*N;
                  S = nearbyint(m*S_0);
                  E = nearbyint(m*E_0);
                  I = nearbyint(m*I_0);
                  R = nearbyint(m*R_0);
                  W = 0;
                  ")

#########################################################################
#--------------------------|  dmeas  |----------------------------------#
#########################################################################
dmeas <- Csnippet("
                  // Model for Viral Load
                  double shed_cases = E + I;
                  double mu_V = rho_V*shed_cases;
                  double std_V = sqrt(mu_V*(1+od_V));
                  double lik_V = dnorm(V, mu_V, std_V, 1);

                  // Model for Case Counts
                  double mu_Y = rho_Y*I;
                  double lik_Y = dnbinom_mu(Y, od_Y, mu_Y, 1);

                  // Combined likelihood
                  lik = lik_V + lik_Y;
                  lik = (give_log) ? lik : exp(lik);

                  ")
#########################################################################
#--------------------------|  rmeas  |----------------------------------#
#########################################################################
rmeas <- Csnippet("
                  // Viral Load
                  double shed_cases = E + I;
                  double mu_V = rho_V*shed_cases;
                  double std_V = sqrt(mu_V*(1+od_V));
                  V = rnorm(mu_V, std_V);

                  // Case Counts
                  double mu_Y = rho_Y*I;
                  Y = rnbinom_mu(od_Y, mu_Y);

                  ")

I have attached the images of simulations with optimal parameters, simulations with my guess parameters, and local search trajectories from my guess parameters.

[Figures attached: Local-Search, Opt-Params-Simul, Guess-Params-Simul]

kingaa commented 2 years ago

@mehrdadfazli: thanks for the very carefully prepared Issue. I have looked over your code and do not see any obvious problems with it. I suppose sdmm is some covariate, which you assume modulates the transmission rate? I see you are assuming that the incubation period and the infectious period are known (6 d & 8 d, respectively). Curiously, you are interested in estimating the birth rate from the data; I would imagine that parameter would be known with much more precision than either mu_IR or mu_EI. Interestingly, I see you are assuming also that the reporting efficiency rho_Y is known to be 1/7, though you are willing to entertain overdispersion with respect to a binomial assumption for that variable. Out of curiosity, what is your interpretation of the alpha parameter?

Looking at your MIF iterations, I see pronounced secular trends in rho_V and alpha and, of course, a very large increase in the log likelihood relative to the starting parameters. I am not quite sure how to read the last four of your plots. What do the colors represent? The black and blue lines appear to be of a different character from the others. Can you explain these plots please?

mehrdadfazli commented 2 years ago

Thank you for your prompt response, Dr. King. Regarding the first part of your question, we are not assuming actual births or deaths. The birth rate here represents the number of individuals who enter the S pool (e.g., people who have been staying at home so far but cannot keep it up for some reason and so become susceptible).

In the case of Y, I do not think we necessarily need overdispersion; I tried a Poisson model too, and there was no change in the optimization results.

I consider alpha to be a mixing parameter, as in some of your work, that accounts for nonhomogeneous mixing of the population.

Sorry that I forgot to explain the last four plots. The blue line is the mean of 1000 simulated particles and the black line is the data. The first two plots show the model at the MLE, and the last two show simulations at my own guess.

I appreciate any help, whether scientific or algorithmic.

kingaa commented 2 years ago

Excellent. Thanks for the clarification. So we have established that the model makes sense as an expression of the questions you are asking and that the implementation is an accurate reflection of the model. Now let us try to figure out why the MLE seems so strange to you. I agree, it does seem puzzling that the simulations at the starting parameters have at least the same shape as the data, and some similar features, while the simulations at the MLE, corresponding to much higher likelihood, make the data look very implausible. The fact that the likelihood at the starting parameters is so much lower than at the MLE means that the model has a different perspective than you do: it views the data as even more implausible as a realization of the model at the starting parameters than at the MLE. Clearly the model's perspective differs from your own. This suggests that the path to understanding what is going on lies through trying to see the model's perspective, so that you can understand how it differs from yours. Then, presumably, one of you will change your mind.

To do this, it helps to think about how one interprets a plot. When we look at the plots you made, we are judging vertical distances, specifically distances among simulations and between simulations and data. In particular, you have plotted the data and model realizations on the natural, or arithmetic, scale. That is, a difference of 50 units is the same anywhere on the graph. The model, by contrast, is looking at the data on the likelihood scale, i.e., in terms of how unlikely errors of a particular magnitude are. A difference of 50 units is much more likely in some places than it is in others. This suggests that we attempt to plot data and simulations on a scale that more closely resembles the way the model looks at the data.

For your V data, the error is assumed to be proportional to the mean. The corresponding variance stabilizing transformation is the logarithm. For the Y data, assuming the Poisson model, the variance stabilizing transformation is the square root. Therefore I suggest we replot your four figures so that we are looking at log V and sqrt(Y) for both data and simulations.

By the way, the blue line (mean of model simulations) is not really relevant here. Neither you nor the model view the data as the mean of many realizations. The data are seen, rather, as a single realization. We can drop the blue line.
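
For example, something along the following lines would do it; I am assuming the simulations are in the data-frame format produced by simulate with include.data = TRUE, and the object names (measSEIR, m1) are placeholders for your own:

library(ggplot2)
library(dplyr)
library(tidyr)

# 20 realizations at the parameters of interest, plotted with the data
# on the variance-stabilized scales (log for V, square root for Y)
simulate(measSEIR, params = coef(m1), nsim = 20,
         format = "data.frame", include.data = TRUE) |>
  mutate(logV = log(V), sqrtY = sqrt(Y)) |>
  pivot_longer(c(logV, sqrtY)) |>
  ggplot(aes(x = time, y = value, group = .id,
             color = (.id == "data"))) +
  geom_line() +
  facet_wrap(~name, scales = "free_y", ncol = 1) +
  guides(color = "none")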

mehrdadfazli commented 2 years ago

Thank you, Dr. King, for your thorough explanation, and sorry for my late reply. I have tweaked my model a little. First, I included both covariates in the transmission rate (previously, event entered only the birth rate). Also, to make the model focus more on Y and penalize parameters whose simulations are far from Y, I dropped the overdispersion in Y and used a Poisson distribution. These two changes led to a better fit by the IF, shown in the attached figures. The model at the MLE can now closely replicate the V data; however, simulations of Y are still far from the data. This is the result of searching 2000 points in a 10-parameter space. Do you think we need to search more? Or is there a way to make the model follow Y more closely, even at the cost of losing some accuracy in following V?

Corrected (transformed) scale: [figure attached]

Natural scale: [figure attached]

My updated code:

#########################################################################
#--------------------------|  rproc  |----------------------------------#
#########################################################################
rproc <- Csnippet("
                  double beta, foi, dw, births, mu_SE;

                  // we only consider those that participate in the epidemic:
                  double pop = S + E + I + R;

                  // transmission rate
                  beta = b0*(1+phi*sdmm)*(1+omega*event);

                  // expected force of infection. iota: imported infections
                  // alpha mixing parameter, = 1:homogeneous mixing
                  foi = beta*pow(I+iota, alpha)/pop;

                  // white noise (extrademographic stochasticity)
                  dw = rgammawn(sigmaSE,dt);

                  mu_SE = foi*dw/dt;  // stochastic force of infection

                  // Poisson births: fraction of leak into S from N
                  births = rpois(br*dt);

                  // State Updates:
                  double dN_SE  = rbinom(S , 1-exp(-mu_SE  *dt));
                  double dN_EI  = rbinom(E , 1-exp(-mu_EI  *dt));
                  double dN_IR  = rbinom(I , 1-exp(-mu_IR  *dt));
                  S += births - dN_SE;
                  E += dN_SE  - dN_EI;
                  I += dN_EI  - dN_IR;
                  R += dN_IR;
                  W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
                  ")

#########################################################################
#--------------------------|  rinit  |----------------------------------#
#########################################################################
rinit <- Csnippet("
                  double m = eta*N;
                  S = nearbyint(m*S_0);
                  E = nearbyint(m*E_0);
                  I = nearbyint(m*I_0);
                  R = nearbyint(m*R_0);
                  W = 0;
                  ")

#########################################################################
#--------------------------|  dmeas  |----------------------------------#
#########################################################################
dmeas <- Csnippet("
                  // Model for Viral Load
                  double shed_cases = E + I;
                  double mu_V = rho_V*shed_cases;
                  double std_V = sqrt(mu_V*(1+od_V));
                  double lik_V = dnorm(V, mu_V, std_V, 1);

                  // Model for Case Counts
                  double mu_Y = rho_Y*I;
                  double lik_Y = dpois(Y, mu_Y, 1);

                  // Combined likelihood
                  lik = lik_V + lik_Y;
                  lik = (give_log) ? lik : exp(lik);

                  ")
#########################################################################
#--------------------------|  rmeas  |----------------------------------#
#########################################################################
rmeas <- Csnippet("
                  // Viral Load
                  double shed_cases = E + I;
                  double mu_V = rho_V*shed_cases;
                  double std_V = sqrt(mu_V*(1+od_V));
                  V = rnorm(mu_V, std_V);

                  // Case Counts
                  double mu_Y = rho_Y*I;
                  Y = rpois(mu_Y);

                  ")
kingaa commented 2 years ago

This is very helpful. Now, as you say, we can see that the model can fit one of the data sets well, but not the other.

It is possible that there is some region of parameter space lurking somewhere within which the model fits much better, but I doubt it. It seems, rather, that the model is constrained in some way that prevents it from fitting both data sets, which is interesting.

It looks as though the simulated Y is biased downward, yet it roughly follows the time course of the data. It is strange that you have not been able to fix this by increasing rho_Y. If rho_Y is a free parameter, then it would seem you could achieve a better fit simply by increasing it, at no cost.
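
Concretely, freeing rho_Y just means including it among the perturbed parameters in your mif2 call, along the lines of the sketch below (perturbation magnitudes and object names are placeholders):

# continue the existing fit, now also perturbing (and hence estimating) rho_Y
m2 <- mif2(
  m1,
  Nmif = 100,
  rw.sd = rw_sd(
    b0 = 0.02, phi = 0.02, omega = 0.02, alpha = 0.02,
    rho_V = 0.02, od_V = 0.02, sigmaSE = 0.02, br = 0.02,
    rho_Y = 0.02,                # rho_Y now free to move
    eta = ivp(0.02)
  )
)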

mehrdadfazli commented 2 years ago

Thank you so much, Dr. King, for your help. It turns out I had mistakenly multiplied the number of infections by the reporting rate (to get the reported cases) twice! The results were close to the data all along. I also made rho_Y a free parameter. Both models (with fixed or free rho_Y) are satisfactory now.

Not scaled: [figure attached]

Scaled: [figure attached]

kingaa commented 2 years ago

Glad to have helped. I'll close this issue now, but feel free to reopen if more discussion is warranted.