kingaa / pomp

R package for statistical inference using partially observed Markov processes
https://kingaa.github.io/pomp
GNU General Public License v3.0

Discrepancy in objective functions defined using trajectory and simulate #114

Closed gokhale616 closed 4 years ago

gokhale616 commented 4 years ago

Dear Dr. King,

I have recently hit a curious problem that I suspect stems from the solutions generated by the differential equation solvers available in the trajectory workhorse. A one-line summary of the issue:

The log-likelihood of the matched trajectory produced by the objective function that uses the trajectory workhorse (here the integration method is “ode45”; logLik = -3694.3) is worse than that of the solution produced by the objective function that uses simulate (logLik = -962.1).

I would really appreciate it if you could help me confirm this conjecture and, if possible, recommend a resolution.

The discrepancy is apparent in the two PDF images in the attached folder: my_of_prof_lik.pdf, produced by my own objective function, which uses simulate on a deterministic process defined as a C snippet passed to the rprocess argument, and pomp_of_prof_lik.pdf, produced by an objective function built with traj_objfun. The pomp objective function misses the true likelihood when fitting to data simulated using the observation model. Details of the model follow.

I am attaching a directory --- King_example --- with my model of interest. The script containing the model definitions is 01/cov_make_pomp_model.R. The plots can be reproduced by sourcing 07/test_trajectory_matching_2.R.

Running this script should prompt the installation of a few packages that I use. They are common among pomp users, but I apologize in advance if any of them end up cluttering your workspace.

Quick description of my model and pomp definitions

I have also added the objective function that I use to calculate the likelihood. Please note how the trajectories of true cases, C_{i,j} in the observation model described above, are extracted using simulate and trajectory.

  1. I was under the impression that if the rprocess snippet defines a deterministic process, the solutions of the accumvars from trajectory and simulate should be the same. This suggests that the discrepancy must be due to errors in the integrated solution. What is puzzling is the better performance of the simulated trajectory. Please note that when I say “simulated”, I am referring only to the accumvars, or to any state variables to which the observation process has not been applied.

  2. Can simulate be used for trajectory matching? To me it seems natural that it can, provided I make use of the accumvars and not the variables to which observation noise has been applied.

To avoid miscommunication: there are two process models. vseir_step incorporates demographic stochasticity using Euler-multinomial random draws, while vseir_dstep is simply the deterministic skeleton written out in rprocess format so that I can apply observation error to the output later. At least, that was my original intent.

This difference disappears if the process is relatively simple; this effect can be reproduced by running the script 07/basics.R, which is just an SIR model with seasonality.

The codes are here: example_codes.zip

kingaa commented 4 years ago

@gokhale616: Thanks for providing such a detailed example. There are at least two levels of issues here. One is conceptual, the other technical.

Conceptually, writing a deterministic rprocess component is strange. The entire purpose of the rprocess component is to simulate a stochastic process. If you have a deterministic model for the latent state process, you can use the skeleton component to specify it, and use trajectory or flow to integrate/iterate it. So the answer to your Question 2 above is "No, simulate can't be used in trajectory matching, because trajectory matching is fundamentally inappropriate for stochastic latent state processes."

OK, I hope that is clear. But the issue you raise shouldn't be dismissed so quickly. Indeed, a deterministic process is a (degenerate) Markov process, and one can unambiguously formulate an rprocess simulator for it. So your motivating question remains. This is where at least two technical issues arise.

In the codes you sent, you appear to have written your rprocess snippets incorrectly. Recall from the documentation that, when writing a C snippet for the euler function,

The goal of such a C snippet is to replace the state variables with their new random values at the end of the time interval. Accordingly, each state variable should be over-written with its new value.

However, in your deterministic rprocess C snippet (vseir_dstep), you have the following, for example.

        S(i)  += (uv_births + delta*V(i) - (lambda1 + lambda2 + LV(i) - MU_AGE(i))*S(i))*dt;
        E1(i) += (lambda1*(S(i) + epsilon1*V(i)) - (sigma + LV(i) - MU_AGE(i))*E1(i))*dt;   
        I1(i) += (sigma*E1(i) - (gamma + LV(i) - MU_AGE(i))*I1(i))*dt;
        E2(i) += (lambda2*(S(i) + epsilon2*V(i)) - (sigma + LV(i) - MU_AGE(i))*E2(i))*dt;
        I2(i) += (sigma*E2(i) - (gamma + LV(i) - MU_AGE(i))*I2(i))*dt;
        R(i)  += (gamma*(I1(i) + I2(i)) - (LV(i) - MU_AGE(i))*R(i))*dt;
        V(i)  += (v_births - (epsilon1*lambda1 + epsilon2*lambda2 + delta + LV(i) - MU_AGE(i))*V(i))*dt;

In the second line, the S(i) that you are reading is not the same as the S(i) that was read in the previous line. It has been overwritten as a result of the first line. In the same way, the E1(i) on the second line is not that encountered on the third, and so on. Of course, your equations are invalid unless these are the same (or, if you prefer, the simulator is a valid simulator for a very different model!).

Although this may not be the only problem in the codes, it alone is sufficient to give very different results. One way to avoid this problem is to create temporary local variables to hold the terms of the difference equations and then update the state variables all at once using only these local variables.
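
For example, the first three of your equations might be recast along the following lines. This is only a sketch of the pattern, not a drop-in replacement for your full snippet: the name vseir_dstep_fixed is invented for illustration, and the fragment is assumed to sit inside the same loop over i as your original code.

library(pomp)

vseir_dstep_fixed <- Csnippet("
  double dS, dE1, dI1;   // local increments, computed entirely from the old state values
  dS  = (uv_births + delta*V(i) - (lambda1 + lambda2 + LV(i) - MU_AGE(i))*S(i))*dt;
  dE1 = (lambda1*(S(i) + epsilon1*V(i)) - (sigma + LV(i) - MU_AGE(i))*E1(i))*dt;
  dI1 = (sigma*E1(i) - (gamma + LV(i) - MU_AGE(i))*I1(i))*dt;
  /* ... dE2, dI2, dR, dV computed the same way, still from the old values ... */
  S(i)  += dS;           // only now are the state variables overwritten
  E1(i) += dE1;
  I1(i) += dI1;
  /* ... and likewise for the remaining compartments ... */
")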

Of course, you should know too that there is no reason to expect the Euler method for obtaining approximate solutions to a differential equation to agree with higher-order methods such as are employed by trajectory. That is, even if you correct the problem I indicate above (and assuming there are no others), there will be some discrepancy between the trajectories generated by simulate using the Euler method and trajectory, which uses a much more sophisticated solver. This discrepancy will tend to zero as the time-step tends to zero, but it may do so slowly and non-uniformly with respect to time.

gokhale616 commented 4 years ago

@kingaa - The explanation you provide makes perfect sense, both conceptually and technically!

Things fell into place when you pointed out the mistake in my vseir_dstep. By that logic, my skeleton, which does not have the error you mention because it conforms to the pomp requirement of specifying the temporary "D variables", will always fit poorly to the simulated trajectories generated by this incorrect snippet!

Thank you so much for this clarification. Sleep can now be restored!

Side note: the point of writing the strange vseir_dstep was to be able to apply my observation model to deterministic trajectories. I now understand that even if I had defined the rprocess snippet correctly for the Euler method, with the necessary temporary variables, the simulated trajectories would not have corresponded to the solution obtained by the higher-order method that I specify in my earlier comment.

In the past I have applied the observation model explicitly after obtaining solutions from trajectory/flow. Is there a way to do that using pomp functions? This is simply to make my code niftier; I am just curious at this point.

kingaa commented 4 years ago

@gokhale616: Glad this helped.

One minor point:

...my skeleton, which does not have the error you mention because it conforms to the pomp requirement of specifying the temporary "D variables"...

In writing a skeleton snippet, the "D variables" are not really temporary variables. They are, however, distinct from the state variables. You seem to be a decent C programmer. Have a look at what spy shows you so you can understand just what the "D variables", the state variables, parameters, and covariates are at the C level!
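
For example (with m standing in for the pomp object you build in 01/cov_make_pomp_model.R):

spy(m)   # among other things, this prints the C source generated from your
         # snippets, so you can see how the state variables, parameters,
         # covariates, and "D variables" appear at the C level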

I sympathize with your desire to look at the problem in several different ways. It's generally a very good thing to do.

To answer your question, once you've computed deterministic trajectories of your state variables using trajectory, you can use the dmeasure and rmeasure workhorses to apply the observation model directly. In essence, this is exactly what the traj_objfun-created objective function does.
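
A minimal sketch of that recipe (assuming a pomp object po that carries skeleton, rmeasure, and dmeasure components together with its data, and a named parameter vector theta):

library(pomp)

## deterministic state trajectories: an nvars x nreps x ntimes array
x <- trajectory(po, params = theta, format = "array")

## simulated observations along that trajectory, via the rmeasure workhorse
y <- rmeasure(po, x = x, times = time(po), params = theta)

## or the measurement density of the actual data, via the dmeasure workhorse;
## summing the log densities gives the log-likelihood that traj_objfun evaluates
ll <- dmeasure(po, y = obs(po), x = x, times = time(po), params = theta, log = TRUE)
sum(ll)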

I'll close this issue for now, but feel free to re-open if more discussion is warranted.

bogaotory commented 4 years ago

May I piggyback on this thread to dig a bit further into the difference in implementation between simulate and trajectory in pomp? You pointed out the difference between deterministic and stochastic models in your comment, which makes a lot of sense, and hence the distinction between simulate and trajectory. What I found a little difficult to understand whilst reading Getting started with pomp is the switch from returning the state variables themselves in rprocess to returning the derivatives of those state variables (the "D variables") in skeleton. There also seems to be an additional flow layer that looks after the function given in skeleton because of this. So my first question is: why should we be tracing derivatives when it comes to deterministic models?

To see whether things would work without switching to derivatives, I wrote a test script (it models the function y = x + 1), which I attach at the end. It seems the same stepper function foo_func_step is acceptable for both simulate and trajectory if I treat the function as a discrete-time process. One thing I haven't worked out is how to modify foo_func_step for use as input to vectorfield, so as to model the function as a continuous-time process. I tried variations on c(DA = 1) but couldn't quite get it to work. So my second question is: how do I modify foo_func_step into a continuous-time model, or am I misunderstanding something entirely?

Thanks very much!

rm(list = ls())

library("tidyverse")
library("ggplot2")

library("pomp")

foo_rinit <- function (A_0, ...) {
    c(A = A_0)
}

foo_dmeas <- function (A, bA, obsA, ..., log) {
    dpois(obsA, bA*A, log = log)
}

foo_rmeas <- function (A, bA, ...) {
    c(
        obsA = rpois(n = 1, lambda = bA * A)
    )
}

foo_func_step <- function (A, t, ...) {
    c(A = A + 1)
}

foo_pars = c(
    A_0 = 0,
    bA = 0.9
)

time_horizon = 1:50
data_dummy = data.frame("day" = time_horizon, "obsA" = rep.int(2, length(time_horizon)))

data_dummy %>%
pomp(
    times = "day",
    # data = data_dummy,
    t0 = 0,
    rinit = foo_rinit,
    skeleton = map(
        f = foo_func_step,
        delta.t = 1
    ),
    # skeleton = vectorfield(
    #     f = foo_func_step
    # ),
    # statenames = c("A", "DA"),
    rprocess = discrete_time(
        step.fun = foo_func_step,
        delta.t = 1
    ),
    rmeasure = foo_rmeas,
    dmeasure = foo_dmeas
) -> foo_pomp

foo_pomp %>%
simulate(
    params = foo_pars,
    format = "data.frame"
) -> foo_sims

foo_pomp %>%
pomp::trajectory(
    params = foo_pars,
    format = "data.frame"
) -> foo_trj

plot(foo_sims)
plot(foo_trj)

kingaa commented 4 years ago

You are indeed digging deeper! You will not be surprised, then, if things become a bit more complex.

Your first question is "Why should we be tracing derivatives when it comes to deterministic models?" Well, when one has a continuous-time, finite-order, Markovian, deterministic system, and its trajectories are smooth in the time direction, it is quite natural, and traditional, to encode the system as a set of ordinary differential equations, which is to say a vectorfield, an expression giving the time-derivative of the trajectory at each point of the state space. Since this is so common, pomp makes it possible to write a deterministic skeleton in this way. The natural way to do this is to give the user the ability to specify the dependence of dx/dt on x and t, where x is the state vector.

As you have discovered, in the discrete-time case, the skeleton is a map, i.e., a function taking us from x(t) to x(t+1). (But note that, in general, the time-step need not be 1 unit).

Now you notice that, if one is in the discrete-time case, the same R function that is used to specify the skeleton works also when passed via the rprocess argument (and is thus used by simulate). If one tries this with a C snippet, however, it fails. Why?

First, some background. C snippets are designed to accelerate computation by stripping out as much unnecessary type-checking, name-checking, argument-matching, type-conversion, data-copying, etc. as possible. When one uses an R function, a lot of this happens, and it cannot be eliminated without introducing the possibility of dangerous and difficult-to-trace bugs into the code, not to mention incurring the justifiable ire and disdain of the R core team. Contrariwise, when one uses C snippets, these are compiled into a dynamically linkable library, which is loaded into a running R session, which gives one the ability to bypass all (or almost all) of the above-mentioned unnecessary run-time checking, copying, etc. Naturally, I, as pomp developer, must then take on myself the burden of ensuring that the resulting operations will be valid and safe.

In particular, in writing C snippets for rprocess, one must replace the state variables with their new values: no copying is done. In C snippets for skeleton vectorfields, on the other hand, there is no question of replacing state variables; there is no reason to do so. The point of such a snippet is to return the time-derivatives. In fact, one is not allowed to change the state variables, for doing so would violate the DE solver's assumptions and thereby potentially invalidate the computation. What about in the map case? Well, one could do it another way, but in pomp, I have opted to make the rules for C snippets for skeleton the same in both vectorfield and map cases, rather than have different rules.

Finally, you ask how you might modify an rprocess C snippet, corresponding to a continuous-time system, for use as a skeleton snippet. The most straightforward answer is, you can't. The skeleton snippet must compute the vectorfield, which is not at all the same thing as a step of the system. A less straightforward, but perhaps revealing, answer would be that if you can integrate the vectorfield yourself (either analytically or by calling your own numerical integrator inside the C snippet), you then have turned your continuous-time system into a discrete-time one (from pomp's perspective), and you are free to use the map function in the skeleton argument of pomp.
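
For concreteness, here is a sketch of what a genuine vectorfield skeleton for your toy model could look like (illustrative only, reusing the objects defined in your script; the continuous-time analogue of A -> A + 1 per unit time is dA/dt = 1):

library(pomp)

## continuous-time version of the toy model: the vectorfield C snippet fills in
## DA, the time-derivative of A, rather than overwriting A itself
foo_pomp_ct <- pomp(
    data_dummy,
    times = "day",
    t0 = 0,
    rinit = foo_rinit,
    skeleton = vectorfield(Csnippet("DA = 1;")),
    statenames = "A",
    rmeasure = foo_rmeas,
    dmeasure = foo_dmeas
)

## trajectory() now integrates dA/dt = 1 with the ODE solver, so A(t) = t
foo_trj_ct <- pomp::trajectory(foo_pomp_ct, params = foo_pars, format = "data.frame")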

bogaotory commented 4 years ago

A less straightforward, but perhaps revealing, answer would be that if you can integrate the vectorfield yourself (either analytically or by calling your own numerical integrator inside the C snippet), you then have turned your continuous-time system into a discrete-time one (from pomp's perspective), and you are free to use the map function in the skeleton argument of pomp.

This is exactly what I have done in my code. I was going to mention it as a third question, but since I knew it works anyway, I did not include it in my previous comment. It is good to know that I'm not overly "abusing" the system. Thank you!

My use case is that we already have an ODE system (a vectorfield) that we use with deSolve, and we also have an Rcpp version of it that runs faster. We also don't have data/observations on many of the state variables in that ODE, as you know from my other question. So I wrapped the ODE in discrete time steps, also summarising the state variables of the base ODE into the state variables I want pomp to observe. I hope this all makes sense.

The only downside is that this slows the code down quite a bit (about three times slower), which is evidently caused by the "pauses" between the discrete steps. But then again, when multiple simulations/trajectories are run (through nsim, Np, or multiple columns in params), pomp's algorithms are designed to schedule all instances to run within each discrete step so that likelihoods etc. can be compared, so that pause is unavoidable as I currently understand it. I will take another look at the performance once everything is working correctly.

kingaa commented 4 years ago

I'll close the issue now. Feel free to reopen if more discussion is warranted.