kingaa / pomp

R package for statistical inference using partially observed Markov processes
https://kingaa.github.io/pomp
GNU General Public License v3.0
114 stars 27 forks source link

how to put a periodic cubic b-spline basis in a c snippet #95

Closed belialhou closed 5 years ago

belialhou commented 5 years ago

Hello, I need your help! When I write a code for a SIR model which has to use a periodic cubic b-spline basis to represent beta, I don't know how to put it in the step function. This step function is coded in C snippet. Every time I use function periodic_bspline_basis_eval_t to code it, R always says error: unexpected type name 'periodic_bspline_basis_eval_t': expected expression Beta = exp(p. My program is given as following.

seas.sir.step <- "
  double rate[12];
  double dN[12];
  double Beta;
  double dW;
  Beta = exp(periodic_bspline_basis_eval_t (x=Phi,period=2,degree=3, nbasis=6,&(b0,b1,b2,b3,b4,b5)));
  rate[0] = birthrate;
  rate[1] = Beta*I/popsize + omega; 
  rate[2] = m; 
  rate[3] = mc; 
  rate[4] = m; 
  rate[5] = gamma; 
  rate[6] = m;  
  rate[7] = m; 
  rate[8] = m; 
  rate[9] = r*3;  
  rate[10] = r*3;  
  rate[11] = r*3; 
  dN[0] = rpois(rate[0] * dt);
  reulermultinom(2, S, &rate[1], dt, &dN[1]);
  reulermultinom(2, I, &rate[5], dt, &dN[2]);
  reulermultinom(1, I, &rate[3], dt, &dN[11]);
  reulermultinom(1, R1, &rate[9], dt, &dN[3]);
  reulermultinom(1, R2, &rate[10], dt, &dN[4]);
  reulermultinom(1, R3, &rate[11], dt, &dN[5]);
  dW = rnorm(dt, eps * sqrt(dt));
  S += dN[0]-dN[1]-dN[6]+dN[5];
  I += dN[1]-dN[2]-dN[11]-dN[7];
  R1 += dN[2]-dN[3]-dN[8];
  R2 += dN[3]-dN[4]-dN[9];
  R3 += dN[4]-dN[5]-dN[10];
  Phi += dW;
  H += dN[11];
  noise += (dW-dt)/eps;
"
kingaa commented 5 years ago

@belialhou:

The function call for evaluation of a periodic b-spline basis is periodic_bspline_basis_eval. The following evaluates the 6 basis functions (of degree 3 and period 2) at time t:

double seas[6];
periodic_bspline_basis_eval(t,2,3,6,seas);

Note that t, the current time, is automatically available in the C snippet. Note also that C does not support named arguments. If you are intending to evaluate the above at a random phase Phi, then simply replace t in the above with Phi.

When the above is executed, the array seas contains the values of the 6 basis functions. To multiply each by the corresponding coefficient (which I gather you have as b0, ..., b5), you could do the following:

Beta = exp(b0*seas[0]+b1*seas[1]+b2*seas[2]+b3*seas[3]+b4*seas[4]+b5*seas[5]);

There is also a dot_product facility that is available. For example, if you could guarantee that b0,...,b5 will always be adjacent to one another in the parameter vector, and always in that specific order, then you could do

Beta = exp(dot_product(6,seas,&b0));

to achieve the same effect.

As an unrelated aside, I note that you have two noncompeting Euler multinomial exit processes associated with your I compartment:

  reulermultinom(2, I, &rate[5], dt, &dN[2]);
  reulermultinom(1, I, &rate[3], dt, &dN[11]);

which makes it possible for one I individual to exit both by one of pathways 5 or 6 and by pathway 3 simultaneously. Perhaps this is not what you intend.

belialhou commented 5 years ago

@kingaa Dear Prof. King, Thank you for helping me. It's a pleasure to receive your answer. I still have a question for the second part that you mentioned. Should I delete one of those two equations? reulermultinom(2, I, &rate[5], dt, &dN[2]); reulermultinom(1, I, &rate[3], dt, &dN[11]); I try to actualise the results of your paper, named Inference for nonlinear dynamical systems. N[2] means the number of patients from the infected state to the recovering 1 state and N[11] represents the number of patients from the infected state to death caused by cholera.

kingaa commented 5 years ago

If you are trying to recreate the results from the cholera models, you would do better to look at the 2008 Nature paper. The code for those models is included in the package. See ?dacca. You can of course study the underlying pomp construction by looking at the source code.

More generally, when constructing a POMP for a compartmental model using the Euler multinomial approximation, one needs exactly one call to reulermultinom for each compartment. If there are n pathways leading out of a compartment with occupancy X, then one has to pack the corresponding n rates into contiguous memory locations and retrieve the results in contiguous memory locations. In particular, if rate is a pointer to n contiguous memory locations holding the rates and dX is a pointer to n contiguous memory locations ready to hold the results, then

reulermultinom(n,X,rate,dt,dX);

will result in a random sample from the Euler multinomial distribution (with timestep dt) being stored in dX[0], ..., dX[n-1].

kingaa commented 5 years ago

See also the new FAQ 3.6.

belialhou commented 5 years ago

@kingaa Thank you, Prof. King. I will read all of those.

belialhou commented 5 years ago

Hello, Prof. King. Thank you for reading my question. I still have some problems for the Euler multinomial distribution. It says that when constructing a POMP for a compartmental model using the Euler multinomial approximation, one needs exactly one call to reulermultinom for each compartment. How to pack the corresponding n rates into contiguous memory locations? Does it mean I have to choose a path and use the total number of paths as n for each compartment? For example, the process as following.

Screenshot 2019-08-07 at 12 16 08 PM

I use reulermultinom(2, S, &rate[1], dt, &dN[1]); for transmission from S to I at the interval delta t, reulermultinom(3, I, &rate[5], dt, &dN[2]); for transmission from I to R1 at the interval delta t, reulermultinom(2, R1, &rate[9], dt, &dN[3]); for transmission from R1 to R2 at the interval delta t, reulermultinom(2, R2, &rate[10], dt, &dN[4]); for transmission from R2 to R3 at the interval delta t, and reulermultinom(2, R3, &rate[11], dt, &dN[5]); for transmission from R3 to S at the interval delta t. But when I use this to represent the whole process, I can't recreate the result. My step.fun is as following.

seas.sir.step <- "
  double rate[12];//the number of rate
  double dN[12];//the number of states
  double Beta;
  double dW;
  double seas[6];
  periodic_bspline_basis_eval(t,2,3,6,seas);
  Beta = exp(b0*seas[0]+b1*seas[1]+b2*seas[2]+b3*seas[3]+b4*seas[4]+b5*seas[5]);
  rate[0] = popsize*m;
  rate[1] = (Beta+eps*Phi)*I/popsize + omega;
  rate[2] = m;
  rate[3] = mc; 
  rate[4] = m;
  rate[5] = gamma; 
  rate[6] = m;
  rate[7] = m;
  rate[8] = m;
  rate[9] = r*3;
  rate[10] = r*3;
  rate[11] = r*3;
  dN[0] = rpois(rate[0]*dt+dpopdt);
  reulermultinom(2, S, &rate[1], dt, &dN[1]);
  reulermultinom(3, I, &rate[5], dt, &dN[2]);
  reulermultinom(2, R1, &rate[9], dt, &dN[3]);
  reulermultinom(2, R2, &rate[10], dt, &dN[4]);
  reulermultinom(2, R3, &rate[11], dt, &dN[5]);
  dW = rnorm(dt, S*sqrt(eps*dt));
  S += dN[0]-dN[1]-dN[6]+dN[5];
  I += dN[1]-dN[2]-dN[7]-dN[11];
  R1 += dN[2]-dN[3]-dN[8];
  R2 += dN[3]-dN[4]-dN[9];
  R3 += dN[4]-dN[5]-dN[10];
  H += dN[11];
  Phi += dW/dt;
"
kingaa commented 5 years ago

Since there is a one-to-one correspondence between rates and random variables, I find it easiest to pack the rates into a single vector (as you have) and to extract the corresponding random draws in another vector of exactly the same size.

However, one must ensure that the rates pertaining to a particular compartment are contiguous.

To be concrete, I revisit your code:

seas.sir.step <- "
  double rate[12];
  double dN[12]; //will hold random draws (in 1-1 correspondence with rates)
  double Beta;
  double seas[6];
  periodic_bspline_basis_eval(t,2,3,6,seas);
  Beta = exp(b0*seas[0]+b1*seas[1]+b2*seas[2]+b3*seas[3]+b4*seas[4]+b5*seas[5]);
\\ into S:
  rate[0] = popsize*m;
\\ from S:
  rate[1] = (Beta+eps*Phi)*I/popsize + omega;
  rate[2] = m;
\\ from I
  rate[3] = mc; 
  rate[4] = m;
  rate[5] = gamma;
\\ from R1
  rate[6] = 3*r;
  rate[7] = m;
\\ from R2
  rate[8] = 3*r;
  rate[9] = m;
\\ from R3
  rate[10] = r*3;
  rate[11] = m;
  dN[0] = rpois(rate[0]*dt+dpopdt);  \\ is the rate parameter correct here?
  reulermultinom(2, S, &rate[1], dt, &dN[1]);
  reulermultinom(3, I, &rate[3], dt, &dN[3]);
  reulermultinom(2, R1, &rate[6], dt, &dN[6]);
  reulermultinom(2, R2, &rate[8], dt, &dN[8]);
  reulermultinom(2, R3, &rate[10], dt, &dN[10]);
  S += dN[0]-dN[1]-dN[2]+dN[10];
  I += dN[1]-dN[3]-dN[4]-dN[5];
  R1 += dN[5]-dN[6]-dN[7];
  R2 += dN[6]-dN[8]-dN[9];
  R3 += dN[8]-dN[10]-dN[11];
"

Notice that I've dropped the dW and Phi variables, since they were playing no role in the dynamics at all. I've also left out the H variable, since it isn't clear what it was intended to do. I suspect it is intended to accumulate counts that will be the basis for your measurements, but which counts are you accumulating?

I suggest you also revisit the dN[0]= line: is the Poisson parameter what it should be?

I also recommend that you spend a bit more time studying pointer arithmetic in C. The fundamental facts of pointer arithmetic are

*&x = x;
y[n] = *(y+n);

but it is worth working through some exercises to make sure you understand the implications.

belialhou commented 5 years ago
Screenshot 2019-08-08 at 7 30 30 PM

This is the differential equation of transition from states birth to susceptible. So I thought using dp and population times m to the mean of poisson distribution. Moreover, H represents the number of death caused by cholera, that is,

Screenshot 2019-08-08 at 7 40 05 PM.

Since observed mortality has relationship with it, I put it in step.fun. dw means Gaussian white noise. Should I delete them from C snippet?

kingaa commented 5 years ago

This is the differential equation of transition from states birth to susceptible. So I thought using dp and population times m to the mean of poisson distribution.

OK. I thought, because of its name, that dpopdt might be dP/dt instead of dP.

Moreover, H represents the number of death caused by cholera

Since you want to accumulate cholera mortality, you should accumulate the dN[3] variable in my version of your code above.

dw means Gaussian white noise. Should I delete them from C snippet?

If you want to keep track of this variable, then keep it. Since it has no effect on the dynamics at all, I don't see why you'd want to do this, but perhaps you have your reasons.

belialhou commented 5 years ago

Prof. King, I still feel I still have some errors which i'm unable to solve based on your previous answers. This is process diagram.

Screenshot 2019-08-07 at 12 16 08 PM

And this is the differential equations.

Screenshot 2019-08-14 at 7 35 45 PM Screenshot 2019-08-14 at 7 35 51 PM

Note that transmission parameters beta(t) and omega(t) have seasonal variation during the year. The data on observed mortality were modeled as image Is the Gaussian white noise Phi = dW/dt chosen from as following distribution?

Screenshot 2019-08-14 at 7 39 27 PM

this is my whole programming.

dacca1<-simulate(dacca())
time<-dacca1@covar@times
simdata<-dacca1@covar@table
simdata<-data.frame(time,t(simdata))  ## the same population and time as dacca

rmeas <- Csnippet("
          obs_death = rnorm(cholera_death,tau*cholera_death); 
                  ")
dmeas <- Csnippet(" 
          lik = dnorm(obs_death,cholera_death,tau*cholera_death,give_log);
" )

## initializer 
rinit <- "
double q = pop/(S_0 + I_0 + R1_0+ R2_0+ R3_0);
  S = nearbyint(q*S_0);
  I = nearbyint(q*I_0);
  R1 = nearbyint(q*R1_0);
  R2 = nearbyint(q*R2_0);
  R3 = nearbyint(q*R3_0);
  Phi = 0;" 

seas.sir.step <- "
  double rate[12];
  double dN[12]; //will hold random draws (in 1-1 correspondence with rates)
  double Beta;
  double omega;
  Beta = exp(b0*seas_1+b1*seas_2+b2*seas_3+b3*seas_4+b4*seas_5+b5*seas_6);
  omega = exp(a0*seas_1+a1*seas_2+a2*seas_3+a3*seas_4+a4*seas_5+a5*seas_6);
  rate[0] = pop*m;
  rate[1] = (Beta+eps*Phi)*I/pop + omega;//into S
  rate[2] = m;//from S
  rate[3] = mc; 
  rate[4] = m;
  rate[5] = gamma;//from I
  rate[6] = 3*r;
  rate[7] = m;//from R3
  rate[8] = 3*r;
  rate[9] = m;//from R2
  rate[10] = 3*r;
  rate[11] = m;// from R3
  dN[0] = rpois(rate[0]*dt+dpopdt);
  reulermultinom(2, S, &rate[1], dt, &dN[1]);
  reulermultinom(3, I, &rate[3], dt, &dN[3]);
  reulermultinom(2, R1, &rate[6], dt, &dN[6]);
  reulermultinom(2, R2, &rate[8], dt, &dN[8]);
  reulermultinom(2, R3, &rate[10], dt, &dN[10]);
  Phi= rnorm(dt,eps*sqrt(dt));
  S += dN[0]-dN[1]-dN[2]+dN[10];
  I += dN[1]-dN[3]-dN[4]-dN[5];
  R1 += dN[5]-dN[6]-dN[7];
  R2 += dN[6]-dN[8]-dN[9];
  R3 += dN[8]-dN[10]-dN[11];
  cholera_death += dN[3];"

simulate(
  times=dacca1@times, t0=1891,
  covar=covariate_table(simdata,times='time',gamma=20.8,mc=0.046,m=1/600,r=1/120),
  rmeasure = rmeas,
  dmeasure = dmeas,
  rinit=Csnippet(rinit),
  rprocess = euler(
    step.fun = Csnippet(seas.sir.step), 
    delta.t = 1/12/20),
  statenames = c("S", "I", "R1","R2","R3","Phi","cholera_death"),
  obsnames=c("obs_death"),
  paramnames = c("b0","b1","b2","b3","b4","b5","a0","a1","a2","a3","a4","a5",
                 "eps","tau", "S_0", "I_0","R1_0","R2_0","R3_0"),
  partrans=parameter_trans(log=c("eps","b0","b1","b2","b3","b4","b5","a0","a1","a2","a3","a4","a5")),
  accumvars = c("cholera_death","Phi"),
  params = c(omega=1.76e-4,b0=0.747, b1=6.38, b2=-3.44,b3=4.23,b4 =3.33, b5=4.55,
             a0=-1.692819521 ,a1=-2.543383580,a2=-2.840439389,a3=-4.691817993,a4=-8.477972478,a5=-4.390058806,
             tau=0.23,eps=3.13,S_0=0.621,I_0=0.378,R1_0=8.43e-4,R2_0=9.72e-4,R3_0=1.16e-7)
)-> cholera

I used the data from example dacca, such as time, population, dpopdt and b-spline basis. It can run and doesn't show bugs. But my result is totally different from the simulation which is from dacca. This is output of simulation. Rplot I have read paper which you recommended and followed a step by step to build this code. I'm suspecting that my Csnippets for the the deterministic and stochastic skeleton might be the problem. Could you please check if my Csnippet for the model described above are properly coded for pomp? Thanks in advance for your help.

kingaa commented 5 years ago

A cursory inspection of your code reveals that it differs in some ways from the model produced by dacca(). In addition, the parameters appear to be different.

Why not compare your code directly with that of dacca() (e.g., do dacca at the prompt or read the source code)? You should also compare the parameter values carefully.

belialhou commented 5 years ago

@kingaa Prof. King, thank you for your advice! I got a way to do it.