kingaa / pomp

R package for statistical inference using partially observed Markov processes
https://kingaa.github.io/pomp
GNU General Public License v3.0
110 stars 26 forks source link

Error in adapting old syntax (traj.match) to the new one(traj_objfun) #197

Closed zz100chan closed 11 months ago

zz100chan commented 11 months ago

Hello, I am trying to test out one of the example shown earlier here by @kingaa from the post (here) But since the example was from an older version, the traj.match does not work anymore and I am trying to adapt it to the newer syntax of traj_objfun with the following code:

 library(pomp)
stopifnot(packageVersion("pomp")>="1.5")
library(plyr)
library(magrittr)
library(readr)
library(reshape2)
library(ggplot2)
options(stringsAsFactors=FALSE)
theme_set(theme_bw())
set.seed(1173439184)

read_csv("LV_data.csv") %>%
  rename(c(Time="time",Species="species")) -> dat

dat %>%
  ggplot(aes(x=time,y=counts,color=species))+
  geom_point(size=1.5)+
  labs(y="counts")

dat %>%
  ddply(~time+species,summarize,m=mean(counts),sd=sd(counts)) %>%
  ggplot(aes(x=time,y=m,ymin=m-sd,ymax=m+sd,group=species,color=species))+
  geom_line()+geom_linerange()+
  labs(y="counts")

dat %>%
  ddply(~species+time,mutate,rep=seq_along(time)) %>%
  dcast(time~species+rep,value.var="counts") %>%
  arrange(time) -> dat2

dat2 %>%
  pomp(
    times="time",
    t0=0,
    skeleton=vectorfield(
      Csnippet("
  DN1 = r1*N1*(K1-N1-a12*N2)/K1;
  DN2 = r2*N2*(K2-N2-a21*N1)/K2;")),
  initializer=Csnippet("
  N1 = N1_0;
  N2 = N2_0;"),
  dmeasure=Csnippet("
  lik = dnbinom_mu(N1_1,1/theta,p*N1,1)+
  dnbinom_mu(N1_2,1/theta,p*N1,1)+
  dnbinom_mu(N1_3,1/theta,p*N1,1)+
  dnbinom_mu(N2_1,1/theta,p*N2,1)+
  dnbinom_mu(N2_2,1/theta,p*N2,1)+
  dnbinom_mu(N2_3,1/theta,p*N2,1);
  lik = (give_log) ? lik : exp(lik);"),
  rmeasure=Csnippet("
  N1_1 = rnbinom_mu(1/theta,p*N1);
  N1_2 = rnbinom_mu(1/theta,p*N1);
  N1_3 = rnbinom_mu(1/theta,p*N1);
  N2_1 = rnbinom_mu(1/theta,p*N2);
  N2_2 = rnbinom_mu(1/theta,p*N2);
  N2_3 = rnbinom_mu(1/theta,p*N2);"
  ),
  statenames=c("N1","N2"),
  paramnames=c("p","theta","r1","r2","a12","a21","K1","K2","N1_0","N2_0")
  ) -> po

Csnippet("
  lik = dnbinom_mu(N1_1,1/theta,p*N1,1)+
  dnbinom_mu(N1_2,1/theta,p*N1,1)+
  dnbinom_mu(N1_3,1/theta,p*N1,1)+
  dnbinom_mu(N2_1,1/theta,p*N2,1)+
  dnbinom_mu(N2_2,1/theta,p*N2,1)+
  dnbinom_mu(N2_3,1/theta,p*N2,1);
")->dmeas

po |> traj_objfun( params=c(K1=180,K2=100,N1_0=10,N2_0=10,r1=0.5,r2=0.5,
  a12=0.3,a21=0.4,theta=0.9,p=0.8),
  dmeasures = dmeas,
  partrans=parameter_trans(
    log="theta",
    logit=c("p", "r1", "r2", "a12", "21")
  ),
  est=c("r1","r2","a12","a21","theta","p"),
  statenames=c("N1","N2"),
  paramnames=c("r1","r2","a12","a21","theta","p"),) -> fit
summary(fit)
coef(fit)

But errors come up firstly from the partrans function with the following error about a parameter of "T_21" which never appears. Removing the partrans, the code did run but then fit gives the same parameter as params, i.e. no estimation is done. I am wondering if you have insights in this issue. Thank you in advance!

> Error: in ‘traj_objfun’: error in building shared-object library from C snippets: in ‘Cbuilder’: compilation error: cannot compile shared-object library ‘C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.dll’: status = 1
> compiler messages:
> gcc  -I"C:/UsersAppData/Local/Programs/R/R-42~1.2/include" -DNDEBUG     -I"c:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -std=gnu99 -mfpmath=sse -msse2 -mstackrealign  -c C:/Users//AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.c -o C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.o
> C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.c: In function '__pomp_to_trans':
> C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.c:33:3: error: 'T_21' undeclared (first use in this function); did you mean 
> In addition: Warning message:
> In system2(command = R.home("bin/R"), args = c("CMD", "SHLIB", "-c",  :
>   running command '"C:/Users/AppData/Local/Programs/R/R-4.2.2/bin/R" PKG_CPPFLAGS="-I"C:/Users/AppData/Local/Programs/R/R-4.2.2/library/pomp/include" -I"C:/Users/Downloads"" CMD SHLIB -c -o C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.dll C:/Users/AppData/Local/Temp/RtmpSyvx3A/15496/pomp_424be9d117fad0c780cfd4a4c4e883e3.c' had status 1
kingaa commented 11 months ago

First of all, I point you to FAQ 1.4. In particular, it appears that the code you are running is from a very old version of pomp and much has changed. Many of these were in the transition to version 2. These are described in the pomp version 2 upgrade guide linked there.

However, not all of the problems were there. Some are due to simple typos (for example you refer to a parameter with the invalid name 21 in your call to parameter_trans. (Incidentally, this is the source of the specific error you mentioned). There are other mistakes as well. Rather than explain these all, I simply ask you to compare your code with the following.

library(tidyverse)
library(pomp)
stopifnot(packageVersion("pomp")>="5.4")
theme_set(theme_bw())
set.seed(1173439184)

Download the data from here: LV_data.csv

read_csv("LV_data.csv") |>
  rename(time=Time,species=Species) |>
  group_by(species,time) |>
  mutate(rep=seq_along(time)) |>
  ungroup() |>
  arrange(time,rep,species) -> dat

dat |>
  ggplot(aes(x=time,y=counts,color=species,
    fill=species,group=species))+
  geom_point(size=2)+
  geom_smooth(se=FALSE,linetype=2)+
  labs(y="counts")

Rplot001

dat |>
  unite(col="name",species,rep) |>
  pivot_wider(values_from="counts") |>
  arrange(time) |>
  traj_objfun(
    times="time",
    t0=0,
    skeleton=vectorfield(
      Csnippet("
      DN1 = r1*N1*(K1-N1-a12*N2)/K1;
      DN2 = r2*N2*(K2-N2-a21*N1)/K2;"
      )
    ),
    rinit=Csnippet("
      N1 = N1_0;
      N2 = N2_0;"
    ),
    dmeasure=Csnippet("
      lik = dnbinom_mu(N1_1,1/theta,p*N1,1)+
      dnbinom_mu(N1_2,1/theta,p*N1,1)+
      dnbinom_mu(N1_3,1/theta,p*N1,1)+
      dnbinom_mu(N2_1,1/theta,p*N2,1)+
      dnbinom_mu(N2_2,1/theta,p*N2,1)+
      dnbinom_mu(N2_3,1/theta,p*N2,1);
      lik = (give_log) ? lik : exp(lik);"
    ),
    rmeasure=Csnippet("
      N1_1 = rnbinom_mu(1/theta,p*N1);
      N1_2 = rnbinom_mu(1/theta,p*N1);
      N1_3 = rnbinom_mu(1/theta,p*N1);
      N2_1 = rnbinom_mu(1/theta,p*N2);
      N2_2 = rnbinom_mu(1/theta,p*N2);
      N2_3 = rnbinom_mu(1/theta,p*N2);"
    ),
    partrans=parameter_trans(
      log=c("theta","r1","r2", "a12", "a21", "K1", "K2"),
      logit=c("p")
    ),
    params=c(
      N1_0=10,N2_0=10,
      r1=0.5,r2=0.5,
      K1=180,K2=100,
      a12=0.3,a21=0.4,
      theta=0.9,p=0.8
    ),
    est=c("r1","r2","a12","a21","theta","p","K1","K2"),
    statenames=c("N1","N2"),
    paramnames=c("p","theta","r1","r2","a12","a21","K1","K2","N1_0","N2_0")
  ) -> fn

library(subplex)
subplex(
  fn=fn,
  par=coef(fn,c("r1","r2","a12","a21","theta","p","K1","K2"),transform=TRUE),
  control=list(maxit=50000)
) -> fit

fit
fn(fit$par)
coef(fn)

fn |>
  trajectory(
    times=seq(0,200,by=1),
    format="data.frame"
  ) |>
  pivot_longer(
    c(N1,N2),
    names_to="species",values_to="counts"
  ) |>
  mutate(
    counts=coef(fn,"p")*counts
  ) -> fitted

dat |>
  ggplot(aes(x=time,y=counts,color=species,
    fill=species,group=species))+
  geom_point(size=2)+
  geom_smooth(se=FALSE,linetype=2)+
  geom_line(data=fitted)+
  labs(y="counts")

Rplot002

Also, note that no optimization is performed because you did not ask for any to be done. traj_objfun merely constructs an objective function. You must run an optimizer of some sort to actually find its minima. See the call to subplex in the above.

kingaa commented 11 months ago

I am closing this issue. Do feel free to reopen if more discussion is warranted.

zz100chan commented 10 months ago

@kingaa Thank you so much and sorry for the late reply as I was trying to adapt the code to my proj. And it is working well! However, I am also trying to adapt this from continuous time into discrete time. And from the tutorial of , it seems like the only the difference between continuous and discrete is how the skeleton was made. So I adjusted the skeleton from this code to the following skeleton=map( Csnippet(" DN1 = r1*N1*(K1-N1-a12*N2)/K1; DN2 = r2*N2*(K2-N2-a21*N1)/K2;" ), delta.t=20 ), And l left the rest the same same as above.

And I obviously I have missed something because fitting results was wrong. image

I wonder if you would have any idea where I should adjust?

kingaa commented 10 months ago

Is there a link missing from your post?

What are you expecting to see?

zz100chan commented 10 months ago

@kingaa I am not so sure what link you were mentioning... Do you mean the graph I attached? or do you mean code? I used the same code you posted earlier and adjusted the skeleton only.

From continuous to discrete, I was expecting similar fitting results but with discrete prediction (on a time scale of 20) from the model. But, obviously I missed some important stuff in the model and the results went wrong... I tried to look up the tutorials in the website but most of them are related to the simulation with a rprocess model For trajectory matching with only rmeasure and dmeasure, I couldn't find any example....

kingaa commented 10 months ago

And from the tutorial of , it seems

It sounds like you may need to read up on discrete vs continuous dynamics. Also the manual pages. Did you look in the Getting Started vignette?

Have you verified that the results you see are in fact incorrect?

zz100chan commented 10 months ago

@kingaa Thanks for the quick reply! Yes, I look up the Getting Started vignette and changed from vectorfield(f) to map(f, delta.t) according to the manual. However, that is the only difference I noticed.

I wouldn't say the results were incorrect as there were no error coming up. But the results are not making sense as they are up to 3e170 for the prey count. That's why I suspect that I missed something in the code. Below is the full code I used, which is almost the same as your code above, except changing the vectorfield() to map() in the skeleton

library(tidyverse)
library(pomp)
#library(dplyr)
stopifnot(packageVersion("pomp")>="5.4")
theme_set(theme_bw())
set.seed(1173439184)

read_csv("LV_data.csv")  |>
  rename(time=Time,species=Species) |>
  group_by(species,time) |>
  mutate(rep=seq_along(time)) |>
  ungroup() |>
  arrange(time,rep,species) -> dat

dat |>
  ggplot(aes(x=time,y=counts,color=species,
             fill=species,group=species))+
  geom_point(size=2)+
  geom_smooth(se=FALSE,linetype=2)+
  labs(y="counts")

dat |>
  unite(col="name",species,rep) |>
  pivot_wider(values_from="counts") |>
  arrange(time) |>
  traj_objfun(
    times="time",
    t0=0,
    skeleton=map(
      Csnippet("
      DN1 = r1*N1*(K1-N1-a12*N2)/K1;
      DN2 = r2*N2*(K2-N2-a21*N1)/K2;"
      ), delta.t=20
    ),
    rinit=Csnippet("
      N1 = N1_0;
      N2 = N2_0;"
    ),
    dmeasure=Csnippet("
      lik = dnbinom_mu(N1_1,1/theta,p*N1,1)+
      dnbinom_mu(N1_2,1/theta,p*N1,1)+
      dnbinom_mu(N1_3,1/theta,p*N1,1)+
      dnbinom_mu(N2_1,1/theta,p*N2,1)+
      dnbinom_mu(N2_2,1/theta,p*N2,1)+
      dnbinom_mu(N2_3,1/theta,p*N2,1);
      lik = (give_log) ? lik : exp(lik);"
    ),
    rmeasure=Csnippet("
      N1_1 = rnbinom_mu(1/theta,p*N1);
      N1_2 = rnbinom_mu(1/theta,p*N1);
      N1_3 = rnbinom_mu(1/theta,p*N1);
      N2_1 = rnbinom_mu(1/theta,p*N2);
      N2_2 = rnbinom_mu(1/theta,p*N2);
      N2_3 = rnbinom_mu(1/theta,p*N2);"
    ),
    partrans=parameter_trans(
      log=c("theta","r1","r2", "a12", "a21", "K1", "K2"),
      logit=c("p")
    ),
    params=c(
      N1_0=10,N2_0=10,
      r1=0.5,r2=0.5,
      K1=180,K2=100,
      a12=0.3,a21=0.4,
      theta=0.9,p=0.8
    ),
    est=c("r1","r2","a12","a21","theta","p","K1","K2"),
    statenames=c("N1","N2"),
    paramnames=c("p","theta","r1","r2","a12","a21","K1","K2","N1_0","N2_0")
  ) -> fn

library(subplex)
subplex(
  fn=fn,
  par=coef(fn,c("r1","r2","a12","a21","theta","p","K1","K2"),transform=TRUE),
  control=list(maxit=50000)
) -> fit

fit
fn(fit$par)
coef(fn)

fn |>
  trajectory(
    times=seq(0,200,by=20),
    format="data.frame"
  ) |>
  pivot_longer(
    c(N1,N2),
    names_to="species",values_to="counts"
  ) |>
  mutate(
    counts=coef(fn,"p")*counts
  ) -> fitted

dat |>
  ggplot(aes(x=time,y=counts,color=species,
             fill=species,group=species))+
  geom_point(size=2)+
  geom_smooth(se=FALSE,linetype=2)+
  geom_line(data=fitted)+
  labs(y="counts")
kingaa commented 10 months ago

It looks to me like you need to improve your basic understanding of dynamical systems theory. I am not in a position to undertake to train you in this. Seek out a teacher or an introductory textbook on dynamical systems or nonlinear dynamics, and learn about discrete-time and continuous-time dynamical systems.

A word of advice: never make the mistake of thinking that because some computer code that you don't understand did not throw an error, that the results were correct. In this case, you don't seem to have a clear idea of what a correct result would look like.