rmeas and dmeas for random population samples (not incidence) data

kingaa / pomp

R package for statistical inference using partially observed Markov processes

https://kingaa.github.io/pomp

GNU General Public License v3.0

114 stars 27 forks source link

rmeas and dmeas for random population samples (not incidence) data #126

Closed mayagroner closed 3 years ago

mayagroner commented 4 years ago

I’m hoping that you may be able to provide some advice about how to set up my measurement model. My datasets do not include incidence data as shown in most of the papers I have looked. Instead, I have time series data for 3 randomly selected sets of 60 fish that were tested for infection presence/absence. We also have estimates of the total population size at each timepoint. The disease is very slow progressing, so samples are taken yearly over 14 years. I am stuck on how to code the measurement model. If you have any advice and/or could point me towards a paper that has a similar data structure, that would be helpful. Many thanks!

kingaa commented 4 years ago

It sounds like it should be possible to do this. If I understand you correctly, you have samples of fish and your presence/absence assay is a direct estimator of prevalence of infection. For example, suppose you have a sample of size n and you observe k infected fish in the sample. Assuming the test is perfect, the sample size n is much smaller than the full population size, and that infection doesn't alter the probability of being sampled, then k ~ Binomial(n,p), where p is the prevalence of infection. Perhaps this is a place to begin the conversation.

You mention that you have multiple samples. How are these related to one another?

mayagroner commented 4 years ago

Yes. You are correct. We have estimates of infection prevalence and the binomial sampling design makes sense. The sample size is ~5-6 orders of magnitude lower than the population size (depending on the year, these populations are highly variable across years, something that we think may be contributing to the infection dynamics). Typically we sample the population three times at three locations each spring using a cast net or purse seine. I consider this a nested sampling design where is a random effect of sampling event (e.g., fish caught in the same net may not be independent). Thank you again for your help!

kingaa commented 4 years ago

How do you think about the random effect of sampling event? How are the fish caught in one net more or less similar to fish caught in different nets, or not at all?

Also, you mentioned that you have data on the population size itself. What are the errors associated with those data?

You can see that the questions I'm asking are designed to help you focus on modeling the measurement error. You may be able to come up with a mechanistic model of the measurement error. While such a model would be interesting, there is of course no guarantee that it would explain the data better than another model. For this reason, I'd recommend comparing it with a more flexible model.

mayagroner commented 4 years ago

We see some variation in age among the fish caught in each sampling event and we know there is variation in prevalence across age groups. (Eventually, I think an age-structured disease model may be necessary, but I think a basic SI model is a good place to start). There may be other sources of variation that we have not assessed (e.g., condition of fish- perhaps healthier fish aggregate towards the front of the schools and less healthy fish towards the back). I don't think we understand the schooling behavior well enough to make a mechanistic model of the measurement error. We don't think that there is any bias in terms of catch rates. The prevalence within a sampled net is representative of the prevalence in the local area.

Our population estimates come from a stock assessment model that is informed by multiple data sources. We have estimates and 95% CIs associated with each age class for each study year.

kingaa commented 4 years ago

What would you think, then, about starting with a Binomial model for the prevalence -> sample positive connection, and a normal or lognormal model for the population estimate error? I don't know much about stock assessments....

However, perhaps we should be worried if we use anything other than a Poisson model for fish. ;-)

mayagroner commented 4 years ago

That sounds like a good start! Thank you for the help and the joke.

mayagroner commented 4 years ago

Would this be the right way to code that? For the data, popdata is the population size data, sigmadata is standard deviation of the population size data, positive are the disease positive samples, and negative are the disease negative samples.

I have several lik terms in si_dmeas. Should these be named lika, likb, likc, likd? I'm not sure how to tell the model to maximize the likelihood against all of these sources of data.

Thank you again for your help.

si_dmeas <- Csnippet("
  lik = dbinom(positive,H,rho,give_log);
  lik = dbinom(negative,pop-H,rho,give_log);
  lik = dlnorm(popdata, exp(pop), sigmadata, give_log);
  lik = dlnorm(sigmadata, exp(sigma), sd, give_log);
 ")

si_rmeas <- Csnippet("
  popdata = rlnorm(exp(pop), sigma);
  sigmadata = rlnorm(exp(sigma), sd);  
  positive = rbinom(H,rho);
  negative = rbinom(pop-H, rho)
  ")

kingaa commented 4 years ago

Have a look at the following.

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive,positive+negative,I/pop,give_log);
  lik2 = dlnorm(popdata, log(pop), sigmadata, give_log);
  if (give_log) lik = lik1+lik2; else lik = lik1*lik2;
 ")

A few things to note:

I create two local double-precision variables, lik1 and lik2 to temporarily hold two components of the likelihood. These get filled with likelihood or log-likelihood as requested (according to the give_log flag). Accordingly, the lik is set equal to the product or the sum, respectively.
I'm assuming the latent number of infectives is a variable called I and the total population size is in a variable called pop. [I think maybe in your model, pop is the log of the population size, but I don't know. You didn't mention what the state variables signify.]
The above assumes that sigmadata is somehow available (as a state variable, a parameter, a covariate, or a datum). Since it's not being treated as informative per se, it logically makes the most sense to think of it as a parameter (if its a constant) or a covariate (if it varies with time).
The lognormal assumption assumes that log(popdata) is normally distributed with mean log(pop) and s.d. sigmadata.

The above assumes that positive and negative are both data variables. When we come to try and write the rmeasure component, we will see that this is not very convenient. Since positive+negative is equal to the sample size, it makes sense to treat the latter as a covariate, assuming it is not itself informative. If we do this, letting samplesize = positive + negative, and assuming it's a covariate, we have the following.

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive,samplesize,I/pop,give_log);
  lik2 = dlnorm(popdata, log(pop), sigmadata, give_log);
  if (give_log) lik = lik1+lik2; else lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/pop);
  popdata = rlnorm(popdata, log(pop), sigmadata, give_log);
 ")

If, on the other hand, the sample size is itself informative (perhaps about the population size), then one should model that too.

kingaa commented 4 years ago

Closing this now, but feel free to re-open if more discussion is warranted.

mayagroner commented 4 years ago

Thank you again for your help. I have a simple model coded using the dmeas and rmeas code you suggested, but I am getting some errors when I try to construct the pomp object. I've gone over it several times and I am not sure where the error is, but I have posted the code and error message below. The data that go with the code are positive (# positives), sample size (# number tested for disease), maturepopmean and maturepopsd (the estimated population size and standard deviation), and recruitment and recruitment sd (the number and sd of individuals recruiting into the mature population). I will directly email you the datasets associated with these data. Thank you again. The learning curve for this package is steep and the help is very appreciated!

The code:

library(doParallel)
library(doRNG)
library(readr)
registerDoParallel()
registerDoRNG(2488820)

library(tidyverse)

read_csv(paste0("C:/Users/mayag/Dropbox/Ichthyophonus/Ich mort model/pomp model/PWS disease survey.csv")) -> meas
meas %>% as.data.frame() %>% head()

#plot sample prevalence
meas %>%
  ggplot(aes(x=year,y=positive/samplesize))+
  geom_line()+
  geom_point()

#plot mature population
meas %>%
  ggplot(aes(x=year,y=maturepopmean))+
  geom_line()+
  geom_point()

read_csv(paste0("C:/Users/mayag/Dropbox/Ichthyophonus/Ich mort model/pomp model/PWS recruitment.csv")) -> covartable

covartable<-covartable[covartable$year>2006,]
covartable<-covartable[covartable$year<2019,]

#plot recruitment
covartable %>%
  ggplot(aes(x=year,y=recruitment))+
  geom_line()+
  geom_point()

library(pomp)

rprocSI <- Csnippet("
  double beta, rec, foi, mu, mud, pos, pop, q;
  double rate[3], trans[3];

  foi = beta*S*I/(pop^q); //q determines the degree to which frequency or density dependence is occurring (q = 0 for dd, q = 1 for fd)

  rate[0] = foi;  //force of infection
  rate[1] = mu;         // natural death
  rate[2] = mud;         // disease death

  // Recruitment is randomly picked from the distribution provided in the data
  rec = rnorm(recruitment*dt, recruitmentsd*dt);

  // transitions between classes
  reulermultinom(2,S,&rate[0],dt,&trans[0]);

  S += rec*(1-pos)    - trans[0] - trans[1]; //recruits can be disease positive or negative
  I += rec*(pos)+trans[0] - trans[1] - trans[2];
  I = pop - S;               // Is this ok to have I included twice? I need to include pop in here somewhere
  H += trans[0];           // true incidence, this may not be necessary to record
")

si_rinit <- Csnippet("
  S = nearbyint(S_0);
  I = nearbyint(pop-S_0);
  H = 0;
 ")

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive,samplesize,I/pop, give_log);
  lik2 = dlnorm(maturepopmean, log(pop), maturepopsd, give_log);
  if (give_log) lik = lik1+lik2; else lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/pop);
  maturepopmean = rlnorm(maturepopmean, log(pop), maturepopsd, give_log);
 ")

#---------------------------------------------
pt <- parameter_trans(
  log=c("pop", "foi", "mu", "mud", "beta", "rec"),
  logit=c("pos", "q")
)

meas %>%
  pomp(time="year", t0=0, rprocess=euler(rprocSI,delta.t=1/12),
       rinit=si_rinit,
       rmeasure=si_rmeas,
       dmeasure=si_dmeas,
       accumvars=c("H"),
       covar=covariate_table(covartable,times="year"),
       obsnames = c("maturepopmean", "maturepopsd", "positive", "samplesize"),
       statenames=c("S", "I", "H"),
       paramnames=c("beta", "pop", "rec", "foi", "pos","mu", "mud", "q", "S_0"), 
       cdir=".", cfile="measSI" 
  ) -> measSI

Associated error message:

Error: error in building shared-object library from C snippets: in ‘Cbuilder’: compilation error: cannot compile shared-object library ‘./measSI.dll’: status = 1
compiler messages:
"C:/rtools40/mingw64/bin/"gcc  -I"C:/PROGRA~1/R/R-40~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c measSI.c -o measSI.o
measSI.c:12: warning: "beta" redefined
 #define beta  (__p[__parindex[0]])

In file included from C:/Users/mayag/Documents/R/win-library/4.0/pomp/include/pomp.h:9,
                 from measSI.c:5:
C:/PROGRA~1/R/R-40~1.3/include/Rmath.h:212: note: this is the location of the previous definition
 #define beta  Rf_beta

measSI.c: In function '__pomp_stepfn':
measSI.c:52:16: error: '__p' redeclared as different kind of symbol
 #define beta  (__p[__parindex[0]])
                ^~~
measSI.c:70:10: note: in expansion of macro 'beta'
   double beta, rec, foi, mu, mud, pos, pop, q;
          ^~~~
measSI.c:67:48: note: previous definition of '__p' was
In addition: Warning message:
In system2(command = R.home("bin/R"), args = c("CMD", "SHLIB", "-c",  :
  running command '"C:/PROGRA~1/R/R-40~1.3/bin/R" PKG_CPPFLAGS="-IC:/Users/mayag/Documents/R/win-library/4.0/pomp/include -IC:/Users/mayag/Documents/pomp SI" CMD SHLIB -c -o ./measSI.dll ./measSI.c' had status 1

kingaa commented 4 years ago

The error you got is just due to a name conflict between your parameter beta and the beta function, which is defined in Rmath.h and documented in the R extensions manual. I'd suggest renaming your parameter. I usually use Beta, for example. This one should go into the FAQ.

Debugging is an iterative process: usually code execution stops at the first problem. Fixing it uncovers the second problem, and so on. So I copied your code (and your data, which you sent to me privately) to a new directory and began editing your file in the debugging process. This risks depriving you of an important learning experience, so I hope you go through the code below line by line to see the changes I made. All of these are for a reason: feel free to ask about any of them, though you will probably figure most of them out yourself. For example, you can replace any of my edits by the original line in your code and see what happens.

A couple of remarks about pomp and C are in order.

Inside a C snippet, you should only declare local variables, i.e., temporary variables that you use inside the C snippet. Model parameters, state variables, covariates, and data variables are pre-defined. In addition, there is usually a pre-defined time variable, t, and sometimes a variable lik and a flag give_log are pre-defined. Read about C snippets in the package manual, where I attempt to make all this clear.
More generally, it is important to keep in mind that state variables, covariates, data variables, and model parameters are different kinds of quantities, treated differently in the statistical theory and, therefore, in the algorithms. For each quantity you have, make sure you always bear in mind which of these it is. For example, it doesn't make sense to try to transform covariates or intermediate quantities: only parameters are transformed when partrans is specified.
Some functions that you call in a C snippet come from R's C API. These include, for example, rlnorm and dlnorm. Look up the call sequence for these functions in the R extensions manual.

library(tidyverse)

read_csv("PWS_disease_survey.csv") -> meas

##plot sample prevalence
meas %>%
  ggplot(aes(x=year,y=positive/samplesize))+
  geom_line()+
  geom_point()

##plot mature population
meas %>%
  ggplot(aes(x=year,y=maturepopmean))+
  geom_line()+
  geom_point()

read_csv("PWS_recruitment.csv") %>%
  filter(year>2006,year<2019) -> covartable

##plot recruitment
covartable %>%
  ggplot(aes(x=year,y=recruitment))+
  geom_line()+
  geom_point()

library(pomp)

si_rinit <- Csnippet("
  S = nearbyint(S_0);
  I = nearbyint(pop-S_0);
  H = 0;
 ")

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive,samplesize,I/pop, give_log);
  lik2 = dlnorm(maturepopmean, log(pop), maturepopsd, give_log);
  if (give_log) lik = lik1+lik2; else lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/pop);
  maturepopmean = rlnorm(log(pop), maturepopsd);
 ")

rprocSI <- Csnippet("
  double rate[3], trans[3];

  double foi = Beta*S*I/(pow(pop,q)); //q determines the degree to which frequency or density dependence is occurring (q = 0 for dd, q = 1 for fd)

  rate[0] = foi;  //force of infection
  rate[1] = mu;         // natural death
  rate[2] = mud;         // disease death

  // Recruitment is randomly picked from the distribution provided in the data
  double rec = rnorm(recruitment*dt, recruitmentsd*dt);

  // transitions between classes
  reulermultinom(2,S,&rate[0],dt,&trans[0]);

  S += rec*(1-pos) - trans[0] - trans[1]; //recruits can be disease positive or negative
  I += rec*pos + trans[0] - trans[1] - trans[2];
  I = pop - S;  // NOTE THAT THIS LINE UNDOES THE EFFECT OF THE PREVIOUS LINE.
  H += trans[0];  // true incidence, this may not be necessary to record
")

meas %>%
  pomp(time="year", t0=0,
    rprocess=euler(rprocSI,delta.t=1/12),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    accumvars=c("H"),
    partrans=parameter_trans(
      log=c("pop", "mu", "mud", "Beta"),
      logit=c("pos", "q")
    ),
    covar=covariate_table(covartable,times="year"),
    statenames=c("S", "I", "H"),
    paramnames=c("Beta", "pop", "pos","mu", "mud", "q", "S_0"), 
    cdir=".", cfile="measSI" 
  ) -> measSI

mayagroner commented 4 years ago

Thank you very much! I think much of my confusion was associated with what can and cannot be declared within a Csnippet. I am going through the code now to understand how my errors are related to specific error messages.

One follow-up question is regarding the simulate function. I read through and addressed the suggestions in faq 4.5 (e.g., my covarariates tables extends from t(0) to several timepoint past t(N)), but I am still receiving error messages relating to how the times in the covariate table are read. I receive the following error code when I construct the pomp object.

Warning message:
in ‘pomp’: the supplied covariate times do not embrace the data times: covariates may be extrapolated.

I have set up the recruitment (my covariate) so that it is 0 for all but one of my timepoints within a year to reflect the pulsed nature of fish recruitment. I don't think this is the cause of the problem, because I still receive the error message when I don't do this.

This is the code I have been using (still with the data I sent you privately).

library(tidyverse)
require(readr)

read_csv("PWS_disease_survey.csv") -> meas

##plot sample prevalence
meas %>%
  ggplot(aes(x=year,y=positive/samplesize))+
  geom_line()+
  geom_point()

##plot mature population
meas %>%
  ggplot(aes(x=year,y=maturepopmean))+
  geom_line()+
  geom_point()

read_csv("PWS_recruitment.csv") %>%
  filter(year>2005)-> covartable 

covartable2<-as.data.frame(seq(min(covartable$year), max(covartable$year), by = 1/12))
names(covartable2)[1] <- "year"
covartable2$recruitment<-0
covartable2$recruitmentsd<-0

covartable2$year<-ifelse(covartable2$year-floor(covartable2)==0,NA, covartable2$year)
covartable2<-covartable2[complete.cases(covartable2), ]
covartable<-rbind(covartable, covartable2)
covartable <- covartable[order(covartable$year), ]

##plot recruitment, it should be a single pulse each year
covartable %>%
  ggplot(aes(x=year,y=recruitment))+
  geom_line()+
  geom_point()

library(pomp)

si_rinit <- Csnippet("
  S = nearbyint(S_0);
  I = nearbyint(pop-S_0);
  H = 0;
 ")

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive,samplesize,I/pop, give_log);
  lik2 = dlnorm(maturepopmean, log(pop), maturepopsd, give_log);
  if (give_log) lik = lik1+lik2; else lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/pop);
  maturepopmean = rlnorm(log(pop), maturepopsd);
 ")

rprocSI <- Csnippet("
  double rate[3], trans[3], rec;

  double foi = Beta*S*I/(pow(pop,q)); //q determines the degree to which frequency or density dependence is occurring (q = 0 for dd, q = 1 for fd)

  rate[0] = foi;  //force of infection
  rate[1] = mu;         // natural death
  rate[2] = mud;         // disease death

  // Recruitment is randomly picked from the distribution provided in the data

    rec = rnorm(recruitment, recruitmentsd);

  // transitions between classes
  reulermultinom(2,S,&rate[0],dt,&trans[0]);

  S += rec*(1-pos) - trans[0] - trans[1]; //recruits can be disease positive or negative
  I += rec*pos + trans[0] - trans[1] - trans[2];
  H += trans[0];  // true incidence, this may not be necessary to record
")

meas %>%
  pomp(time="year", t0=0,
       rprocess=euler(rprocSI,delta.t=1/12),
       rinit=si_rinit,
       rmeasure=si_rmeas,
       dmeasure=si_dmeas,
       accumvars=c("H"),
       partrans=parameter_trans(
         log=c("pop", "mu", "mud", "Beta"),
         logit=c("pos", "q")
       ),
       covar=covariate_table(covartable,times="year"),
       statenames=c("S", "I", "H"),
       paramnames=c("Beta", "pop", "pos","mu", "mud", "q", "S_0"), 
       cdir=".", cfile="measSI" 
  ) -> measSI

measSI %>%
  simulate(
    params=c(Beta=1.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, q=0.8, S_0=103871215),
    nsim=20,format="data.frame",include.data=TRUE
  ) -> sims

kingaa commented 4 years ago

There are quite a number of interesting issues that come up in this context. These range from simple matters to do with the code to more delicate issues to do with the way you think about the data.

Have a look at the following codes, which we can discuss.

library(tidyverse)
require(readr)

read_csv("PWS_disease_survey.csv") %>%
  mutate(
    msd = log(1+maturepopsd/maturepopmean)
  ) %>%
  select(-maturepopsd,-negative) -> meas

I'm assuming a log-normal model for the errors in population-size estimates and recruitment.

##plot sample prevalence
meas %>%
  ggplot(aes(
    x=year,y=positive/samplesize,
    ymin=qbeta(p=0.25,shape1=positive,shape2=negative+1),
    ymax=qbeta(p=0.75,shape1=positive+1,shape2=negative)))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs(y="prevalence")

##plot mature population
meas %>%
  ggplot(aes(
    x=year,y=maturepopmean,
    ymin=qlnorm(p=0.25,meanlog=log(maturepopmean),sdlog=msd),
    ymax=qlnorm(p=0.75,meanlog=log(maturepopmean),sdlog=msd)
  ))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs("mature population size")

In your model, recruitment and its variability are really covariates, not data. So is the sample-size and the s.d. of the population size estimator. Accordingly, I put these into the covariate table, and remove them from the data.

read_csv("PWS_recruitment.csv") %>%
  filter(year>2005) %>%
  mutate(
    rsd=log(1+recruitmentsd/recruitment)
  ) %>%
  select(-recruitmentsd) %>%
  left_join(
    meas %>%
      select(year,samplesize,msd),
    by="year"
  ) -> covartable

meas %>%
  select(-samplesize,-msd) -> meas

covartable %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qlnorm(p=0.25,meanlog=log(recruitment),sdlog=rsd),
      ymax=qlnorm(p=0.75,meanlog=log(recruitment),sdlog=rsd)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()

Now, you want your recruitment to happen in a pulse, once per year. In the following, this pulse arrives during the month of January.

covartable %>%
  select(year) %>%
  expand_grid(month=1:12) %>%
  mutate(
    time=year+(month-1)/12
  ) %>%
  select(-month,-year) %>%
  left_join(covartable,by=c("time"="year")) %>%
  rename(year=time) %>%
  mutate(
    recruitment=coalesce(recruitment,0),
    rsd=coalesce(rsd,0),
    samplesize=coalesce(samplesize,0),
    msd=coalesce(msd,0)
  ) %>%
  arrange(year) -> covartable

##plot recruitment, it should be a single pulse each year
covartable %>%
  ggplot(aes(x=year,y=recruitment))+
  geom_step()+
  geom_point()

In debugging your code, I reparameterized the initial condition.

library(pomp)

si_rinit <- Csnippet("
  S = nearbyint(pop*sfrac);
  I = nearbyint(pop*(1-sfrac));
 ")

The measurement model below conforms (I think) more closely to the model you have in mind.

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive, samplesize, I/(S+I), give_log);
  lik2 = dlnorm(maturepopmean, log(S+I), msd, give_log);
  if (give_log)
    lik = lik1+lik2;
  else
    lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/(S+I));
  maturepopmean = rlnorm(log(S+I), msd);
 ")

Read up on the Euler-multinomial distribution and what it is useful for. For example, here and here.

I've dispensed, for the moment, with your frequency- to density-dependent continuum. It's an interesting idea. I would recommend giving some serious thought to how you parameterize this continuum.

rprocSI <- Csnippet("
  double rate[4], trans[4], rec;

  double foi = Beta*I/(I+S);

  rate[0] = foi;  //force of infection
  rate[1] = mu;   // natural death
  rate[2] = mu;   // natural death
  rate[3] = mud;  // disease death

  // Recruitment is randomly picked from the distribution provided in the data
  rec = rlnorm(log(recruitment), rsd);

  // transitions between classes
  reulermultinom(2,S,&rate[0],dt,&trans[0]);
  reulermultinom(2,I,&rate[2],dt,&trans[2]);

  //recruits can be disease positive or negative
  S += nearbyint(rec*(1-pos)) - trans[0] - trans[1];
  I += nearbyint(rec*pos) + trans[0] - trans[2] - trans[3];
")

Building the pomp object, note the choice of t0.

meas %>%
  pomp(
    time="year", t0=2006,
    rprocess=euler(rprocSI,delta.t=1/120),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    covar=covariate_table(covartable,times="year",order="constant"),
    statenames=c("S", "I"),
    paramnames=c("Beta", "pop", "pos", "mu", "mud", "sfrac"),
    cdir=".", cfile="measSI"
  ) -> measSI

measSI %>%
  as_tibble() %>%
  gather(var,val,-year) %>%
  ggplot(aes(x=year,y=val,group=var))+
  geom_line()+
  facet_wrap(~var,scales="free_y")

Let's examine a simulation.

measSI %>%
  simulate(
    nsim=1,
    params=c(
      Beta=0.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, sfrac=0.7
    )
  )  %>%
  as_tibble() %>%
  gather(var,val,-year) %>%
  ggplot(aes(x=year,y=val,group=var))+
  geom_line()+
  facet_wrap(~var,scales="free_y")

The particle filter reveals that these parameters are not very good.

measSI %>%
  pfilter(
    Np=1000,
    params=c(
      Beta=0.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, sfrac=0.7
    )
  ) %>%
  plot()

IF2 is usually pretty good at rapidly improving parameters. The following runs 50 IF2 iterations, exploring a 4-parameter subspace of the parameter space.

measSI %>%
  mif2(
    Nmif=50,
    Np=1000,
    params=c(
      Beta=0.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, sfrac=0.7
    ),
    partrans=parameter_trans(
      log=c("Beta","mu","mud"),
      logit="pos"
    ),
    paramnames=c("Beta","mu","mud","pos"),
    rw.sd=rw.sd(Beta=0.02,pos=ivp(0.02),mu=0.02,mud=0.02),
    cooling.fraction.50=0.5
  ) -> mf1

mf1 %>%
  traces() %>%
  melt() %>%
  ggplot(aes(x=iteration,y=value,group=variable))+
  geom_line()+
  facet_wrap(~variable,scales="free_y")

mf1 %>% coef()

What do simulations at these parameters look like?

mf1 %>%
  simulate(
    nsim=20,
    format="data.frame",include.data=TRUE
  ) -> sims

sims %>%
  select(year,rep=.id,maturepopmean,positive,samplesize) %>%
  gather(var,val,-year,-rep) %>%
  ggplot(aes(x=year,y=val,group=rep,color=rep=="data"))+
  geom_line()+
  facet_wrap(~var,scales="free_y",ncol=1)

kingaa commented 4 years ago

The big issue that deserves to be addressed is whether the demographic stochasticity you assume (by using Euler-multinomial distributions with deterministic rates for the infection and death processes) is really sufficient to explain the data.

mayagroner commented 4 years ago

Thank you so much for helping me to get this running. I think it may make sense to try using running the model with and without demographic stochasticity. This is a dynamic system and many of the sources of natural mortality fluctuate across years (e.g., predation, resource availability); these leads me to believe that an SDE will ultimately work better. As a first step, I have coded a deterministic model (below). I have offset the timing of the population measurement to reflect that the survey is done in April. Recruitment is still estimated to occur in January.

In both the SDE and deterministic models, the estimates for mortality are much higher than I would expect. We believe that the natural mortality rate is between 0.2 and 0.5, however the models are estimating values around 4. I don't know if this may be because of the way that the model is adding the recruitment? The value I have for recruitment is the total recruitment we expect in a year, but we have limited it to the month of January. Is it possible that the total value of the recruitment is added in each delta.t time step? This is my code for recruitment in the process model:

// Recruitment is randomly picked from the distribution provided in the data
  double rec = rlnorm(log(recruitment), rsd);

I've also tried coding it as below, without any impact on the results:

 double rec = rlnorm(log(recruitment*dt), rsd*dt);

Here is my full deterministic model:

library(tidyverse)
require(readr)
require(dplyr)
read_csv("PWS_disease_survey.csv") %>%
  mutate(
    msd = log(1+maturepopsd/maturepopmean)
  ) -> meas

meas$year<-meas$year+1/3

## add 0.33 to each year to reflect that the population survey occurs in April (note that recruitment still occurs in January)

## plot sample prevalence
meas %>%
  ggplot(aes(
    x=year,y=positive/samplesize,
    ymin=qbeta(p=0.25,shape1=positive,shape2=negative+1),
    ymax=qbeta(p=0.75,shape1=positive+1,shape2=negative)))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs(y="prevalence")+xlim(2006, 2018)+theme_bw()

##plot mature population
meas %>%
  ggplot(aes(
    x=year,y=maturepopmean,
    ymin=qlnorm(p=0.25,meanlog=log(maturepopmean),sdlog=msd),
    ymax=qlnorm(p=0.75,meanlog=log(maturepopmean),sdlog=msd)
  ))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs("mature population size")+xlim(2006, 2018)+theme_bw()

read_csv("PWS_recruitment.csv") %>%
  filter(year>2005) %>%
  mutate(
    rsd=log(1+recruitmentsd/recruitment)
  ) %>%
  select(-recruitmentsd) -> covartable

covs<-select(meas, c(samplesize, msd, year))
covartable<-merge(covartable, covs, all=TRUE)

meas %>%
  select(-samplesize,-msd) -> meas
covartable %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qlnorm(p=0.25,meanlog=log(recruitment),sdlog=rsd),
      ymax=qlnorm(p=0.75,meanlog=log(recruitment),sdlog=rsd)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+xlim(2006, 2018)+theme_bw()

## Now, you want your recruitment to happen in a pulse, once per year.
## In the following, this pulse arrives during the month of January.

covartable %>%
  select(year) %>%
  expand_grid(month=1:12) %>%
  mutate(
    time=year+(month-1)/12
  ) %>%
  select(-month,-year) %>%
  left_join(covartable,by=c("time"="year")) %>%
  mutate(
    recruitment=coalesce(recruitment,0),
    rsd=coalesce(rsd,0),
    samplesize=coalesce(samplesize,0),
    msd=coalesce(msd,0)
  ) %>%
  arrange(time)-> covartable

covartable$year<-covartable$time
covartable%>%
  select(-time)-> covartable
covartable$year<-signif(covartable$year, digits = 8)
covartable<-covartable[!duplicated(covartable$year),]

## plot recruitment, it should be a single pulse each year
covartable %>%
  ggplot(aes(x=year,y=recruitment))+
  geom_step()+
  geom_point()

library(pomp)

si_rinit <- Csnippet("
  S = nearbyint(pop*sfrac);// pop is initial population size
  I = nearbyint(pop*(1-sfrac));
 ")

si_dmeas <- Csnippet("
  double lik1, lik2;
  lik1 = dbinom(positive, samplesize, I/(S+I), give_log);
  lik2 = dlnorm(maturepopmean, log(S+I), msd, give_log);
  if (give_log)
    lik = lik1+lik2;
  else
    lik = lik1*lik2;
 ")

si_rmeas <- Csnippet("
  positive = rbinom(samplesize,I/(S+I));
  maturepopmean = rlnorm(log(S+I), msd);
 ")

rprocSI <- Csnippet("
  // Recruitment is randomly picked from the distribution provided in the data
  double rec = rlnorm(log(recruitment), rsd);

  //transition rates
  double foi = rbinom(S,1-exp(-Beta*(I)/(S+I)*dt));

  //mortality rates
  double MUS = rbinom(S,1-exp(-mu*dt));
  double MUI = rbinom(I,1-exp(-mu*dt));
  double MUD = rbinom(I,1-exp(-mud*dt));

  //recruits can be disease positive or negative

  S += nearbyint(rec*(1-pos)) - foi - MUS; 
  I += nearbyint(rec*pos) + foi - MUI - MUD; 

")

meas %>%
  pomp(
    time="year", t0=2006,
    rprocess=euler(rprocSI,delta.t=1/120),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    covar=covariate_table(covartable,times="year",order="constant"),
    statenames=c("S", "I"),
    paramnames=c("Beta", "pop", "pos", "mu", "mud", "sfrac"),
    cdir=".", cfile="measSI"
  ) -> measSI

measSI %>%
  as_tibble() %>%
  gather(var,val,-year) %>%
  ggplot(aes(x=year,y=val,group=var))+
  geom_line()+
  facet_wrap(~var,scales="free_y")

##Let's examine a simulation.

measSI %>%
  simulate(
    nsim=1,
    params=c(
      Beta=0.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, sfrac=0.7
    )
  )  %>%
  as_tibble() %>%
  gather(var,val,-year) %>%
  ggplot(aes(x=year,y=val,group=var))+
  geom_line()+
  facet_wrap(~var,scales="free_y")

  measSI %>%
  pfilter(
    Np=1000,
    params=c(
      Beta=0.5, pop=559213100, pos=0.15, mu=0.25, mud=0.25, sfrac=0.7
    )
  ) %>%
  plot()

## IF2 is usually pretty good at rapidly improving parameters. 
## The following runs 50 IF2 iterations, exploring a 4-parameter subspace of the parameter space.

measSI %>%
  mif2(
    Nmif=100,
    Np=1000,
    params=c(
      Beta=0.3, pop=100000000, pos=0.15, mu=1.3, mud=0.4, sfrac=0.55
    ),
    partrans=parameter_trans(
      log=c("Beta","mu", "mud", "pop"),
      logit=c("pos", "sfrac")
    ),
    paramnames=c("Beta","mu", "mud","pop", "pos", "sfrac"),
    rw.sd=rw.sd(Beta=0.02,pos=ivp(0.02),mu=0.02,mud=0.02, pop=0.02, sfrac=ivp(0.02)),
    #rw.sd=rw.sd(Beta=0.02,pos=ivp(0.02),mud=0.02),
    cooling.fraction.50=0.5
  ) -> mf1

mf1 %>%
  traces() %>%
  melt() %>%
  ggplot(aes(x=iteration,y=value,group=variable))+
  geom_line()+
  facet_wrap(~variable,scales="free_y")

  mf1 %>% coef()

  mf1 %>%
  simulate(
    nsim=20,
    format="data.frame",include.data=TRUE
  ) -> sims

Here are 20 simulations after 100 rounds of particle filtering. You can see how high the estimates for mu (natural death rate over a year) are.

sims %>%
  select(year,rep=.id,maturepopmean,positive,samplesize) %>%
  gather(var,val,-year,-rep) %>%
  ggplot(aes(x=year,y=val,group=rep,color=rep=="data"))+
  geom_line()+
  facet_wrap(~var,scales="free_y",ncol=1)

mayagroner commented 4 years ago

I should add to this that I have tried fixing mu at a more plausible value and in those cases the estimates for the disease mortality are about an order of magniture higher than I would expect based on our experimental work.

kingaa commented 4 years ago

I think your idea of using SDE makes a lot of sense in this case. For the reasons you mention, the stochasticity is not likely to be small. Rather than view this as removing demographic stochasticity, it's a question of adding environmental stochasticity. The two are certainly not mutually exclusive.

Probably the best way forward is to write SDE for the counting processes that track infections, S-deaths, and I-deaths, then to combine these into SDE for the S & I state variables. This is discussed in our short course.

A few other points:

The model you implement above is not deterministic. Witness the use of rbinom, which makes random draws from a binomial distribution. Indeed, it is not far from the Euler-multinomial implementation you had originally, with the added drawback that it is possible for some individuals to both die and become infected, or to die in two ways at once.
You are absolutely on-point about the error in the way recruitment is handled. It doesn't make any sense at all, the way I have coded it. Properly, we should put in the recruitment rate, and work out the number of recruits in an Euler step. Your suggestion is almost, but not quite right. Some thought needs to go into the choice of the distribution of recruitments.

kingaa commented 4 years ago

The issue I mentioned just above, to do with the recruitment, is the following. If we situate our dynamical system in continuous time (as we do when we use euler), then we need to describe the random number of recruits that arrive within an arbitrary time-interval. That is, if R(t₁,t₂) is the random number of recruits arriving between times t₁ and t₂, then we should insist that R(t₁,t₂) + R(t₂,t₃) = R(t₁,t₃) whenever t₁ < t₂ < t₃. The log-normal distribution we've been assuming does not have this property.

I propose therefore, that we should assume R(t₁,t₂) ~ Gamma(r(t₁),σ(t₁)), where Gamma(r,σ) indicates the Gamma distribution with mean r and s.d. σ, the latter two quantities being covariates.

library(tidyverse)
require(readr)
require(dplyr)

read_csv("PWS_disease_survey.csv") %>%
  mutate(
    msd = log(1+maturepopsd/maturepopmean),
    month=4
  ) -> meas

## plot sample prevalence
meas %>%
  ggplot(aes(
    x=year,y=positive/samplesize,
    ymin=qbeta(p=0.25,shape1=positive,shape2=negative+1),
    ymax=qbeta(p=0.75,shape1=positive+1,shape2=negative)))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs(y="prevalence")+xlim(2006, 2018)+theme_bw()

## plot mature population
meas %>%
  ggplot(aes(
    x=year,y=maturepopmean,
    ymin=qlnorm(p=0.25,meanlog=log(maturepopmean),sdlog=msd),
    ymax=qlnorm(p=0.75,meanlog=log(maturepopmean),sdlog=msd)
  ))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs("mature population size")+xlim(2006, 2018)+theme_bw()

read_csv("PWS_recruitment.csv") %>%
  filter(year>2005) %>%
  mutate(
    rvar=recruitmentsd*recruitmentsd,
    month=1
  ) %>%
  select(-recruitmentsd) %>%
  bind_rows(
    meas %>%
      select(year,month,samplesize,msd)
  ) %>%
  arrange(year,month) -> covartable

meas %>%
  select(year,month,positive,maturepopmean) -> meas

covartable %>%
  select(year,recruitment,rvar) %>%
  filter(!is.na(rvar)) %>%
  mutate(
    shape=recruitment*recruitment/rvar,
    scale=rvar/recruitment
  ) %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qgamma(p=0.05,shape=shape,scale=scale),
      ymax=qgamma(p=0.95,shape=shape,scale=scale)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+xlim(2006, 2018)+theme_bw()

covartable %>%
  select(year) %>%
  distinct() %>%
  expand_grid(month=1:12) %>%
  left_join(covartable,by=c("year","month")) %>%
  mutate(
    year=year+month/12,
    recruitment=coalesce(recruitment,0),
    rvar=coalesce(rvar,0),
    samplesize=coalesce(samplesize,0),
    msd=coalesce(msd,0)
  ) %>%
  select(-month) %>%
  arrange(year) -> covartable

library(pomp)

si_rinit <- Csnippet("
S = nearbyint(pop*sfrac);// pop is initial population size
I = nearbyint(pop*(1-sfrac));
")

si_dmeas <- Csnippet("
double lik1, lik2;
lik1 = dbinom(positive, samplesize, I/(S+I), give_log);
lik2 = dlnorm(maturepopmean, log(S+I), msd, give_log);
if (give_log)
  lik = lik1+lik2;
else
  lik = lik1*lik2;
")

si_rmeas <- Csnippet("
positive = rbinom(samplesize,I/(S+I));
maturepopmean = rlnorm(log(S+I), msd);
")

rprocSI <- Csnippet("
// recruitment rate
double scale, shape, rec;
if (recruitment > 0) {
  scale = rvar/recruitment;
  shape = recruitment/scale*dt;
  rec = rgamma(shape,scale);
} else {
  rec = 0;
}

double rate[3], trans[3];
rate[0] = Beta*I/(S+I); // force of infection
rate[1] = mu;
rate[2] = mu+mud;
reulermultinom(2,S,&rate[0],rgammawn(1,dt),&trans[0]);
reulermultinom(1,I,&rate[2],rgammawn(1,dt),&trans[2]);
S += nearbyint(rec*(1-pos)) - trans[0] - trans[1];
I += nearbyint(rec*pos) + trans[0] - trans[2];
")

meas %>%
  mutate(year=year+month/12) %>%
  select(-month) %>%
  pomp(
    time="year", t0=2006+1/12,
    rprocess=euler(rprocSI,delta.t=1/360),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    covar=covariate_table(covartable,times="year",order="constant"),
    statenames=c("S", "I"),
    paramnames=c("Beta", "pop", "pos", "mu", "mud", "sfrac")
  ) -> measSI

measSI %>%
  mif2(
    Nmif=100,
    Np=1000,
    params=c(
      Beta=0.5, pop=148387450, pos=0.2, mu=0.25, mud=0.25, sfrac=0.7
    ),
    partrans=parameter_trans(
      log=c("Beta","mu","mud"),
      logit=c("pos")
    ),
    paramnames=c("Beta","mu", "mud","pop", "pos", "sfrac"),
    rw.sd=rw.sd(Beta=0.02,pos=0.02,mu=0.02,mud=0.02),
    cooling.fraction.50=0.5
  ) -> mf1

mf1 %>%
  traces() %>%
  melt() %>%
  ggplot(aes(x=iteration,y=value,group=variable))+
  geom_line()+
  facet_wrap(~variable,scales="free_y")

mf1 %>% coef()

mf1 %>%
  simulate(
    nsim=20,
    format="data.frame",include.data=TRUE
  ) -> sims

sims %>%
  select(year,rep=.id,maturepopmean,positive,samplesize) %>%
  gather(var,val,-year,-rep) %>%
  ggplot(aes(x=year,y=val,group=rep,color=rep=="data"))+
  geom_line()+
  facet_wrap(~var,scales="free_y",ncol=1)

mayagroner commented 3 years ago

Thank you. That is working and the model is now giving realistic values for natural mortality. I see that you also included additional stochasticity in these lines:

reulermultinom(2,S,&rate[0],rgammawn(1,dt),&trans[0]);
reulermultinom(1,I,&rate[2],rgammawn(1,dt),&trans[2]);

Do you usually use sigma=1 in the rgammawn() function?

kingaa commented 3 years ago

No. Typically, there is an additional parameter (or two in this case) to estimate. I hesitated to go any farther.

mayagroner commented 3 years ago

In the measles example in part 5 of your SISMID course, you have included sigmaSE as a parameter to estimate in your process model. Would you recommend taking this approach?

dw = rgammawn ( sigmaSE , dt );

In that case rgammawn() is not included in the reulermultinom() function, and W is a variable that is tracked over time:

W += ( dw - dt )/ sigmaSE ; // standardized i . i . d . white noise

mayagroner commented 3 years ago

I have tried including extrademographic stochasticity in the natural mortality rate (mu). I think that this is probably the parameter that has the greatest variance.

rprocSI <- Csnippet("
// recruitment rate
double scale, shape, rec;
if (recruitment > 0) {
  scale = rvar/recruitment;
  shape = recruitment/scale*dt;
  rec = rgamma(shape,scale);
} else {
  rec = 0;
}

//extrademographic stochasticity in natural mortality mu
double dw = rgammawn (sigmaSE , dt );

double rate[3], trans[3];
rate[0] = Beta*I/(S+I); // force of infection
rate[1] = mu*dw/dt;
rate[2] = mu*dw/dt+mud;
reulermultinom(2,S,&rate[0],rgammawn(1,dt),&trans[0]);
reulermultinom(1,I,&rate[2],rgammawn(1,dt),&trans[2]);
S += nearbyint(rec*(1-pos)) - trans[0] - trans[1];
I += nearbyint(rec*pos) + trans[0] - trans[2];
W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
")

mayagroner commented 3 years ago

There were a few mistakes in the code above (e.g., I forgot to take out the white noise from the reulermultinom() function in the process model. With extrademographic stochasticity around 'mu', the simulations are a lot closer to the data, although the population size estimates have a huge range. (sigmaSE is estimated at ~2 in the local search of parameter space that I conducted).

library(tidyverse)
require(readr)
require(dplyr)
setwd("C:/Users/mayag/Dropbox/Ichthyophonus/Ich mort model/pomp model")

read_csv("PWS_disease_survey.csv") %>%
  mutate(
    msd = log(1+maturepopsd/maturepopmean),
    month=4
  ) -> meas

## plot sample prevalence
meas %>%
  ggplot(aes(
    x=year,y=positive/samplesize,
    ymin=qbeta(p=0.25,shape1=positive,shape2=negative+1),
    ymax=qbeta(p=0.75,shape1=positive+1,shape2=negative)))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs(y="prevalence")+xlim(2006, 2018)+theme_bw()

## plot mature population
meas %>%
  ggplot(aes(
    x=year,y=maturepopmean,
    ymin=qlnorm(p=0.25,meanlog=log(maturepopmean),sdlog=msd),
    ymax=qlnorm(p=0.75,meanlog=log(maturepopmean),sdlog=msd)
  ))+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+
  labs("mature population size")+xlim(2006, 2018)+theme_bw()

read_csv("PWS_recruitment.csv") %>%
  filter(year>2005) %>%
  mutate(
    rvar=recruitmentsd*recruitmentsd,
    month=1
  ) %>%
  select(-recruitmentsd) %>%
  bind_rows(
    meas %>%
      select(year,month,samplesize,msd)
  ) %>%
  arrange(year,month) -> covartable

meas %>%
  select(year,month,positive,maturepopmean) -> meas

covartable %>%
  select(year,recruitment,rvar) %>%
  filter(!is.na(rvar)) %>%
  mutate(
    shape=recruitment*recruitment/rvar,
    scale=rvar/recruitment
  ) %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qgamma(p=0.05,shape=shape,scale=scale),
      ymax=qgamma(p=0.95,shape=shape,scale=scale)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+xlim(2006, 2018)+theme_bw()

covartable %>%
  select(year) %>%
  distinct() %>%
  expand_grid(month=1:12) %>%
  left_join(covartable,by=c("year","month")) %>%
  mutate(
    year=year+month/12,
    recruitment=coalesce(recruitment,0),
    rvar=coalesce(rvar,0),
    samplesize=coalesce(samplesize,0),
    msd=coalesce(msd,0)
  ) %>%
  select(-month) %>%
  arrange(year) -> covartable

#meas$year<-meas$year+1/3

library(pomp)

si_rinit <- Csnippet("
S = nearbyint(pop*sfrac);// pop is initial population size
I = nearbyint(pop*(1-sfrac));
")

si_dmeas <- Csnippet("
double lik1, lik2;
lik1 = dbinom(positive, samplesize, I/(S+I), give_log);
lik2 = dlnorm(maturepopmean, log(S+I), msd, give_log);
if (give_log)
  lik = lik1+lik2;
else
  lik = lik1*lik2;
")

si_rmeas <- Csnippet("
positive = rbinom(samplesize,I/(S+I));
maturepopmean = rlnorm(log(S+I), msd);
")

rprocSI <- Csnippet("
// recruitment rate
double scale, shape, rec;
if (recruitment > 0) {
  scale = rvar/recruitment;
  shape = recruitment/scale*dt;
  rec = rgamma(shape,scale);
} else {
  rec = 0;
}

//extrademographic stochasticity in natural mortality mu
double dw = rgammawn (sigmaSE , dt );

double rate[3], trans[3];
rate[0] = Beta*I/(S+I); // force of infection
rate[1] = mu*dw/dt;
rate[2] = mu*dw/dt+mud;
reulermultinom(2,S,&rate[0], dt, &trans[0]);
reulermultinom(1,I,&rate[2], dt, &trans[2]);
S += nearbyint(rec*(1-pos)) - trans[0] - trans[1];
I += nearbyint(rec*pos) + trans[0] - trans[2];
W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
")

meas %>%
  mutate(year=year+month/12) %>%
  select(-month) %>%
  pomp(
    time="year", t0=2006+1/12,
    rprocess=euler(rprocSI,delta.t=1/360),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    covar=covariate_table(covartable,times="year",order="constant"),
    statenames=c("S", "I", "W"),
    paramnames=c("Beta", "pop", "pos", "mu", "mud", "sfrac", "sigmaSE"),
    cdir=".", cfile="measSI"
  ) -> measSI

#local search of likelihood surface
measSI %>%
  mif2(
    Nmif=50,
    Np=1000,
    params=c(
      Beta=0.326, pop=148387450, pos=0.127, mu=0.215, mud=0.203, sfrac=0.651, sigmaSE=1
    ),
    partrans=parameter_trans(
      log=c("Beta","mu","mud","sigmaSE","pop"),
      logit=c("pos", "sfrac")
    ),
    paramnames=c("Beta","mu", "mud","pop", "pos", "sfrac", "sigmaSE"),
    rw.sd=rw.sd(Beta=0.02,pos=0.02,mu=0.02,mud=0.02, sfrac=0.02, pop=0.2, sigmaSE=0.2),
    cooling.fraction.50=0.5
  ) -> mf1
mifs_local<-mf1

mf1 %>%
  traces() %>%
  melt() %>%
  ggplot(aes(x=iteration,y=value,group=variable))+
  geom_line()+
  facet_wrap(~variable,scales="free_y")

mf1 %>% coef()

mf1 %>%
  simulate(
    nsim=20,
    format="data.frame",include.data=TRUE
  ) -> sims

sims %>%
  select(year,rep=.id,maturepopmean,positive,samplesize) %>%
  gather(var,val,-year,-rep) %>%
  ggplot(aes(x=year,y=val,group=rep,color=rep=="data"))+
  geom_line()+
  facet_wrap(~var,scales="free_y",ncol=1)

mayagroner commented 3 years ago

I have tried running my model in parallel, but I am getting a similar error message as that of Issue #96. I have specified cdir=".", cfile="measSI" outside of the parallel call here:

meas %>%
  mutate(year=year+month/12) %>%
  select(-month) %>%
  pomp(
    time="year", t0=2006+1/12,
    rprocess=euler(rprocSI,delta.t=1/360),
    rinit=si_rinit,
    rmeasure=si_rmeas,
    dmeasure=si_dmeas,
    covar=covariate_table(covartable,times="year",order="constant"),
    statenames=c("S", "I", "W"),
    paramnames=c("Beta", "pop", "pos", "mu", "mud", "sfrac", "sigmaSE"),
    cdir=".", cfile="measSI"
  ) -> measSI

When I explore the local parameter space in parallel code (below), I get the following error message (note that all other relevant code is in the previous message).

foreach(i=1:4,.combine=c) %dopar% {
  library(pomp)
  library(tidyverse)
  measSI %>%
    mif2(
      Np=2000, Nmif=50,
      params=c(
        Beta=1.75, pop=148387450, pos=0.06, mu=0.16, mud=0.20, sfrac=0.66, sigmaSE=1.4
      ),
      partrans=parameter_trans(
        log=c("Beta","mu","mud", "sigmaSE"),
        logit=c("pos")
      ),
      paramnames=c("Beta","mu", "mud","pop", "pos", "sfrac", "sigmaSE"),
      rw.sd=rw.sd(Beta=0.02,pos=0.02,mu=0.02,mud=0.02, sfrac=0.02, pop=0.02, sigmaSE=.02),
      cooling.fraction.50=0.5
    )
} -> mifs_local

Error in { : 
  task 1 failed - "in 'mif2': unable to load shared object 'C:/Users/mgroner/AppData/Local/Temp/2/RtmpGGSd3v/10248/pomp_29419b0a804249a44deeaa2d34653453.dll':
  LoadLibrary failure:  The specified module could not be found.

kingaa commented 3 years ago

The inclusion of the partrans argument triggers a C snippet build. For some reason I do not yet understand, C snippet builds in parallel codes on Windows machines fail. To work around this, include the partrans argument outside of the foreach loop.

kingaa commented 3 years ago

Not seeing any questions, I am going to close this issue now. Feel free to reopen if more discussion is warranted.

mayagroner commented 3 years ago

When I do a global search of parameter space, I am getting 'illegal' parameter values despite having transformed my parameters. Based on the error, I think that the issue may be related to the addition of white noise to the mortality rate; the problem doesn't occur if I run the model without the extrademographic stochasticity, though it is also possible that the model with the ED stochasticity is just revealing an existing problem. I've searched my code repeatedly and cannot find a source for the issue. Do you know what might be causing the problem? Many thanks for your help.

Here is the error message:

Error in { : 
  task 7 failed - "in 'mif2': 'dmeasure' with log=TRUE returns illegal value.
Log likelihood, data, states, and parameters are:
         time:      2014.33
       loglik:          NaN
     positive:           48
maturepopmean:  8.08965e+07
            S:            0
            I:            0
            W:      13.0358
         Beta:     0.208696
          pop:  1.95103e+08
          pos:     0.135647
           mu:     0.368183
          mud:     0.246813
        sfrac:     0.656438
      sigmaSE:      4.89772

And the code (which utilizes the data I previously sent you):

si_dmeas <- Csnippet("
double lik1, lik2;
lik1 = dbinom(positive, samplesize, I/(S+I), give_log);
lik2 = dlnorm(maturepopmean, log(S+I), msd, give_log);
if (give_log)
  lik = lik1+lik2;
else
  lik = lik1*lik2;
")

kingaa commented 3 years ago

Looks like you have at least one realization for which S = I = 0. In this case, the population is extinct. That's incompatible with the data, so it deserves a likelihood of zero, but it gets a likelihood of NaN, which causes the error. In other words, when I = S = 0, I/(S+I) is NaN and dbinom(positive, samplesize, I/(S+I), give_log) gives NaN as well. If you trap for S+I=0 and return 0 for the likelihood (-Inf for the log likelihood), then the algorithm can proceed.

On the other hand, dlnorm(maturepopmean, log(S+I), msd, give_log) gives 0 or -Inf for the likelihood and log likelihood, respectively, so that's OK.

mayagroner commented 3 years ago

Hello Aaron. I am converting my simple SI model to an age-structured model which focuses on ages three through six plus (herring recruit to the mature population at age three, and I am lumping individuals 6 and older). I am getting NaN for the populations in my simulations. I believe this is due to the way I have modeled transitions between age classes. I would like to structure these transitions so that they happen in yearly pulses, which coincides with the yearly recruitment pulse of new age threes entering the mature population. As this is not a stochastic process (e.g., all individuals should age one year each year), I have written the code in my process model so that the transition rate is 1/dt when recruitment is >0 and 0 otherwise. I'm not sure if this might be causing my populations to go below 0 in my simulations? Do you have any suggestions for how I might code this differently? I will email you the relevant (confidential) dataset. Many thanks for your help.

library(tidyverse)
require(readr)
require(dplyr)
library (pomp)

read_csv("PWS for pomp with age structure_recruit at three.csv") %>%
  mutate(
    month=4,
  ) -> meas

## plot mature population
meas %>% tidyr::gather("id", "value", popmean_three:popmean_six) ->meas1
meas1$id<-factor(as.factor(meas1$id), levels=c("popmean_three", "popmean_four","popmean_five", "popmean_six"))
ggplot(meas1, aes(year, value, color=id))+geom_point()+geom_line()+ scale_color_manual(values=c("red", "orange", "light green", "dark green"))+ylab("herring population")

read_csv("PWS_recruitment_allthrees.csv") %>%
  filter(year>2005) %>%
  mutate(
    rvar=recruitmentsd*recruitmentsd,
    month=1
  )%>%
  select(-recruitmentsd) %>%
  bind_rows(
    meas %>%
      select(year,month,samplesize_three,samplesize_four, samplesize_five, samplesize_six, logpopsd_three, logpopsd_four, logpopsd_five, logpopsd_six)
  ) %>%
  arrange(year,month)  -> covartable

meas %>%
  select(year,month,positive_three, positive_four, positive_five, positive_six, logpopmean_three, logpopmean_four, logpopmean_five, logpopmean_six) -> meas

covartable %>%
  select(year,recruitment,rvar) %>%
  filter(!is.na(rvar)) %>%
  mutate(
    shape=recruitment*recruitment/rvar,
    scale=rvar/recruitment
  ) %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qgamma(p=0.05,shape=shape,scale=scale),
      ymax=qgamma(p=0.95,shape=shape,scale=scale)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+xlim(2006, 2018)+theme_bw()

covartable %>%
  select(year) %>%
  distinct() %>%
  expand_grid(month=1:12) %>%
  left_join(covartable,by=c("year","month")) %>%
  mutate(
    year=year+month/12,
    recruitment=coalesce(recruitment,0),
    rvar=coalesce(rvar,0),
    samplesize_three=coalesce(samplesize_three,0),
    samplesize_four=coalesce(samplesize_four,0),
    samplesize_five=coalesce(samplesize_five,0),
    samplesize_six=coalesce(samplesize_six,0),
    logpopsd_three=coalesce(logpopsd_three,0),
    logpopsd_four=coalesce(logpopsd_four,0),
    logpopsd_five=coalesce(logpopsd_five,0),
    logpopsd_six=coalesce(logpopsd_six,0)
  ) %>%
  select(-month) %>%
  arrange(year) -> covartable

covartable<-covartable[!is.na(covartable$year),]
meas<-meas[!is.na(meas$year),]

library(pomp)

si_rinit <- Csnippet("
S3 = nearbyint(pop3*sfrac3);// pop3 is initial population size for age 3, sfrac3, is proportion that are infected at T0
I3 = nearbyint(pop3*(1-sfrac3));
S4 = nearbyint(pop4*sfrac4);// pop4 is initial population size for age 4
I4 = nearbyint(pop4*(1-sfrac4));
S5 = nearbyint(pop5*sfrac5);// pop5 is initial population size for age 5
I5 = nearbyint(pop5*(1-sfrac5));
S6 = nearbyint(pop6*sfrac6);// pop6 is initial population size for age 6
I6 = nearbyint(pop6*(1-sfrac6));
W = 0;
")

si_dmeas <- Csnippet("
double lik1, lik2, lik3, lik4, lik5, lik6, lik7, lik8, Prev, Prev3, Prev4, Prev5, Prev6;

if (S3+I3+S4+I4+S5+I5+S6+I6==0)
  Prev=0;
else
  Prev = (I3+I4+I5+I6)/(S3+S4+S5+S6+I3+I4+I5+I6);

if (S3+I3==0)
  Prev3=0;
else
  Prev3 = (I3)/(S3+I3);  

if (S4+I4==0)
  Prev4=0;
else
  Prev4 = (I4)/(S4+I4);  

if (S5+I5==0)
  Prev5=0;
else
  Prev5 = (I5)/(S5+I5);  

if (S6+I6==0)
  Prev6=0;
else
  Prev6 = (I6)/(S6+I6);  

lik1 = dbinom(positive_three, samplesize_three, Prev3, give_log);
lik2 = dbinom(positive_four, samplesize_four, Prev4, give_log);
lik3 = dbinom(positive_five, samplesize_five, Prev5, give_log);
lik4 = dbinom(positive_six, samplesize_six, Prev6, give_log);
lik5 = dlnorm(logpopmean_three, log(S3+I3), logpopsd_three, give_log);
lik6 = dlnorm(logpopmean_four, log(S4+I4), logpopsd_four, give_log);
lik7 = dlnorm(logpopmean_five, log(S5+I5), logpopsd_five, give_log);
lik8 = dlnorm(logpopmean_six, log(S6+I6), logpopsd_six, give_log);

if (give_log)
  lik = lik1+lik2+lik3+lik4+lik5+lik6+lik7+lik8;
else
  lik = lik1*lik2*lik3*lik4*lik5*lik6*lik7*lik8;
")

si_rmeas <- Csnippet("
positive_three = rbinom(samplesize_three,I3/(S3+I3));
positive_four = rbinom(samplesize_four,I4/(S4+I4));
positive_five = rbinom(samplesize_five,I5/(S5+I5));
positive_six = rbinom(samplesize_six,I6/(S6+I6));
logpopmean_three = log(rlnorm(log(S3+I3), logpopsd_three));
logpopmean_four = log(rlnorm(log(S4+I4), logpopsd_four));
logpopmean_five = log(rlnorm(log(S5+I5), logpopsd_five));
logpopmean_six = log(rlnorm(log(S6+6), logpopsd_six));
")

rprocSI <- Csnippet("
// recruitment rate and timing of transition to next age group
double scale, shape, rec, Delta;
if (recruitment > 0) {
  scale = rvar/recruitment;
  shape = recruitment/scale*dt;
  rec = rgamma(shape,scale);
  Delta = 1/dt;
} else {
  rec = 0;
  Delta = 0;
}

double I, S, N;
I = I3+I4+I5+I6;
S = S3+S4+S5+S6;
N = I + S;

//extrademographic stochasticity in natural mortality mu
double dw = rgammawn (sigmaSE, dt);

double rate[3], trans[3];
rate[0] = Beta*I/N; // force of infection
rate[1] = mu*dw/dt;
rate[2] = mu*dw/dt+mud;
rate[3] = Beta*I/N; 
rate[4] = mu*dw/dt;
rate[5] = mu*dw/dt+mud;
rate[6] = Beta*I/N; 
rate[7] = mu*dw/dt;
rate[8] = mu*dw/dt+mud;
rate[9] = Beta*I/N; 
rate[10] = mu*dw/dt;
rate[11] = mu*dw/dt+mud;

reulermultinom(2,S3,&rate[0], dt, &trans[0]);
reulermultinom(1,I3,&rate[2], dt, &trans[2]);
reulermultinom(2,S4,&rate[3], dt, &trans[3]);
reulermultinom(1,I4,&rate[5], dt, &trans[5]);
reulermultinom(2,S5,&rate[6], dt, &trans[6]);
reulermultinom(1,I5,&rate[8], dt, &trans[8]);
reulermultinom(2,S6,&rate[9], dt, &trans[9]);
reulermultinom(1,I6,&rate[11], dt, &trans[11]);

S3 += nearbyint(rec*(1-pos3)) - trans[0] - trans[1] - nearbyint(Delta*S3);
I3 += nearbyint(rec*pos3)     + trans[0] - trans[2] - nearbyint(Delta*I3);
S4 += nearbyint(Delta*S3) - trans[3] - trans[4] - nearbyint(Delta*S4);
I4 += nearbyint(Delta*I3) + trans[3] - trans[5] - nearbyint(Delta*I4); 
S5 += nearbyint(Delta*S4) - trans[6] - trans[7] - nearbyint(Delta*S5);
I5 += nearbyint(Delta*I4) + trans[6] - trans[8] - nearbyint(Delta*I5);
S6 += nearbyint(Delta*S5) - trans[9] - trans[10];
I6 += nearbyint(Delta*I5) + trans[9] - trans[11];
W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
")

kingaa commented 3 years ago

Hi @mayagroner, looking over the code, the only thing that jumps out is the fact that you have not allocated enough space for the local arrays rate and trans. It looks like you need 12 elements for each: double rate[12], trans[12];. I could look into it more carefully, but would need parameter values to simulate and filter....

mayagroner commented 3 years ago

Thank you @kingaa. I made the change you suggested but I am still having the same issue. It would be helpful if you could take a more careful look. I've pasted the updated code along with the simulation starting values.

library(tidyverse)
require(readr)
require(dplyr)
library (pomp)

read_csv("PWS for pomp with age structure_recruit at three.csv") %>%
  mutate(
    month=4,
  ) -> meas

## plot mature population
meas %>% tidyr::gather("id", "value", popmean_three:popmean_six) ->meas1
meas1$id<-factor(as.factor(meas1$id), levels=c("popmean_three", "popmean_four","popmean_five", "popmean_six"))
ggplot(meas1, aes(year, value, color=id))+geom_point()+geom_line()+ scale_color_manual(values=c("red", "orange", "light green", "dark green"))+ylab("herring population")

read_csv("PWS_recruitment_allthrees.csv") %>%
  filter(year>2005) %>%
  mutate(
    rvar=recruitmentsd*recruitmentsd,
    month=1
  )%>%
  select(-recruitmentsd) %>%
  bind_rows(
    meas %>%
      select(year,month,samplesize_three,samplesize_four, samplesize_five, samplesize_six, logpopsd_three, logpopsd_four, logpopsd_five, logpopsd_six)
  ) %>%
  arrange(year,month)  -> covartable

meas %>%
  select(year,month,positive_three, positive_four, positive_five, positive_six, logpopmean_three, logpopmean_four, logpopmean_five, logpopmean_six) -> meas

covartable %>%
  select(year,recruitment,rvar) %>%
  filter(!is.na(rvar)) %>%
  mutate(
    shape=recruitment*recruitment/rvar,
    scale=rvar/recruitment
  ) %>%
  ggplot(
    aes(
      x=year,
      y=recruitment,
      ymin=qgamma(p=0.05,shape=shape,scale=scale),
      ymax=qgamma(p=0.95,shape=shape,scale=scale)
    )
  )+
  geom_ribbon(fill=gray(0.6))+
  geom_line()+
  geom_point()+xlim(2006, 2018)+theme_bw()

covartable %>%
  select(year) %>%
  distinct() %>%
  expand_grid(month=1:12) %>%
  left_join(covartable,by=c("year","month")) %>%
  mutate(
    year=year+month/12,
    recruitment=coalesce(recruitment,0),
    rvar=coalesce(rvar,0),
    samplesize_three=coalesce(samplesize_three,0),
    samplesize_four=coalesce(samplesize_four,0),
    samplesize_five=coalesce(samplesize_five,0),
    samplesize_six=coalesce(samplesize_six,0),
    logpopsd_three=coalesce(logpopsd_three,0),
    logpopsd_four=coalesce(logpopsd_four,0),
    logpopsd_five=coalesce(logpopsd_five,0),
    logpopsd_six=coalesce(logpopsd_six,0)
  ) %>%
  select(-month) %>%
  arrange(year) -> covartable

covartable<-covartable[!is.na(covartable$year),]
meas<-meas[!is.na(meas$year),]

library(pomp)

si_rinit <- Csnippet("
S3 = nearbyint(pop3*sfrac3);// pop3 is initial population size for age 3, sfrac3, is proportion that are infected at T0
I3 = nearbyint(pop3*(1-sfrac3));
S4 = nearbyint(pop4*sfrac4);// pop4 is initial population size for age 4
I4 = nearbyint(pop4*(1-sfrac4));
S5 = nearbyint(pop5*sfrac5);// pop5 is initial population size for age 5
I5 = nearbyint(pop5*(1-sfrac5));
S6 = nearbyint(pop6*sfrac6);// pop6 is initial population size for age 6
I6 = nearbyint(pop6*(1-sfrac6));
W = 0;
")

si_dmeas <- Csnippet("
double lik1, lik2, lik3, lik4, lik5, lik6, lik7, lik8, Prev, Prev3, Prev4, Prev5, Prev6;

if (S3+I3+S4+I4+S5+I5+S6+I6==0)
  Prev=0;
else
  Prev = (I3+I4+I5+I6)/(S3+S4+S5+S6+I3+I4+I5+I6);

if (S3+I3==0)
  Prev3=0;
else
  Prev3 = (I3)/(S3+I3);  

if (S4+I4==0)
  Prev4=0;
else
  Prev4 = (I4)/(S4+I4);  

if (S5+I5==0)
  Prev5=0;
else
  Prev5 = (I5)/(S5+I5);  

if (S6+I6==0)
  Prev6=0;
else
  Prev6 = (I6)/(S6+I6);  

lik1 = dbinom(positive_three, samplesize_three, Prev3, give_log);
lik2 = dbinom(positive_four, samplesize_four, Prev4, give_log);
lik3 = dbinom(positive_five, samplesize_five, Prev5, give_log);
lik4 = dbinom(positive_six, samplesize_six, Prev6, give_log);
lik5 = dlnorm(logpopmean_three, log(S3+I3), logpopsd_three, give_log);
lik6 = dlnorm(logpopmean_four, log(S4+I4), logpopsd_four, give_log);
lik7 = dlnorm(logpopmean_five, log(S5+I5), logpopsd_five, give_log);
lik8 = dlnorm(logpopmean_six, log(S6+I6), logpopsd_six, give_log);

if (give_log)
  lik = lik1+lik2+lik3+lik4+lik5+lik6+lik7+lik8;
else
  lik = lik1*lik2*lik3*lik4*lik5*lik6*lik7*lik8;
")

si_rmeas <- Csnippet("
positive_three = rbinom(samplesize_three,I3/(S3+I3));
positive_four = rbinom(samplesize_four,I4/(S4+I4));
positive_five = rbinom(samplesize_five,I5/(S5+I5));
positive_six = rbinom(samplesize_six,I6/(S6+I6));
logpopmean_three = log(rlnorm(log(S3+I3), logpopsd_three));
logpopmean_four = log(rlnorm(log(S4+I4), logpopsd_four));
logpopmean_five = log(rlnorm(log(S5+I5), logpopsd_five));
logpopmean_six = log(rlnorm(log(S6+6), logpopsd_six));
")

rprocSI <- Csnippet("
// recruitment rate and timing of transition to next age group
double scale, shape, rec, Delta;
if (recruitment > 0) {
  scale = rvar/recruitment;
  shape = recruitment/scale*dt;
  rec = rgamma(shape,scale);
  Delta = 1/dt;
} else {
  rec = 0;
  Delta = 0;
}

double I, S, N;
I = I3+I4+I5+I6;
S = S3+S4+S5+S6;
N = I + S;

//extrademographic stochasticity in natural mortality mu
double dw = rgammawn (sigmaSE, dt);

double rate[12], trans[12];
rate[0] = Beta*I/N; // force of infection
rate[1] = mu*dw/dt;
rate[2] = mu*dw/dt+mud;
rate[3] = Beta*I/N; 
rate[4] = mu*dw/dt;
rate[5] = mu*dw/dt+mud;
rate[6] = Beta*I/N; 
rate[7] = mu*dw/dt;
rate[8] = mu*dw/dt+mud;
rate[9] = Beta*I/N; 
rate[10] = mu*dw/dt;
rate[11] = mu*dw/dt+mud;

reulermultinom(2,S3,&rate[0], dt, &trans[0]);
reulermultinom(1,I3,&rate[2], dt, &trans[2]);
reulermultinom(2,S4,&rate[3], dt, &trans[3]);
reulermultinom(1,I4,&rate[5], dt, &trans[5]);
reulermultinom(2,S5,&rate[6], dt, &trans[6]);
reulermultinom(1,I5,&rate[8], dt, &trans[8]);
reulermultinom(2,S6,&rate[9], dt, &trans[9]);
reulermultinom(1,I6,&rate[11], dt, &trans[11]);

S3 += nearbyint(rec*(1-pos3)) - trans[0] - trans[1] - nearbyint(Delta*S3);
I3 += nearbyint(rec*pos3)     + trans[0] - trans[2] - nearbyint(Delta*I3);
S4 += nearbyint(Delta*S3) - trans[3] - trans[4] - nearbyint(Delta*S4);
I4 += nearbyint(Delta*I3) + trans[3] - trans[5] - nearbyint(Delta*I4); 
S5 += nearbyint(Delta*S4) - trans[6] - trans[7] - nearbyint(Delta*S5);
I5 += nearbyint(Delta*I4) + trans[6] - trans[8] - nearbyint(Delta*I5);
S6 += nearbyint(Delta*S5) - trans[9] - trans[10];
I6 += nearbyint(Delta*I5) + trans[9] - trans[11];
W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
")

measSI %>%
  simulate(
    params=c(Beta=.2, pop3=33406000, pop4=19178000, pop5=28401000, pop6=109623000, pos3=.38, mu=.4, mud=.5, sfrac3=.62, sfrac4=.62, sfrac5=.62, sfrac6=.56, sigmaSE=1),
    nsim=20,format="data.frame",include.data=TRUE
  ) -> sims

kingaa commented 3 years ago

Aren't the units of your balance equations incorrect?

S3 += nearbyint(rec*(1-pos3)) - trans[0] - trans[1] - nearbyint(Delta*S3);
I3 += nearbyint(rec*pos3)     + trans[0] - trans[2] - nearbyint(Delta*I3);
S4 += nearbyint(Delta*S3) - trans[3] - trans[4] - nearbyint(Delta*S4);
I4 += nearbyint(Delta*I3) + trans[3] - trans[5] - nearbyint(Delta*I4); 
S5 += nearbyint(Delta*S4) - trans[6] - trans[7] - nearbyint(Delta*S5);
I5 += nearbyint(Delta*I4) + trans[6] - trans[8] - nearbyint(Delta*I5);
S6 += nearbyint(Delta*S5) - trans[9] - trans[10];
I6 += nearbyint(Delta*I5) + trans[9] - trans[11];

Delta has units of 1/time, so the units in the various terms are inconsistent. I think you are trying to move everyone from one age class to another at the specified time. So Delta should be either 1 or 0 depending on the clock time.

Also, note that after the first of the above equations is executed, the value of S3 has changed, so that renders the third equation incorrect. Likewise for the second and fourth, etc.

kingaa commented 3 years ago

Also, your lognormal measurement model is incorrectly specified.

Also, you have non-integer sample sizes in your covariate table.

kingaa commented 3 years ago

Am closing this issue. Feel free to reopen if more discussion is warranted.