Closed NafisNr closed 4 years ago
It is eminently possible to fit a model of the kind you allude to.
However, I'm not sure what you are looking for. What do you mean by "defining weight for each point"?
Also, I don't understand why you specify both 'rprocess' and 'skeleton'. Since your process model is deterministic, only 'skeleton' is appropriate.
Thanks very much Dr. King for the reply. Sorry for not being clear. By weight, I mean having time varying probability in the dmeasure and rmeasure functions. In this example, I used the proportion of infected cases as the probability of success in dbinom and rbinom. Since cohort study outcome is not a representative of the population of the study area, I am wondering how I can contribute the covariates to the model? I appreciate any hint/advice.
As you mentioned, the "rprocess" for this example is redundant since it's a deterministic model. I forgot to remove it here from the code. Sorry about that. I edited the code.
dmeasure and rmeasure can depend on time and on covariates. Have you tried simply referencing the covariates in these C snippets?
Thanks for the hint.
Yes, my covar data, called "pop" here, is the number of individuals tested per time. I used it in the dmeasure and rmeasure.
I have also tried the followings as an alternative option which gives me a good fit. I defined the probability of success as the ratio of number tested per time over the maximum number tested multiplied by reporting probability (eps1):
dmeas3 <- Csnippet(" lik = dbinom(Infected,nearbyint(I),(pop/max(pop))*eps1,give_log); ")
rmeas3 <- Csnippet("
Infected = rbinom(nearbyint(I),(pop/max(pop))*eps1);
")
I am wondering if using such probabilities are correct? Thanks again.
I don't know what you mean by "correct". Perhaps it is better to ask to what model these code snippets correspond. Your meas3
snippets correspond to a model in which every infected person, independently, has a chance eps1*pop/max(pop)
of being recorded in the data as infected. Is that what you want to assume?
More generally, you really just haven't been clear in explaining (1) what your model is, (2) what your question is, and (3) what your data are.
My apologies. Hopefully it is clear this time.
I'm trying to fit an SIRS model to Campylobacter to understand the dynamics of this enteric pathogen.
Data is from a cohort study in Mirpur area, Bangladesh were in total, 265 children's stool samples were collected on a monthly scale from when they were 7 days old until they turned 2 years old. This study was conducted from 2010 to 2014. The plot below shows the number of individuals tested per time as well as number of positive Campylobacter cases.
My question is how to fit a pomp model to a cohort study data, specifically in terms of covariates? I have tried using cohort population (which is number of individuals tested per time, yellow line) as covariates in my model, but the model fit is bad. I also tried using annual Dhaka population, the city Mirpur is located in, and using smoothing spline to have a monthly scale population, but again the model fit is bad. The last thing I tried is fixing the population to Mirpur area population, which I have added the code above.
Any advice is appreciated.
What is your scientific question? Are you trying to learn about the circulation of the pathogen in the general population? Are you wanting to know the trajectories of the prevalence or perhaps to infer the relationship between prevalence and force of infection?
Then there are questions about the relationship between the data and the question: What is the relationship between the cohort of children and the circulation of the pathogen? Are the children seen as a representative sample of the population, as a set of sentinels, as a population in themselves, or something else?
My scientific question is how malnutrition affects the dynamics of the pathogen?
In this study, the nutrition status of children was also recorded on monthly scale. As the first step, I was trying to fit the model to only the enteric pathogen data without accounting for the nutrition status of children.
The cohort study was conducted in resource-constrained areas (8 countries across 3 continents) and included both urban and rural communities within populations known to have high rates of malnutrition and enteric infections. Each site defined a catchment area where it was estimated that >200 infants (the target number of children to be enrolled per site) would be born within the enrollment period lasting 2 years. I chose one of the sites for my work.
Thank you for your time.
Interesting! So, I am guessing that you want to understand what the risk of infection was among these children, so that you can, ultimately, see whether nutritional status affected their risk, no?
If this is correct, then it is not clear that it makes sense to think about the force of infection as a dynamic process. In particular, if transmission is driven by infections within a large population, and these children are only a small part of that population, the data from the children alone will not contain very much information about the transmission dynamics.
Thank you very much for your reply. Yes, that's exactly what I want to understand. Campylobacter main mode of transmission is through consuming contaminated food or water, or raw undercooked poultry. However as you mentioned, since the population of catchment area for this site is about 100,000, data from 200 children may not give me enough information.
Thanks again for your time and your help.
Yes, so it seems that pomp is more than you really want. Why not attempt to estimate the force of infection (either as a constant, or time-varying), then look at the differential risk of infection with nutrition status. Perhaps a Cox proportional hazards model would be what you want. In any case, it seems pomp is overkill in this situation.
I'm trying to fit a deterministic pomp model to a cohort study data, where individuals were kept track of on a monthly scale.
Plot below shows the number of individuals tested on a monthly scale and also number infected with an enteric pathogen.
I am wondering if it is possible to fit a pomp model to such data by defining weight for each point in the measurement model?
I appreciate your advice. Here is my R code:
rinit <- Csnippet(" double m = N/(S_0+I_0+R_0); S = nearbyint(m S_0); I = nearbyint(m I_0); R = nearbyint(m * R_0); ")
dmeas <- Csnippet(" lik = dbinom(Infected , I , Infec/pop, give_log); ")
rmeas <- Csnippet(" Infected = rbinom(I , Infec/pop); ")
toEst <- Csnippet(" T_mu = log(mu); T_iota = log(iota); T_gamma = log(gamma); T_eta = logit(eta); T_R0= log(R0); T_S_0 = logit(S_0); T_I_0 = logit(I_0); T_R_0 = logit(R_0); ")
fromEst <- Csnippet(" mu = exp(T_mu); iota = exp(T_iota); gamma = exp(T_gamma); eta = expit(T_eta); R0 = exp(T_R0); S_0 = expit(T_S_0); I_0 = expit(T_I_0); R_0 = expit(T_R_0); ")
--------------- A continuous-time deterministic skeleton ---------------
skel <- Csnippet(" double beta, lambda;
beta = R0 (mu+gamma); lambda = (betaI/N)+iota;
----pomp-construction---------------------------------------------------
data %>% pomp(t0=with(data[-1,],2*time[1]-time[2]), time="time", partrans=parameter_trans(toEst,fromEst), dmeasure=dmeas, rmeasure=rmeas, rinit=rinit, covar=covariate_table(covar,times="time"), skeleton=vectorfield(skel), statenames=c("S","I","R"), paramnames=c("gamma","R0","eta","S_0","I_0","R_0","iota","mu","N") ) -> Mln