audrey-b commented 4 years ago

devtools::install_github("poissonconsulting/nlist")

library(nlist) library(simanalyse) library(coda)

According to the JAGS manual: "If initial values are not supplied by the user,

then each parameter chooses its own initial value based on the values of its parents.

The initial value is chosen to be a “typical value” from the prior distribution.

The exact meaning of “typical value” depends on the distribution of the stochastic node,

but is usually the mean, median, or mode. If you rely on automatic initial value generation

and are running multiple parallel chains, then the initial values will be the same in all chains.

You are advised to set the starting values manually."

Thus we need to find a way to simulate initial values that will be overdispersed so that the Gelman-Rubin

diagnostic works correctly.

For most models, it makes sense to simply use sims::sims_simulate

By default, simanalyse::sma_analyse_bayesian could set inits=NA (a user wanting to

use JAGS's default could set inits=NULL) which would use sims::sims_simulate, like so

set.seed(10)

code="for(i in 1:n){ Y[i] ~ dbern(phi) } " prior ="phi ~ dbeta(0.5, 0.5)"

data <- sims::sims_simulate(code, constants = nlist("n"=5), parameters = nlist("phi" = 0.5))[[1]]

simulate_inits <- function(){

inits <- sims::sims_simulate(paste(code, prior, sep=" \n "), parameters = nlist(n = data$n), nsims=3, #that would be n.chains latent=TRUE, stochastic=TRUE)

return(lapply(inits, unclass)) }

out <- simanalyse::sma_analyse_bayesian(data, code, prior, monitor=c("phi"), mode=sma_set_mode("report", n.chains = 3), inits = simulate_inits())

Now let's look at an occupancy model where our little solution does not work

model from

https://datacloning.org/courses/2016/madison/occupancy.pdf

set.seed(10)

code="for(i in 1:n){ #n locations Y[i] ~ dbern(phi) #whether at location i (latent, unknown) W[i] ~ dbern(Y[i] * p) #whether observed at location i (data) } " prior ="p ~ dbeta(1, 1) #probability of detection phi ~ dbeta(0.5, 0.5) #probability of presence "

data <- sims::sims_simulate(code, constants = nlist("n"=5), parameters = nlist("p"=0.5, "phi" = 0.5))[[1]]

out <- simanalyse::sma_analyse_bayesian(data, code, prior, monitor=c("p", "phi"))

JAGS is not able to simulate initial values that are compatible with the data because Y[i] has to be 1 if W[i] is 1

Let's try with our function

out <- simanalyse::sma_analyse_bayesian(data, code, prior, monitor=c("p", "phi"), inits = simulate_inits())

Our function does not solve the problem because again it does not ensure Y[i]=1 if W[i] is 1

We can simulate initial values that are compatible with the data using sims_simulate in the following way

string <- "for(i in 1:n){ #n locations Y[i] ~ dbin(phi,1)T(W[i],) #whether at location i (latent, unknown) } " inits <- sims::sims_simulate(paste(string, prior, sep=" \n "), parameters = data, nsims=3, latent=NA, stochastic=NA)

out <- simanalyse::sma_analyse_bayesian(data, code, prior, monitor=c("p", "phi"), mode=sma_set_mode("report", n.chains = 3), inits = lapply(inits, unclass))

Now it works!

Thus, we could set up the inits= argument so that it can take the following values:

NA : this is the default which simulates inits automatically using the JAGS code through sims_simulate

a string: the string is used as JAGS/R code to simulate the initial values (like in the last example, to satisfy the constraints)

NULL : uses the JAGS default

a function or list of lists, like in rjags

audrey-b commented 4 years ago

another option would be one passes a list of manual inits and an R function that examines the generated values and fixes any inconsistencies ie the user gets to modify the initial values before they are used in the analysis.

audrey-b commented 4 years ago

list(inits, inits, inits) at least makes the chains different (quick fix?)

audrey-b / simanalyse

inits not random by default #11

According to the JAGS manual: "If initial values are not supplied by the user,

then each parameter chooses its own initial value based on the values of its parents.

The initial value is chosen to be a “typical value” from the prior distribution.

The exact meaning of “typical value” depends on the distribution of the stochastic node,

but is usually the mean, median, or mode. If you rely on automatic initial value generation

and are running multiple parallel chains, then the initial values will be the same in all chains.

You are advised to set the starting values manually."

Thus we need to find a way to simulate initial values that will be overdispersed so that the Gelman-Rubin

diagnostic works correctly.

For most models, it makes sense to simply use sims::sims_simulate

By default, simanalyse::sma_analyse_bayesian could set inits=NA (a user wanting to

use JAGS's default could set inits=NULL) which would use sims::sims_simulate, like so

Now let's look at an occupancy model where our little solution does not work

model from

https://datacloning.org/courses/2016/madison/occupancy.pdf

JAGS is not able to simulate initial values that are compatible with the data because Y[i] has to be 1 if W[i] is 1

Let's try with our function

Our function does not solve the problem because again it does not ensure Y[i]=1 if W[i] is 1

We can simulate initial values that are compatible with the data using sims_simulate in the following way

Now it works!

Thus, we could set up the inits= argument so that it can take the following values:

NA : this is the default which simulates inits automatically using the JAGS code through sims_simulate

a string: the string is used as JAGS/R code to simulate the initial values (like in the last example, to satisfy the constraints)

NULL : uses the JAGS default

a function or list of lists, like in rjags