Closed andreifoldes closed 2 years ago
Hi,
Thanks for your interest in PMwG. I'm hoping to look into this over the next few days. I have spoken to a collaborator about this issue and we have some avenues to explore, related to the likelihood-free methods as implemented in that book. I will let you know soon how it goes.
I've looked through your code and determined a couple of fixes that seem to alleviate the problem.
As you mention, choosing appropriate priors is important. In particular, it looks like the Feat parameter should not have a prior centred around 0. I used 30 (as in your test call to log.dens.like) and that helped:
```r
priors <- list(
  theta_mu_mean = c(0, 0, 30),
  theta_mu_var = diag(rep(1, length(pars)))
)
```
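As a rough numeric illustration (the unit standard deviation here is assumed for the example), a normal prior centred at 0 places essentially no mass near a plausible Feat value of 30, so proposals in that region are effectively never generated:

```r
# Density of Feat = 30 under the two candidate priors (sd = 1 assumed):
dnorm(30, mean = 0, sd = 1)   # ~1.6e-196, effectively zero
dnorm(30, mean = 30, sd = 1)  # ~0.40, well supported
```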
Additionally, the start points used to initialise the sampler should not be centred around 0 for the Feat parameter either. Ideally, if left unspecified, the start points would be sampled from the prior; however, there is a bug in the PMwG code, which I am working on a fix for, that means this is not currently the case (see issue #58).
For your use case it looks like manually setting the start points to values sampled from the prior helps initialisation. That would look like this:
```r
start_points <- list(
  mu = rnorm(n = 3,
             mean = priors$theta_mu_mean,
             # diagonal variances are all 1 here, so they equal the sds
             sd = diag(priors$theta_mu_var)),
  sig2 = diag(rep(.01, length(pars)))
)
```
(An additional note: when creating the sampler, specify the pars argument as a vector of parameter names, not a named vector of values, as in the following snippet. This gives the dimensions of the sample arrays the correct labels.)
```r
sampler <- pmwgs(data = fab_data,
                 pars = c('L', 'Crit', 'Feat'),
                 prior = priors,
                 ll_func = log.dens.like)
```
At this point there are still some issues with initialisation in my tests. From what I can see, the problem arises from the values returned by the log-likelihood function when implausible values, or parameter values outside the boundary, are detected: the returned -Inf values occasionally cause NaNs to be generated. Specifically, if all of the returned log-likelihoods for a subject are -Inf, a line in the init function (`weight <- exp(lw - max(lw))`) turns them into a vector of NaNs, since -Inf minus -Inf is NaN. A similar step happens during sampling that could cause the same error.
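A minimal standalone reproduction of that NaN mechanism (not taken from the PMwG source) is:

```r
# If every particle's log-likelihood is -Inf, normalising by the maximum
# computes -Inf - (-Inf), which is NaN, so every weight becomes NaN.
lw <- c(-Inf, -Inf, -Inf)
weight <- exp(lw - max(lw))
print(weight)  # NaN NaN NaN
```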
I would propose the following change. At the end of your log.dens.like function, instead of returning -Inf, return the log of a very small likelihood. That is, instead of these lines:
```r
...
out = sum(log(unlist(pdf)))  # sum up the log likelihood values,
                             # producing a variable called 'out'
if (is.na(out)) out = -Inf   # test for plausibility
} else {                     # if boundary test fails...
  out = -Inf                 # reject the proposal
}
out  # return the final log likelihood value
}
```
I would change this to:
```r
out <- unlist(pdf)                        # per-trial likelihoods
bad <- (out < 1e-10) | (!is.finite(out))  # zeros, negatives, NA/NaN/Inf
out[bad] <- 1e-10                         # floor at a very small likelihood
out <- sum(log(out))
} else {
  out <- log(1e-10)
}
```
So floor the likelihoods at a minimum of 1e-10, then sum their logs. This seems to get the sampler running: I successfully ran the burn-in stage for 100 iterations with these changes.
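The effect of the floor can be seen in a small standalone sketch (the likelihood values here are made up for illustration):

```r
pdf <- c(0, 1e-300, 0.2)     # hypothetical per-trial likelihoods
sum(log(pdf))                # -Inf: a single zero poisons the whole sum
floored <- pmax(pdf, 1e-10)  # floor the tiny/zero values at 1e-10
sum(log(floored))            # finite, heavily penalised but usable
```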
I'm trying to run the tutorial from Chapter 3 of the book "Likelihood-Free Methods for Cognitive Science", in combination with the "Signal Detection Theory analysis of lexical decision task" chapter. I managed to get through the init stage after choosing proper priors; however, when running the "burn" stage I got the following error:
I got this error in the init stage as well when I used the default priors from the tutorial; there it was easier to debug, and the cause seemed to be that all the likelihoods were -Inf.
Expected behavior: find settings to get through all stages of fitting.