university-of-newcastle-research / pmwg

Repository for code for the Samplers Team at UoN
3 stars 3 forks source link

Error in sample.int(x, size, replace, prob) #56

Closed andreifoldes closed 2 years ago

andreifoldes commented 2 years ago

I'm trying to run the tutorial from the book "Likelihood-Free Methods for Cognitive Science" from Chapter 3 in combination with the "Signal Detection Theory analysis of lexical decision task" chapter. Managed to get through the init stage after choosing proper priors.. however when running the "burn" stage I got the following error:

Epsilon has been set to 0.5 based on number of parameters
mix has been set to c(0.5, 0.5, 0) based on the stage being run
Phase 1: Burn in
 |==                                                    |   4% | New( 26%)
Error in sample.int(x, size, replace, prob) : NA in probability vector

I got this error in the init stage as well when I used the default priors in the tutorial, there it was easier to debug and the cause seemed to be that that all the likelihoods were -Inf

To Reproduce

initMinerva=function(N,p)sample(c(-1,0,1),N,replace=T,prob=p)

minerva=function(L,crit,decay,n.features,alpha,p,n.study,n.targets,n.test){
                 study.feat=initMinerva(n.features*n.study,p) # features of study list
                 study=matrix(study.feat,n.features,n.study) # study list
                 image.feat=rbinom(n.features*n.study,1,L) # feature of image
                 image=matrix(image.feat,n.features,n.study)*study # image
                 image=rbinom(n.features*n.study,1,1-decay)*image # decay
                 targets=study[,1:n.targets] # use first study items (arbitrary
                 # calculate number of distractors, and generate them
                 n.distractors=(n.test-n.targets) # how many distractors?
                 dist.feat=initMinerva(n.features*n.distractors,p) # features
                 distractors=matrix(dist.feat,n.features,n.distractors) # set
                 test=cbind(targets,distractors) # create test set
                 S=matrix(NA,n.study,n.test) # create similarity matrix
                 for(i in 1:n.test){ # loop over test items
                   for(j in 1:n.study){ # ...and study items
                     calc.n=sum(test[,i]!=0 | image[,j]!=0) # nonzero features
                     S[j,i]=sum(test[,i]*image[,j])/calc.n # similarity18 }
                   }
                 }
                 A=S^alpha # calculate activation
                 out=apply(A,2,sum) # calculate intensity
                 as.numeric(out>crit) # compare to criterion
}

log.dens.like=function(x,data){
  L=x[1]; crit=x[2]; feat=x[3]; # redeclare parameters
  feat=round(feat) # round the number of features
  if(L<=1 & L>=0 & feat>=2){ # test parameter boundaries
    ### insert specific approximation method here
    rep=1000
    mach=matrix(NA,rep,n.test) # declare a matrix
    for(i in 1:rep){ # simulate ’rep’ times (i.e., rep=1000 here)
      # simulate data from the model with the following settings:
      # unknown parameters: L, crit, feat
      # known parameters: decay, t.alpha, p
      # known experimental variables: n.study, n.targets, n.test
      mach[i,]=minerva(L,crit,decay=0,feat,3,p=c(1,1,1)*1/3,n.study,n.targets,n.test)
      }
    # calculate the hit and false alarm rates for each simulation
    mach.hr=apply(mach[,1:n.targets],1,sum)/n.targets
    mach.fa=apply(mach[,(n.targets+1):n.test],1,sum)/(n.test-n.targets)
    S = nrow(data)
    pdf=numeric(S) # declare a storage object
    for(j in 1:S){ # loop over subjects
      # determine the joint probability of obtaining each observed
      # data point, given the distribution of simulated data
      pdf[j]=mean(mach.hr==data$hr[j] & mach.fa==data$fa[j])
      # print(pdf)
      if(is.na(pdf[j])==T)pdf[j]=0 # test to ensure no NA values
    }
    out=sum(log(unlist(pdf))) # sum up the log likelihood values
    ### producing a variable called ’out’
    if(is.na(out))out=-Inf # test for plausibility
  } else { # if boundary test fails...
    out=-Inf # reject the proposal
  }
  out # return the final log likelihood value
}

#library(tidyverse)

n.study=20
n.targets=20 
n.test=40
fab_data = data.frame(subject=c(1,2,3,4),hr=c(0.55, 0.75, 0.75, 0.60),
                  fa=c(0.05, 0, 0, 0))
#fab_data <- fab_data %>% mutate(subject = row_number())

pars = c(L=0.5,Crit=0.1,Feat=30)

log.dens.like(pars,fab_data)

## introducing the sampler

library(pmwg)

priors <- list(
  theta_mu_mean = rep(0, length(pars)),
  theta_mu_var = diag(rep(1, length(pars)))
)

sampler <- pmwgs(data = fab_data,
                 pars = pars,
                 prior = priors,
                 ll_func = log.dens.like)

start_points <- list(mu = rep(0, length.out = length(pars)),
                     sig2 = diag(rep(.01, length(pars))))

sampler <- pmwg::init(sampler, start_points$mu,start_points$sig2)

burned <- run_stage(sampler, 
                    stage = "burn",
                    iter = 1000,
                    particles = 20,
                    display_progress = TRUE)

adapted <- run_stage(burned,
                     stage = "adapt",
                     iter = 1000,
                     particles = 20)

Expected behavior Find settings to get through all stages of fitting.

Desktop (please complete the following information):

gjcooper commented 2 years ago

Hi,

Thanks for your interest in PMwG. I'm hoping to look into this over the next few days. I have spoken to a collaborator about this issue and we have some avenues to explore - related to the likelihood free methods as implemented in that book. Will let you know soon how it goes.

gjcooper commented 2 years ago

I've looked through your code and determined a couple of fixes that seem to alleviate the problem.

As you mention, choosing the appropriate priors is important. It looks in particular like the Feat parameter should not have a prior centred around 0. I used 30 (as in your test call to log.dens.like and that helped.)

priors <- list(
  theta_mu_mean = c(0, 0, 30),
  theta_mu_var = diag(rep(1, length(pars)))
)

Additionally the start points used for initialising the sampler should also not be centred around 0 for the Feat parameter. Ideally if left out the start points should be sampled from the prior, however there is a bug in the code for PMwG that I am working on a fix for that means this is not the case. (See Issue #58)

For your use case it looks like setting the start points manually to be sampled from the prior helps initialisation. This would look like this:

start_points <- list(mu = rnorm(n = 3,
                                mean = priors$theta_mu_mean,
                                sd=diag(priors$theta_mu_var)),
                     sig2 = diag(rep(.01, length(pars))))

(An additional note, when creating the sampler you should specify the pars argument as a vector of parameter names, not a named vector of values, as in the following section. This gives the appropriate dimensions of the samples the correct labels).

sampler <- pmwgs(data = fab_data,
                 pars = c('L', 'Crit', 'Feat'),
                 prior = priors,
                 ll_func = log.dens.like)

At this point there are still some issues with initialisation in my tests. From what I can see the problem arises because of the values returned from the log likelihood function when implausible values or parameter values outside the boundary are detected. That is the returned -Inf values are causing Nan's to be generated occasionally. Specifically if all subjects returned log-likelihoods are -Inf a line in the init function (weight <- exp(lw - max(lw))) turns them into a vector of NaN's. A similar step happens in sampling that could cause an error.

I would propose the following changes. In the end of your log.dens.like function, instead of returning -Inf values, return the log of a very small likelihood. That is instead of these lines:

    ...
    out=sum(log(unlist(pdf))) # sum up the log likelihood values
    ### producing a variable called ’out’
    if(is.na(out))out=-Inf # test for plausibility
  } else { # if boundary test fails...
    out=-Inf # reject the proposal
  }
  out # return the final log likelihood value
}

I would change this to:

```r
    out <- unlist(pdf)
    bad <- (out < 1e-10) | (!is.finite(out))
    out[bad] <- 1e-10
    out <- sum(log(out))
  } else {
    out <- log(1e-10)
  }

So set likelihoods to be a minimum of 1e-10, then sum the logs of the likelihoods. This seems to get the sampler running - I successfully ran the burn in stage through 100 iterations with these changes.