florianhartig / BayesianTools

General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics
https://cran.r-project.org/web/packages/BayesianTools/index.html
119 stars 29 forks source link

Numeric instability in the Twalk #67

Open florianhartig opened 7 years ago

florianhartig commented 7 years ago

There is a numeric problem in the T-walk. For the simple test densities this did not occur, but for the VSEM, if I run the following example, the algorithm jumps sometimes into areas of the posterior space that are practically impossible in terms of the log posterior differences.

I suspect that there is somewhere some under/overflow issue in the jump probabilities, maybe in the g function, but I didn't see an obvious problem so far.

library(BayesianTools)

PAR <- VSEMcreatePAR(1:1000)
refPars <- VSEMgetDefaults()
refPars[12,] <- c(2, 0.1, 4) 
rownames(refPars)[12] <- "error-sd"
head(refPars)
referenceData <- VSEM(refPars$best[1:11], PAR) 
referenceData[,1] = 1000 * referenceData[,1] 

obs <- referenceData + rnorm(length(referenceData), sd = refPars$best[12])

parSel = c(1:6, 12)
likelihood <- function(par, sum = TRUE){
  x = refPars$best
  x[parSel] = par
  predicted <- VSEM(x[1:11], PAR) # replace here VSEM with your model 
  predicted[,1] = 1000 * predicted[,1] # this is just rescaling
  diff <- c(predicted[,1:4] - obs[,1:4]) # difference betweeno observed and 
  llValues <- dnorm(diff, sd = x[12], log = TRUE)  
  if (sum == FALSE) return(llValues)
  else return(sum(llValues))
}

bayesianSetup <- createBayesianSetup(likelihood, lower = refPars$lower[parSel],                 upper = refPars$upper[parSel], names = rownames(refPars)[parSel])

settings <- list(iterations = 200000, nrChains = 1)

out <- runMCMC(bayesianSetup = bayesianSetup, sampler = "Twalk", settings = settings)

plot(out, parametersOnly = F, which = c(1:7,9), start = 2000)

x = getSample(out, parametersOnly = F, which = c(1:7,9), start = 2000)

low = which(x[,8] < -9000)

apply(x[low,],2,mean)

likelihood(x[low[1],1:7])
x[low[1],8]
florianhartig commented 6 years ago

This seems to be more complicated, added a warning in https://github.com/florianhartig/BayesianTools/commit/1bb636af652129ce7276c2fb244b059fdddf7abf

MaximilianPi commented 6 years ago

I checked the Rtwalk package and the same problem occurs there, maybe it would be better/faster to rewrite the whole sampler

florianhartig commented 6 years ago

is our code comparable in structure to this package? Because when I remember correctly, @stefan-paul had implemented it differently than the code from the publication ...

MaximilianPi commented 6 years ago

No, it's similar implemented...

florianhartig commented 5 years ago

A colleague told me that he had the feeling the https://cran.r-project.org/web/packages/Rtwalk/index.html package works - could you try out if the sampler in the T-walk package works where ours fails?

florianhartig commented 5 years ago

Sorry, I just saw that Max had already tried this - but maybe try if this is still so. Also, we should maybe tell the guys that they have a problem in their package!

florianhartig commented 5 years ago

Did anyone check the sampler in the original paper? If I remember correctly, there was code included with the paper!

florianhartig commented 5 years ago

@JohannesOberpriller - when I looked at the code, I was relatively sure the problem is in the Gfun (if I remember correctly, or one of the helper functions) - the weird jumps occur when one of these functions creates very high values. Didn't look like an overflow though, seemed the values were realistic!