Closed: kschulzke closed this issue 8 years ago.
Hi jedijd,
Sorry, I'm not going to plough through all the code you posted to look for your problem. I hope you didn't expect me to.
You appear to be sourcing code from John Kruschke's web site. The code on my github repo is NOT the same and mixing the two will yield grief.
If you want to use my code on GitHub, please just install the BEST package from CRAN and use that. Do check the help pages: if you run ?plotAll, you'll see that the first argument should be the output from BESTmcmc. Your line postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) ) is wrong.
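Something like this is the shape of a correct call sequence (an untested sketch with made-up data; it assumes the CRAN BEST package and JAGS are installed):

library(BEST)
y1 <- rnorm(30, mean=100, sd=15)        # made-up data, group 1
y2 <- rnorm(30, mean=105, sd=18)        # made-up data, group 2
BESTout <- BESTmcmc(y1, y2)             # the data vectors go to BESTmcmc ...
plotAll(BESTout, ROPEeff=c(-0.1, 0.1))  # ... and its output goes to plotAll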
HTH, Mike
On 15.11.15 05:49, jedijd wrote:
Mike,
My compilation of the various functions in your BEST build ("MODIFIED 6/29/15 MM BestMCMC w/parallel processing") appears below, followed by my own BEST script. When I run these under the latest R build, it crashes right after BESTmcmc completes, with this message:
postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) )
Error in plotAll(y1, y2, mcmcChain, ROPEeff = c(-0.1, 0.1)) :
  BESTobj is not a valid BEST object
John K. Kruschke & Mike Meredith
johnkruschke@gmail.com
http://www.indiana.edu/~kruschke/BEST/
#
This program is believed to be free of errors, but it comes with no guarantee!
The user bears all responsibility for interpreting the results.
Please check the webpage above for updates or corrections.
#
***************************************************************
******** SEE FILE BESTgammaExample.R FOR INSTRUCTIONS *********
***************************************************************
source("openGraphSaveGraph.R") # graphics functions for Windows, Mac OS, Linux
=================================================================================
Function for shape and rate parameters of gamma. From DBDA2E-utilities.R; see
p. 238 of "Doing Bayesian Data Analysis" Second Edition,
https://sites.google.com/site/doingbayesiandataanalysis/
Modified by MM to accept mode/mean and sd as vectors
Not exported.
gammaShRaFromModeSD = function( mode , sd ) {
  if ( any(mode <= 0) )
    stop("mode of gamma prior must be > 0")
  if ( any(sd <= 0) )
    stop("sd of gamma prior must be > 0")
  rate = ( mode + sqrt( mode^2 + 4 * sd^2 ) ) / ( 2 * sd^2 )
  shape = 1 + mode * rate
  return( list( shape=shape , rate=rate ) )
}
gammaShRaFromMeanSD = function( mean , sd ) {
  if ( any(mean <= 0) )
    stop("mean of gamma prior must be > 0")
  if ( any(sd <= 0) )
    stop("sd of gamma prior must be > 0")
  shape = mean^2/sd^2
  rate = mean/sd^2
  return( list( shape=shape , rate=rate ) )
}
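As a quick sanity check of these conversions (a sketch added for illustration, not part of the original file), the returned shape and rate should reproduce the requested mode or mean and SD:

pr <- gammaShRaFromModeSD(mode=10, sd=5)
(pr$shape - 1) / pr$rate    # mode of a gamma is (shape-1)/rate, so ~10
sqrt(pr$shape) / pr$rate    # SD of a gamma is sqrt(shape)/rate, so ~5
pr2 <- gammaShRaFromMeanSD(mean=29, sd=29)
pr2$shape / pr2$rate        # mean of a gamma is shape/rate, so ~29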
Parallel processing version using jagsUI::jags.basic
BESTmcmc <- function( y1, y2=NULL, priors=NULL, doPriorsOnly=FALSE, numSavedSteps=1e5, thinSteps=1, burnInSteps = 1000, verbose=TRUE, rnd.seed=NULL, parallel=NULL) {
This function generates an MCMC sample from the posterior distribution.
y1, y2 the data vectors; y2=NULL if only one group.
priors is a list specifying priors to use.
verbose=FALSE suppresses output to the R Console.
rnd.seed is passed to each of the chains, with a different pseudorandom
number generator for each.
Returns a data frame, not a matrix, with class 'BEST',
with attributes Rhat, n.eff, a list with the original data, and the
priors.
------------------------------------------------------------------------------
  if(doPriorsOnly && verbose)
    cat("Warning: The output shows the prior distributions, NOT the posterior distributions for your data.\n")
  # Parallel processing check
  nCores <- detectCores()
  if(!is.null(parallel) && parallel && nCores < 4) {
    if(verbose)
      warning("Not enough cores for parallel processing, running chains sequentially.")
    parallel <- FALSE
  }
  if(is.null(parallel))
    parallel <- nCores > 3
  # Data checks
  y <- c( y1 , y2 ) # combine data into one vector
  if(!all(is.finite(y)))
    stop("The input data include NA or Inf.")
  if(length(unique(y)) < 2 &&   # sd(y) will be 0 or NA; ok if priors specified.
      (is.null(priors) || is.null(priors$muSD) ||
        is.null(priors$sigmaMode) || is.null(priors$sigmaSD)))
    stop("If priors are not specified, data must include at least 2 (non-equal) values.")
  # Prior checks:
  if(!is.null(priors)) {
    if(!is.list(priors)) {
      if(is.numeric(priors)) {
        stop("'priors' is now the 3rd argument; it must be a list (or NULL).")
      } else {
        stop("'priors' must be a list (or NULL).")
      }
    }
    nameOK <- names(priors) %in%
        c("muM", "muSD", "sigmaMode", "sigmaSD", "nuMean", "nuSD")
    if(!all(nameOK))
      stop("Invalid items in prior specification: ",
           paste(sQuote(names(priors)[!nameOK]), collapse=", "))
    if(!all(sapply(priors, is.numeric)))
      stop("All items in 'priors' must be numeric.")
    if(!is.null(priors$muSD) && priors$muSD <= 0)
      stop("muSD must be > 0")
  }

  if(is.null(rnd.seed))
    rnd.seed <- floor(runif(1,1,10000))

  # THE PRIORS
  if(is.null(priors)) {  # use the old prior specification
    dataForJAGS <- list(
      muM = mean(y) ,
      muP = 0.000001 * 1/sd(y)^2 ,
      sigmaLow = sd(y) / 1000 ,
      sigmaHigh = sd(y) * 1000 )
  } else {  # use gamma priors
    priors0 <- list(   # default priors
      muM = mean(y) ,
      muSD = sd(y)*5 ,
      sigmaMode = sd(y),
      sigmaSD = sd(y)*5,
      nuMean = 29,
      nuSD = 29 )
    priors0 <- modifyList(priors0, priors)  # user's priors take prior-ity (duh!!)
    # Convert to Shape/Rate
    sigmaShRa <- gammaShRaFromModeSD(mode=priors0$sigmaMode, sd=priors0$sigmaSD)
    nuShRa <- gammaShRaFromMeanSD(mean=priors0$nuMean, sd=priors0$nuSD)
    dataForJAGS <- list(
      muM = priors0$muM,
      muP = 1/priors0$muSD^2,  # convert SD to precision
      Sh = sigmaShRa$shape,
      Ra = sigmaShRa$rate)
    if(!is.null(y2)) {  # all the above must be vectors of length 2
      fixPrior <- function(x) {
        if(length(x) < 2)
          x <- rep(x, 2)
        return(x)
      }
      dataForJAGS <- lapply(dataForJAGS, fixPrior)
    }
    dataForJAGS$ShNu <- nuShRa$shape
    dataForJAGS$RaNu <- nuShRa$rate
  }

  # THE MODEL.
  modelFile <- file.path(tempdir(), "BESTmodel.txt")
  if(is.null(priors)) {  # use old broad priors
    if(is.null(y2)) {
      modelString = "
      model {
        for ( i in 1:Ntotal ) {
          y[i] ~ dt( mu , tau , nu )
        }
        mu ~ dnorm( muM , muP )
        tau <- 1/pow( sigma , 2 )
        sigma ~ dunif( sigmaLow , sigmaHigh )
        nu <- nuMinusOne+1
        nuMinusOne ~ dexp(1/29)
      }
      " # close quote for modelString
    } else {
      modelString <- "
      model {
        for ( i in 1:Ntotal ) {
          y[i] ~ dt( mu[x[i]] , tau[x[i]] , nu )
        }
        for ( j in 1:2 ) {
          mu[j] ~ dnorm( muM , muP )
          tau[j] <- 1/pow( sigma[j] , 2 )
          sigma[j] ~ dunif( sigmaLow , sigmaHigh )
        }
        nu <- nuMinusOne+1
        nuMinusOne ~ dexp(1/29)
      }
      " # close quote for modelString
    }
  } else {  # use gamma priors
    if(is.null(y2)) {
      modelString = "
      model {
        for ( i in 1:Ntotal ) {
          y[i] ~ dt( mu , tau , nu )
        }
        mu ~ dnorm( muM[1] , muP[1] )
        tau <- 1/pow( sigma , 2 )
        sigma ~ dgamma( Sh[1] , Ra[1] )
        nu <- nuMinusOne+1
        nuMinusOne ~ dgamma( ShNu , RaNu ) # prior for nu
      }
      " # close quote for modelString
    } else {
      modelString <- "
      model {
        for ( i in 1:Ntotal ) {
          y[i] ~ dt( mu[x[i]] , tau[x[i]] , nu )
        }
        for ( j in 1:2 ) {
          mu[j] ~ dnorm( muM[j] , muP[j] )
          tau[j] <- 1/pow( sigma[j] , 2 )
          sigma[j] ~ dgamma( Sh[j] , Ra[j] )
        }
        nu <- nuMinusOne+1
        nuMinusOne ~ dgamma( ShNu , RaNu ) # prior for nu
      }
      " # close quote for modelString
    }
  }
  # Write out modelString to a text file
  writeLines( modelString , con=modelFile )
#------------------------------------------------------------------------------
  # THE DATA.
  # dataForJAGS already has the priors, add the data:
  if(!doPriorsOnly)
    dataForJAGS$y <- y
  dataForJAGS$Ntotal <- length(y)
  if(!is.null(y2))  # create group membership code
    dataForJAGS$x <- c( rep(1,length(y1)) , rep(2,length(y2)) )
#------------------------------------------------------------------------------
  # INITIALIZE THE CHAINS.
  # Initial values of MCMC chains based on data:
  if(is.null(y2)) {
    mu = mean(y1)
    sigma = sd(y1)
  } else {
    mu = c( mean(y1) , mean(y2) )
    sigma = c( sd(y1) , sd(y2) )
  }
  # Regarding initial values in next line: (1) sigma will tend to be too big if
  # the data have outliers, and (2) nu starts at 5 as a moderate value. These
  # initial values keep the burn-in period moderate.
  initsList0 <- list(mu=mu, sigma=sigma, nuMinusOne=4, .RNG.seed=rnd.seed)
  initsList <- list(
    c(initsList0, .RNG.name="base::Wichmann-Hill"),
    c(initsList0, .RNG.name="base::Marsaglia-Multicarry"),
    c(initsList0, .RNG.name="base::Super-Duper") )
#------------------------------------------------------------------------------
  # RUN THE CHAINS
  codaSamples <- jags.basic(
    data = dataForJAGS,
    inits = initsList,
    parameters.to.save = c( "mu" , "sigma" , "nu" ), # The parameters to be monitored
    model.file = modelFile,
    n.chains = 3,    # Do not change this without also changing inits.
    n.adapt = 500,
    n.iter = ceiling( ( numSavedSteps * thinSteps) / 3 + burnInSteps ),
    n.burnin = burnInSteps,
    n.thin = thinSteps,
    modules = NULL,
    parallel = parallel,
    DIC = FALSE,
    seed = rnd.seed,
    verbose = verbose)
#------------------------------------------------------------------------------
  mcmcChain = as.matrix( codaSamples )
  if(dim(mcmcChain)[2] == 5 &&
      all(colnames(mcmcChain) == c("mu[1]", "mu[2]", "nu", "sigma[1]", "sigma[2]")))
    colnames(mcmcChain) <- c("mu1", "mu2", "nu", "sigma1", "sigma2")
  mcmcDF <- as.data.frame(mcmcChain)
  class(mcmcDF) <- c("BEST", class(mcmcDF))
  attr(mcmcDF, "call") <- match.call()
  attr(mcmcDF, "Rhat") <- gelman.diag(codaSamples)$psrf[, 1]
  attr(mcmcDF, "n.eff") <- effectiveSize(codaSamples)
  attr(mcmcDF, "data") <- list(y1 = y1, y2 = y2)
  attr(mcmcDF, "doPriorsOnly") <- doPriorsOnly
  if(!is.null(priors))
    attr(mcmcDF, "priors") <- priors0
  return( mcmcDF )
} # end function BESTmcmc
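A minimal usage sketch for the function above (not part of the original file; it assumes JAGS is installed and that jagsUI, coda and parallel are loaded, since the function calls jags.basic, gelman.diag, effectiveSize and detectCores):

library(parallel); library(coda); library(jagsUI)
y1 <- rnorm(40, 100, 15)   # made-up data
y2 <- rnorm(40, 105, 18)
BESTout <- BESTmcmc(y1, y2, numSavedSteps=10000, verbose=FALSE)
class(BESTout)             # should be "BEST" "data.frame"
attr(BESTout, "Rhat")      # convergence check: values near 1 are good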
==============================================================================
summary.BEST <- function(object, credMass=0.95, ROPEm=NULL, ROPEsd=NULL, ROPEeff=NULL, compValm=0, compValsd=NULL, compValeff=0, ...) {
Produces summary stats for a BEST object.
Should do the same set of stats as plotAll and use the same syntax
as far as possible.
  # Sanity checks:
  if(!inherits(object, "data.frame"))
    stop("object is not a valid BEST object")
  #mcmcChain <- as.matrix(object)
  if(ncol(object) == 3) {
    oneGrp <- TRUE
    nparam <- 5
  } else {
    if(ncol(object) != 5)
      stop("object is not a valid BEST object.")
    oneGrp <- FALSE
    nparam <- 9
  }

  # Define matrix for storing summary info:
  summaryInfo = matrix(NA, nrow=nparam, ncol=11)
  if(oneGrp) {
    rownames(summaryInfo) <- c("mu", "sigma", "nu", "log10nu", "effSz")
  } else {
    rownames(summaryInfo) <- c("mu1", "mu2", "muDiff",
                               "sigma1", "sigma2", "sigmaDiff",
                               "nu", "log10nu", "effSz")
  }
  colnames(summaryInfo) <- c("mean","median","mode",
                             "HDI%","HDIlo","HDIup",
                             "compVal","%>compVal",
                             "ROPElow","ROPEhigh","%InROPE")

  if(oneGrp) {
    # Deal with 1-group case:
    summaryInfo["mu", ] <- sumPost(object$mu, credMass=credMass,
                                   compVal=compValm, ROPE=ROPEm)
    summaryInfo["sigma", ] = sumPost(object$sigma, credMass=credMass,
                                     compVal=compValsd, ROPE=ROPEsd)
    mu0 <- if(is.null(compValm)) 0 else compValm
    effectSize <- (object$mu - mu0) / object$sigma
    summaryInfo["effSz", ] = sumPost(effectSize, credMass=credMass,
                                     compVal=compValeff, ROPE=ROPEeff)
  } else {
    summaryInfo["mu1", ] <- sumPost(object$mu1, credMass=credMass)
    summaryInfo["mu2", ] <- sumPost(object$mu2, credMass=credMass)
    summaryInfo["muDiff", ] <- sumPost(object$mu1 - object$mu2, credMass=credMass,
                                       compVal=compValm, ROPE=ROPEm)
    summaryInfo["sigma1", ] <- sumPost(object$sigma1, credMass=credMass)
    summaryInfo["sigma2", ] <- sumPost(object$sigma2, credMass=credMass)
    if(is.null(compValsd))
      compValsd <- 0
    summaryInfo["sigmaDiff", ] <- sumPost(object$sigma1 - object$sigma2,
                                          credMass=credMass, compVal=compValsd, ROPE=ROPEsd)
    effSzChain = ((object$mu1 - object$mu2) /
                    sqrt((object$sigma1^2 + object$sigma2^2) / 2))
    summaryInfo["effSz", ] = sumPost(effSzChain, credMass=credMass,
                                     compVal=compValeff, ROPE=ROPEeff)
    # This does not use sample-size weighted version of effect size:
    # N1 = length(y1)
    # N2 = length(y2)
    # effSz = (mu1 - mu2) / sqrt((sigma1^2 *(N1-1) + sigma2^2 *(N2-1))
    #                            / (N1+N2-2))
  }
  # Deal with nu:
  summaryInfo["nu", ] = sumPost(object$nu, credMass=credMass)
  summaryInfo["log10nu", ] = sumPost(log10(object$nu), credMass=credMass)
  class(summaryInfo) <- c("summary.BEST", class(summaryInfo))
  return(summaryInfo)
}
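Usage sketch (hypothetical BESTout object from the BESTmcmc example above; the method dispatches on the "BEST" class):

summary(BESTout, ROPEm=c(-1, 1), ROPEeff=c(-0.1, 0.1))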
print.summary.BEST <- function(x, digits=3, ...) {
print method for summary.BEST
Remove all-NA columns:
  ok <- apply(x, 2, function(y) !all(is.na(y)))
  class(x) <- NULL
  print.default(x[,ok], digits=digits, na.print="", ...)
}
===========================
pairs.BEST <- function(x, nPtToPlot = 1000, col="skyblue", ...) {
Plot the parameters pairwise, to see correlations
  # Sanity checks:
  if(!inherits(x, "data.frame"))
    stop("x is not a valid BEST object")
  if(ncol(x) == 3 && all(colnames(x) == c("mu","nu","sigma"))) {
    oneGrp <- TRUE
  } else if (ncol(x) == 5 &&
      all(colnames(x) == c("mu1", "mu2","nu","sigma1","sigma2"))) {
    oneGrp <- FALSE
  } else {
    stop("x is not a valid BEST object")
  }

  nuCol <- which(colnames(x) == "nu")
  mcmcChain <- cbind(x[, -nuCol], log10(x$nu))
  # plotIdx = floor(seq(1, nrow(mcmcChain), by=nrow(mcmcChain)/nPtToPlot))
  plotIdx = floor(seq(1, nrow(mcmcChain), length.out=nPtToPlot))

  panel.cor = function(x, y, digits=2, prefix="", cex.cor, ...) {
    usr = par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r = (cor(x, y))
    txt = format(c(r, 0.123456789), digits=digits)[1]
    txt = paste(prefix, txt, sep="")
    if(missing(cex.cor))
      cex.cor <- 0.8/strwidth(txt)  # cex.cor is now cruft?
    text(0.5, 0.5, txt, cex=1.25 )  # was cex=cex.cor*r
  }

  if(oneGrp) {
    labels <- c( expression(mu), expression(sigma), expression(log10(nu)) )
  } else {
    labels <- c( expression(mu[1]) , expression(mu[2]) ,
                 expression(sigma[1]) , expression(sigma[2]) ,
                 expression(log10(nu)) )
  }
  pairs( mcmcChain[plotIdx, ] , labels=labels, lower.panel=panel.cor, col=col, ... )
}
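Usage sketch (hypothetical BESTout as above):

pairs(BESTout, nPtToPlot=500)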
===Plot Posteriors
plotAll <- function(BESTobj, credMass=0.95, ROPEm=NULL, ROPEsd=NULL, ROPEeff=NULL, compValm=0, compValsd=NULL, compValeff=0, showCurve=FALSE, ...) {
This function plots the posterior distribution (and data). It does not
produce a scatterplot matrix; use pairs(...) for that.
Description of arguments:
BESTobj is BEST object of the type returned by function BESTmcmc.
ROPEm is a two element vector, such as c(-1,1), specifying the limit
of the ROPE on the difference of means.
ROPEsd is a two element vector, such as c(-1,1), specifying the limit
of the ROPE on the difference of standard deviations.
ROPEeff is a two element vector, such as c(-1,1), specifying the limit
of the ROPE on the effect size.
showCurve if TRUE the posterior should be displayed as a fitted
density curve
instead of a histogram (default).
  # Sanity checks:
  if(!inherits(BESTobj, "data.frame"))
    stop("BESTobj is not a valid BEST object")
  if(ncol(BESTobj) == 3 && all(colnames(BESTobj) == c("mu","nu","sigma"))) {
    oneGrp <- TRUE
  } else if (ncol(BESTobj) == 5 &&
      all(colnames(BESTobj) == c("mu1", "mu2","nu","sigma1","sigma2"))) {
    oneGrp <- FALSE
  } else {
    stop("BESTobj is not a valid BEST object")
  }

  y1 <- attr(BESTobj, "data")$y1
  y2 <- attr(BESTobj, "data")$y2

  # Select thinned steps in chain for plotting of posterior predictive curves:
  chainLength <- NROW( BESTobj )
  nCurvesToPlot <- 30
  # stepIdxVec <- seq( 1 , chainLength , floor(chainLength/nCurvesToPlot) )
  stepIdxVec <- seq(1, chainLength, length.out=nCurvesToPlot)

  if(oneGrp) {
    mu1 <- BESTobj$mu
    sigma1 <- BESTobj$sigma
    y2 <- mu2 <- sigma2 <- NULL
  } else {
    mu1 <- BESTobj$mu1
    mu2 <- BESTobj$mu2
    sigma1 <- BESTobj$sigma1
    sigma2 <- BESTobj$sigma2
  }
  nu <- BESTobj$nu

  # Set up window and layout:
  # windows(width=6.0,height=8.0) # Better to use default plot window.
  oldpar <- par(mar=c(3.5,3.5,2.5,0.5), mgp=c(2.25,0.7,0), "mfrow")
  on.exit(par(oldpar))
  if(oneGrp) {
    layout( matrix( c(3,3,4,4,5,5, 1,1,1,1,2,2) , nrow=6, ncol=2 , byrow=FALSE ) )
  } else {
    layout( matrix( c(4,5,7,8,3,1,2,6,9,10) , nrow=5, byrow=FALSE ) )
  }

  if(is.null(y1) && is.null(y2)) {
    xLim <- range(mu1, mu2)
  } else {
    xRange <- range(y1, y2)
    xLim <- c( xRange[1]-0.1*diff(xRange) , xRange[2]+0.1*diff(xRange) )
  }
  xVec <- seq( xLim[1] , xLim[2] , length=200 )
  maxY <- max( dt( 0 , df=max(nu) ) / min(sigma1,sigma2) )

  # Plot data and smattering of posterior predictive curves:
  plotDataPPC(y=y1, mu=mu1[stepIdxVec], sigma=sigma1[stepIdxVec],
              nu=nu[stepIdxVec], xVec=xVec, maxY=maxY)
  if(oneGrp) {
    title(main="Data w. Post. Pred.")
    if(!is.null(y1))
      text( max(xVec) , maxY , bquote(N ==.(length(y1))) , adj=c(1.1,1.1) )
  } else {
    title(main="Data Group 1 w. Post. Pred.")
    if(!is.null(y1))
      text( max(xVec) , maxY , bquote(N[1]==.(length(y1))) , adj=c(1.1,1.1) )
  }
  if(!oneGrp) {
    plotDataPPC(y=y2, mu=mu2[stepIdxVec], sigma=sigma2[stepIdxVec],
                nu=nu[stepIdxVec], xVec=xVec, maxY=maxY)
    title(main="Data Group 2 w. Post. Pred.")
    if(!is.null(y2))
      text( max(xVec) , maxY , bquote(N[2]==.(length(y2))) , adj=c(1.1,1.1) )
  }

  # Plot posterior distribution of parameter nu:
  plotPost( log10(nu) , col="skyblue" , credMass=credMass, showCurve=showCurve ,
            xlab=bquote("log10("*nu*")") , cex.lab = 1.75 , showMode=TRUE ,
            main="Normality" )  # (<0.7 suggests kurtosis)

  # Plot posterior distribution of parameters mu1, mu2, and their difference:
  xlim <- range( c( mu1 , mu2 ) )
  if(oneGrp) {
    plotPost( mu1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass,
              showCurve=showCurve , ROPE=ROPEm, compVal=compValm,
              xlab=bquote(mu) , main="Mean" , col="skyblue" )
  } else {
    plotPost( mu1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass,
              showCurve=showCurve , xlab=bquote(mu[1]) ,
              main=paste("Group",1,"Mean") , col="skyblue" )
    plotPost( mu2 , xlim=xlim , cex.lab = 1.75 , credMass=credMass,
              showCurve=showCurve , xlab=bquote(mu[2]) ,
              main=paste("Group",2,"Mean") , col="skyblue" )
    plotPost( mu1-mu2 , compVal=0 , showCurve=showCurve , credMass=credMass,
              xlab=bquote(mu[1] - mu[2]) , cex.lab = 1.75 , ROPE=ROPEm ,
              main="Difference of Means" , col="skyblue" )
  }

  # Plot posterior distribution of param's sigma1, sigma2, and their difference:
  xlim <- range( c( sigma1 , sigma2 ) )
  if(oneGrp) {
    plotPost(sigma1, xlim=xlim, cex.lab = 1.75, credMass=credMass,
             showCurve=showCurve, ROPE=ROPEsd, compVal=compValsd,
             xlab=bquote(sigma) , main="Std. Dev." , col="skyblue" , showMode=TRUE )
  } else {
    plotPost( sigma1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass,
              showCurve=showCurve , xlab=bquote(sigma[1]) ,
              main=paste("Group",1,"Std. Dev.") , col="skyblue" , showMode=TRUE )
    plotPost( sigma2 , xlim=xlim , cex.lab = 1.75 , credMass=credMass,
              showCurve=showCurve , xlab=bquote(sigma[2]) ,
              main=paste("Group",2,"Std. Dev.") , col="skyblue" , showMode=TRUE )
    plotPost( sigma1-sigma2 , credMass=credMass, compVal=compValsd ,
              showCurve=showCurve , xlab=bquote(sigma[1] - sigma[2]) ,
              cex.lab = 1.75 , ROPE=ROPEsd ,
              main="Difference of Std. Dev.s" , col="skyblue" , showMode=TRUE )
  }

  # Effect size for 1 group:
  if(oneGrp) {
    effectSize = ( mu1 - compValm ) / sigma1
    plotPost( effectSize , compVal=compValeff , ROPE=ROPEeff , credMass=credMass,
              showCurve=showCurve , xlab=bquote( (mu-.(compValm)) / sigma ),
              showMode=TRUE , cex.lab=1.75 , main="Effect Size" , col="skyblue" )
  } else {
    # Plot of estimated effect size. Effect size is d-sub-a from
    # Macmillan & Creelman, 1991; Simpson & Fitter, 1973; Swets, 1986a, 1986b.
    effectSize <- ( mu1 - mu2 ) / sqrt( ( sigma1^2 + sigma2^2 ) / 2 )
    plotPost( effectSize , compVal=compValeff , ROPE=ROPEeff , credMass=credMass,
              showCurve=showCurve ,
              xlab=bquote( (mu[1]-mu[2]) / sqrt((sigma[1]^2 + sigma[2]^2 )/2 ) ),
              showMode=TRUE , cex.lab=1.0 , main="Effect Size" , col="skyblue" )
  }
  # Or use sample-size weighted version:
  # Hedges 1981; Wetzels, Raaijmakers, Jakab & Wagenmakers 2009.
  # N1 = length(y1)
  # N2 = length(y2)
  # effectSize = ( mu1 - mu2 ) / sqrt( ( sigma1^2 *(N1-1) + sigma2^2 *(N2-1) )
  #                                    / (N1+N2-2) )
  # Be sure also to change BESTsummary function, above.
  # histInfo = plotPost( effectSize , compVal=0 , ROPE=ROPEeff ,
  #                      showCurve=showCurve ,
  #                      xlab=bquote( (mu[1]-mu[2]) /
  #                        sqrt((sigma[1]^2 *(N[1]-1)+sigma[2]^2 *(N[2]-1))/(N[1]+N[2]-2)) ),
  #                      showMode=TRUE , cex.lab=1.0 , main="Effect Size" , col="skyblue" )
  return(invisible(NULL))
}
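For contrast with the failing call quoted at the top of this issue, the expected shape of the call is (sketch; BESTout is the hypothetical BESTmcmc output from above, and plotPost from the BEST package or Kruschke's utilities must also be available):

plotAll(BESTout, ROPEm=c(-1, 1), ROPEeff=c(-0.1, 0.1))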
-------------------------------------------------------------------------------
plotDataPPC <- function(y, mu, sigma, nu, xVec, maxY) {
Does the plots of posterior predictive curves for one sample; called
by plotAll;
not exported.
Does not do title or sample size: those are added later.
y = original data for this sample; can be NULL
mu, sigma, nu = vectors of parameters to use for the t-curves and
histogram breaks.
xVec = vector of values to use for the x-axis values
maxY = height of the y-axis
  plot(xVec[1], 0, xlim=range(xVec), ylim=c(0, maxY), cex.lab=1.75,
       type="n", xlab="y", ylab="p(y)")
  for ( i in seq_along(mu)) {
    lines(xVec, dt( (xVec-mu[i])/sigma[i], df=nu[i] )/sigma[i], col="skyblue")
  }
  if(!is.null(y)) {
    histBinWd <- median(sigma)/2
    histCenter <- mean(mu)
    histBreaks <- try(sort( c( seq( histCenter-histBinWd/2 ,
                                    min(xVec)-histBinWd,#/2 ,
                                    -histBinWd ),
                               seq( histCenter+histBinWd/2 ,
                                    max(xVec)+histBinWd,#/2 ,
                                    histBinWd ) ) ), silent=TRUE)
    if(inherits(histBreaks, "try-error")) {
      text(min(xVec), maxY, "Cannot plot data.", pos=4, col='red')
    } else {
      histInfo <- hist( y, plot=FALSE , breaks=histBreaks )
      PlotMat <- cbind(histInfo$mids, histInfo$density)
      PlotMat[histInfo$density == 0] <- NA
      points( PlotMat, type="h" , lwd=3 , col="red" )
    }
  }
}
================================================================================
postPriorOverlap <- function( paramSampleVec, prior, ..., yaxt="n", ylab="", xlab="Parameter", main="", cex.lab=1.5, cex=1.4, xlim=range(paramSampleVec), breaks=NULL) {
  # Does a posterior histogram for a single parameter, adds the prior,
  # displays and calculates the overlap.
  # Returns the overlap.

  oldpar <- par(xpd=NA) ; on.exit(par(oldpar))

  # get breaks: a sensible number over the hdi; cover the full range (and no more);
  # equal spacing.
  if (is.null(breaks)) {
    nbreaks <- ceiling(diff(range(paramSampleVec)) /
                         as.numeric(diff(hdi(paramSampleVec))/18))
    breaks <- seq(from=min(paramSampleVec), to=max(paramSampleVec),
                  length.out=nbreaks)
  }
  # plot posterior histogram.
  histinfo <- hist(paramSampleVec, xlab=xlab, yaxt=yaxt, ylab=ylab,
                   freq=FALSE, border='white', col='skyblue',
                   xlim=xlim, main=main, cex=cex, cex.lab=cex.lab,
                   breaks=breaks)

  if (is.numeric(prior)) {
    # plot the prior if it's numeric
    priorInfo <- hist(prior, breaks=c(-Inf, breaks, Inf), add=TRUE,
                      freq=FALSE, col='yellow', border='white')$density[2:length(breaks)]
  } else if (is.function(prior)) {
    if(class(try(prior(0.5, ...), TRUE)) == "try-error")
      stop(paste("Incorrect arguments for the density function", substitute(prior)))
    priorInfo <- prior(histinfo$mids, ...)
  }
  # get (and plot) the overlap
  minHt <- pmin(priorInfo, histinfo$density)
  rect(breaks[-length(breaks)], rep(0, length(breaks)-1), breaks[-1], minHt,
       col='green', border='white')
  overlap <- sum(minHt * diff(histinfo$breaks))
  # Add curve if prior is a function
  if (is.function(prior))
    lines(histinfo$mids, priorInfo, lwd=2, col='brown')
  # Add text
  text(mean(breaks), 0, paste0("overlap = ", round(overlap*100), "%"), pos=3, cex=cex)

  return(overlap)
}
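A hedged example (illustration only; the prior parameters below are made up and the comparison is only meaningful if mu1 really was given a normal prior of roughly this form):

postPriorOverlap(BESTout$mu1, dnorm, mean=100, sd=75)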
===================================================================================
sumPost <- function(paramSampleVec, credMass=0.95, compVal=NULL, ROPE=NULL) {
Gets summary information for a single parameter;
called by summary.BEST; not exported.
  postSummary <- rep(NA, 11)
  names(postSummary) <- c("mean","median","mode",
                          "hdiMass","hdiLow","hdiHigh",
                          "compVal","pcGTcompVal",
                          "ROPElow","ROPEhigh","pcInROPE")
  postSummary["mean"] <- mean(paramSampleVec)
  postSummary["median"] <- median(paramSampleVec)
  mcmcDensity <- density(paramSampleVec)
  postSummary["mode"] <- mcmcDensity$x[which.max(mcmcDensity$y)]

  HDI <- hdi(paramSampleVec, credMass)
  postSummary["hdiMass"] <- credMass * 100
  postSummary["hdiLow"] <- HDI[1]
  postSummary["hdiHigh"] <- HDI[2]

  if (!is.null(compVal)) {
    postSummary["compVal"] <- compVal
    postSummary["pcGTcompVal"] <- mean(paramSampleVec > compVal) * 100
  }

  if (!is.null(ROPE)) {
    postSummary["ROPElow"] <- ROPE[1]
    postSummary["ROPEhigh"] <- ROPE[2]
    postSummary["pcInROPE"] <- mean(paramSampleVec > ROPE[1] &
                                      paramSampleVec < ROPE[2]) * 100
  }
  return(postSummary)
}
======================================================================
BESTpower <- function( BESTobj, N1, N2, credMass=0.95, ROPEm, ROPEsd, ROPEeff, maxHDIWm, maxHDIWsd, maxHDIWeff, compValm=0, nRep=200, mcmcLength=10000, saveName=NULL, showFirstNrep=0, verbose=2, rnd.seed=NULL) {
This function estimates power.
  # Sanity checks:
  if(!inherits(BESTobj, "data.frame"))
    stop("BESTobj is not a valid BEST object")
  if(ncol(BESTobj) == 3 && all(colnames(BESTobj) == c("mu","nu","sigma"))) {
    oneGrp <- TRUE
  } else if (ncol(BESTobj) == 5 &&
      all(colnames(BESTobj) == c("mu1", "mu2","nu","sigma1","sigma2"))) {
    oneGrp <- FALSE
  } else {
    stop("BESTobj is not a valid BEST object")
  }

  chainLength = NROW( BESTobj )
  if(chainLength < nRep)
    stop(paste("BESTobj does not have enough values; needs", nRep))
  if(credMass <= 0 || credMass >= 1)
    stop("credMass must lie between 0 and 1.")

  if(missing(N1))
    N1 <- length(attr(BESTobj, "data")$y1)
  N1 <- rep(N1, length.out=nRep)
  if(!oneGrp) {
    if(missing(N2))
      N2 <- length(attr(BESTobj, "data")$y2)
    N2 <- rep(N2, length.out=nRep)
  }

  # Deal with missing or invalid arguments for criteria:
  wanted <- rep(TRUE, 12)
  if(missing(ROPEm) || length(ROPEm) != 2) {
    wanted[1:3] <- FALSE
    ROPEm <- c(NA_real_, NA_real_)
  }
  if(missing(ROPEsd) || length(ROPEsd) != 2) {
    wanted[5:7] <- FALSE
    ROPEsd <- c(NA_real_, NA_real_)
  }
  if(missing(ROPEeff) || length(ROPEeff) != 2) {
    wanted[9:11] <- FALSE
    ROPEeff <- c(NA_real_, NA_real_)
  }
  if(missing(maxHDIWm) || maxHDIWm <= 0) {
    wanted[4] <- FALSE
    maxHDIWm <- NA_real_
  }
  if(missing(maxHDIWsd) || maxHDIWsd <= 0) {
    wanted[8] <- FALSE
    maxHDIWsd <- NA_real_
  }
  if(missing(maxHDIWeff) || maxHDIWeff <= 0) {
    wanted[12] <- FALSE
    maxHDIWeff <- NA_real_
  }
  if(!any(wanted))
    stop("No valid criteria set.")

  # Deal with random number seeds
  if(!is.null(rnd.seed)) {
    set.seed(rnd.seed[1], "Mersenne-Twister")
    if(length(rnd.seed) != nRep)
      rnd.seed <- sample.int(1e6, nRep)
    on.exit(set.seed(NULL, "default"))
  }

  # Select thinned steps in chain for posterior predictions:
  stepIdxVec = seq( 1 , chainLength , floor(chainLength/nRep) )[1:nRep]
  paramDF <- BESTobj[stepIdxVec, ]

  goalTally <- numeric(12)
  power <- matrix(NA, 12, 3)
  colnames(power) <- c("mean", "CrIlo", "CrIhi")  # "CrI", cos too many HDIs already!
  rownames(power) <- c(
    "  mean: HDI > ROPE",
    "  mean: HDI < ROPE",
    "  mean: HDI in ROPE",
    "  mean: HDI width ok",
    "    sd: HDI > ROPE",
    "    sd: HDI < ROPE",
    "    sd: HDI in ROPE",
    "    sd: HDI width ok",
    "effect: HDI > ROPE",
    "effect: HDI < ROPE",
    "effect: HDI in ROPE",
    "effect: HDI width ok")

  if(verbose == 1) {
    pb <- txtProgressBar(style = 3)
    on.exit(close(pb))
  }

  for (i in 1:nRep) {
    if(verbose > 1) {
      cat( "\n:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::\n" )
      cat( paste( "Power computation: Simulated Experiment" , i , "of" , nRep , ":\n\n" ) )
      flush.console()
    }
    # Get parameter values for this simulation:
    if(oneGrp) {
      mu1Val = paramDF[i,"mu"]
      sigma1Val = paramDF[i,"sigma"]
    } else {
      mu1Val = paramDF[i,"mu1"]
      mu2Val = paramDF[i,"mu2"]
      sigma1Val = paramDF[i,"sigma1"]
      sigma2Val = paramDF[i,"sigma2"]
    }
    nuVal = paramDF[i,"nu"]
    # Generate simulated data:
    y1 <- rt(N1[i], df=nuVal) * sigma1Val + mu1Val
    y2 <- if(oneGrp) NULL else rt(N2[i], df=nuVal) * sigma2Val + mu2Val
    # Get posterior for simulated data:
    simChain <- BESTmcmc( y1, y2, numSavedSteps=mcmcLength, thinSteps=1,
                          verbose=verbose > 1, rnd.seed=rnd.seed[i])
    if (i <= showFirstNrep ) {
      # x11()   # Deprecated as "platform specific" in R 3.1
      dev.new() # Doesn't work properly in Rstudio: gives warning and plots to old device.
      plotAll(simChain, ROPEm=ROPEm, ROPEsd=ROPEsd, ROPEeff=ROPEeff, compValm=compValm)
      mtext(paste("Simulation", i), outer=TRUE, line=-1, font=4)
    }
    #simChain <- as.matrix(Bout)
    # Get the HDIs for each parameter:
    if(oneGrp) {
      HDIm <- hdi(simChain$mu, credMass=credMass)
      HDIsd <- hdi(simChain$sigma, credMass=credMass)
      mu0 <- if(is.null(compValm)) 0 else compValm
      HDIeff <- hdi((simChain$mu - mu0) / simChain$sigma, credMass=credMass)
    } else {
      HDIm <- hdi(simChain$mu1 - simChain$mu2, credMass=credMass)
      HDIsd = hdi(simChain$sigma1 - simChain$sigma2, credMass=credMass)
      HDIeff = hdi(( simChain$mu1 - simChain$mu2 ) /
                     sqrt( ( simChain$sigma1^2 + simChain$sigma2^2 ) / 2 ),
                   credMass=credMass)
    }
    # Assess which goals were achieved and tally them:
    goalTally <- goalTally + c(
      HDIm[1] > ROPEm[2],
      HDIm[2] < ROPEm[1],
      HDIm[1] > ROPEm[1] & HDIm[2] < ROPEm[2],
      HDIm[2] - HDIm[1] < maxHDIWm,
      HDIsd[1] > ROPEsd[2],
      HDIsd[2] < ROPEsd[1],
      HDIsd[1] > ROPEsd[1] & HDIsd[2] < ROPEsd[2],
      HDIsd[2] - HDIsd[1] < maxHDIWsd,
      HDIeff[1] > ROPEeff[2],
      HDIeff[2] < ROPEeff[1],
      HDIeff[1] > ROPEeff[1] & HDIeff[2] < ROPEeff[2],
      HDIeff[2] - HDIeff[1] < maxHDIWeff )
    s1 = 1 + goalTally
    s2 = 1 + i - goalTally
    power[,1] = s1/(s1+s2)
    for ( j in which(wanted)) {
      power[j, 2:3] = hdi( qbeta , shape1=s1[j] , shape2=s2[j] )
    }
    if(verbose > 1) {
      cat( "\nAfter", i, "Simulated Experiments, Posterior Probability of meeting each criterion is (mean and 95% CrI):\n" )
      print(round(power[wanted, ], 3))
      flush.console()
    }
    if(verbose == 1)
      setTxtProgressBar(pb, i/nRep)
    if(!is.null(saveName))
      save( i , power , file=saveName )
  }
  return(invisible(power))
}
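A small sketch of a retrospective power run (illustration only; the criteria are made up and nRep is kept tiny so it finishes quickly, since each replicate refits BESTmcmc):

pow <- BESTpower(BESTout, N1=40, N2=40, ROPEm=c(-1.5, 1.5), maxHDIWm=15,
                 nRep=5, mcmcLength=5000, verbose=1)
round(pow, 3)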
=================================================================================
This file has the S3 generic 'hdi' function and a series of methods.
hdi <- function(object, credMass=0.95, ...) UseMethod("hdi")
hdi.default <- function(object, credMass=0.95, ...) {
  if(!is.numeric(object))
    stop(paste("No applicable method for class", class(object)))
  if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1)
    stop("credMass must be in 0 < credMass < 1")
  if(all(is.na(object)))
    return(c(lower = NA_real_, upper = NA_real_))
  # This is Mike's code from way back:
  x <- sort(object)  # also removes NAs
  n <- length(x)
  # exclude <- ceiling(n * (1 - credMass)) # Not always the same as...
  exclude <- n - floor(n * credMass)       # Number of values to exclude
  low.poss <- x[1:exclude]                 # Possible lower limits...
  upp.poss <- x[(n - exclude + 1):n]       # ... and corresponding upper limits
  best <- which.min(upp.poss - low.poss)   # Combination giving the narrowest interval
  result <- c(lower = low.poss[best], upper = upp.poss[best])
  attr(result, "credMass") <- credMass
  return(result)
}
hdi.matrix <- function(object, credMass=0.95, ...) {
  result <- apply(object, 2, hdi.default, credMass=credMass, ...)
  attr(result, "credMass") <- credMass
  return(result)
}
hdi.data.frame <- function(object, credMass=0.95, ...) hdi.matrix(as.matrix(object), credMass=credMass, ...)
hdi.mcmc.list <- function(object, credMass=0.95, ...) hdi.matrix(as.matrix(object), credMass=credMass, ...)
hdi.bugs <- function(object, credMass=0.95, ...)
hdi.matrix(object$sims.matrix, credMass=credMass, ...)
hdi.rjags <- function(object, credMass=0.95, ...)
hdi.matrix(object$BUGSoutput$sims.matrix, credMass=credMass, ...)
hdi.function <- function(object, credMass=0.95, tol, ...) {
  if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1)
    stop("credMass must be in 0 < credMass < 1")
  if(missing(tol))
    tol <- 1e-8
  if(class(try(object(0.5, ...), TRUE)) == "try-error")
    stop(paste("Incorrect arguments for the inverse cumulative density function",
               substitute(object)))
  # cf. code in Kruschke 2011 p630
  intervalWidth <- function( lowTailPr , ICDF , credMass , ... ) {
    ICDF( credMass + lowTailPr , ... ) - ICDF( lowTailPr , ... )
  }
  optInfo <- optimize( intervalWidth , c( 0 , 1.0 - credMass) , ICDF=object ,
                       credMass=credMass , tol=tol , ... )
  HDIlowTailPr <- optInfo$minimum
  result <- c(lower = object( HDIlowTailPr , ... ) ,
              upper = object( credMass + HDIlowTailPr , ... ) )
  attr(result, "credMass") <- credMass
  return(result)
}
hdi.density <- function(object, credMass=0.95, allowSplit=FALSE, ...) {
  if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1)
    stop("credMass must be in 0 < credMass < 1")
  sorted = sort( object$y , decreasing=TRUE )
  heightIdx = min( which( cumsum( sorted) >= sum(object$y) * credMass ) )
  height = sorted[heightIdx]
  indices = which( object$y >= height )
  HDImass = sum( object$y[indices] ) / sum(object$y)
  gaps <- which(diff(indices) > 1)
  if(length(gaps) > 0 && !allowSplit) {
    # In this case, return shortest 95% CrI
    warning("The HDI is discontinuous but allowSplit = FALSE; the result is a valid CrI but not HDI.")
    cumul <- cumsum(object$y) / sum(object$y)
    upp.poss <- low.poss <- which(cumul < 1 - credMass)
    for (i in low.poss)
      upp.poss[i] <- min(which(cumul > cumul[i] + credMass))
    # all(cumul[upp.poss] - cumul[low.poss] > credMass) # check
    width <- upp.poss - low.poss
    best <- which(width == min(width))  # usually > 1 value due to ties
    result <- c(lower = mean(object$x[low.poss[best]]),
                upper = mean(object$x[upp.poss[best]]))
  } else {
    begs <- indices[c(1, gaps + 1)]
    ends <- indices[c(gaps, length(indices))]
    result <- cbind(begin = object$x[begs], end = object$x[ends])
    if(!allowSplit)
      names(result) <- c("lower", "upper")
  }
  attr(result, "credMass") <- credMass
  attr(result, "height") <- height
  return(result)
}
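Quick checks of the different hdi methods (illustrative, with made-up inputs):

hdi(rnorm(1e5))                                  # default method: roughly c(-1.96, 1.96)
hdi(qbeta, credMass=0.95, shape1=2, shape2=5)    # ICDF (function) method
hdi(density(rnorm(1e5)), allowSplit=TRUE)        # density method, returns a begin/end matrix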
==========================================================================
plot.BEST <- function(x, which=c("mean", "sd", "effect", "nu"), credMass=0.95, ROPE=NULL, compVal=0, showCurve=FALSE, ...) {
This function plots the posterior distribution for one selected item.
Description of arguments:
x is a BEST object of the type returned by function BESTmcmc.
which indicates which item should be displayed; possible values are
"mean", "sd",
"effect" or "nu".
ROPE is a two element vector, such as c(-1,1), specifying the limit
of the ROPE.
compVal is a scalar specifying the value for comparison.
showCurve if TRUE the posterior should be displayed as a fitted
density curve
instead of a histogram (default).
  # TODO additional sanity checks.
  # Sanity checks:
  if(!inherits(x, "data.frame"))
    stop("x is not a valid BEST object")
  if(ncol(x) == 3 && all(colnames(x) == c("mu","nu","sigma"))) {
    oneGrp <- TRUE
  } else if (ncol(x) == 5 &&
      all(colnames(x) == c("mu1", "mu2","nu","sigma1","sigma2"))) {
    oneGrp <- FALSE
  } else {
    stop("x is not a valid BEST object")
  }

  # Deal with ... argument
  dots <- list(...)
  if(length(dots) == 1 && class(dots[[1]]) == "list")
    dots <- dots[[1]]

  whichID <- match.arg(which)

  toPlot <- switch(whichID,
    mean = if(oneGrp) x$mu else x$mu1 - x$mu2,
    sd = if(oneGrp) x$sigma else x$sigma1 - x$sigma2,
    effect = if(oneGrp) (x$mu - compVal) / x$sigma else
      (x$mu1 - x$mu2) / sqrt( ( x$sigma1^2 + x$sigma2^2 ) / 2 ),
    nu = log10(x$nu) )

  if(is.null(dots$main))
    dots$main <- switch(whichID,
      mean = if(oneGrp) "Mean" else "Difference of Means",
      sd = if(oneGrp) "Standard Deviation" else "Difference of Std. Dev.s",
      effect = "Effect Size",
      nu = "Normality")

  if(is.null(dots$xlab))
    dots$xlab <- switch(whichID,
      mean = if(oneGrp) bquote(mu) else bquote(mu[1] - mu[2]),
      sd = if(oneGrp) bquote(sigma) else bquote(sigma[1] - sigma[2]),
      effect = if(oneGrp) bquote( (mu-.(compVal)) / sigma ) else
        bquote( (mu[1]-mu[2]) / sqrt((sigma[1]^2 + sigma[2]^2 )/2 ) ),
      nu = bquote("log10("*nu*")"))

  if(whichID=="nu" && !is.null(compVal) && compVal == 0)
    compVal <- NULL
  if(whichID=="sd" && oneGrp && !is.null(compVal) && compVal == 0)
    compVal <- NULL

  # Plot posterior distribution of selected item:
  histinfo <- plotPost(toPlot, credMass=credMass, ROPE=ROPE, showCurve=showCurve,
                       showMode=whichID != "mean",
                       compVal=compVal, graphPars=dots)

  return(invisible(histinfo))
}
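Usage sketch (hypothetical BESTout as above; plotPost must also be available):

plot(BESTout)                                     # difference of means
plot(BESTout, which="effect", ROPE=c(-0.1, 0.1))  # effect size with a ROPE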
=======END of MM's Code===========================================
=======Kurt's Adaptation of BEST code==================================
Clear R memory and graphics
rm(list=ls())
graphics.off()
==============Prepare Data =================================
Import 2013 Medicare Top 100 DRG data
drg.2013=read.csv("MCare_Inpatient_DRG100_FY13.csv")
Assign Red, Blue indicator variable with ifelse (Thanks to Dr. Brad
Barney!)
drg.2013$Red <- ifelse(drg.2013$Provider.State %in% c("ID", "MT", "UT", "AZ", "AK", "OK", "WY", "ND", "SD", "NE", "KS", "OK", "TX", "MO", "AR", "LA", "IN", "TN", "KY", "MS", "AL", "GA", "WV", "NC", "SC" ),1,0)
Summarize entire population
summary(drg.2013)
Create separate data sets for red and blue states based on variable Red
Red states
red.drg.2013 <- drg.2013[which(drg.2013$Red==1),]
Blue states
blue.drg.2013 <- drg.2013[which(drg.2013$Red==0),]
Sample Red and Blue states to develop parameter priors
red.drg.2013.sample <- red.drg.2013[sample(1:nrow(red.drg.2013), 50, replace=FALSE),]
blue.drg.2013.sample <- blue.drg.2013[sample(1:nrow(blue.drg.2013), 50, replace=FALSE),]
Summarize Red and Blue samples
summary(red.drg.2013.sample)
summary(blue.drg.2013.sample)
===================Execute BEST process steps 1, 2, 3, 4
See John K. Kruschke, 2013, Bayesian Estimation Supersedes the t Test
J. Experimental Psych: General
BEST Version of January 19, 2013
johnkruschke@gmail.com
http://www.indiana.edu/~kruschke/BEST/
BEST 1 - Invoke BESTparallel & parallel libraries
& load BESTgamma.R functions into working memory
library(BEST)
library(parallel)
library(lattice)
library(jagsUI)
library(runjags)
library(R.devices)
source("BESTparallel.R")
BEST 2 - Load Red and Blue sample data into BEST vectors y1 and y2
Red states
y1 <- c(red.drg.2013.sample$Avg.Mcare.Pmnts)
Blue states
y2 <- c(blue.drg.2013.sample$Avg.Mcare.Pmnts)
BEST 3 - Generate MCMC chain
mcmcChain=BESTmcmc( y1, y2, priors=NULL, doPriorsOnly=FALSE, numSavedSteps=1e5, thinSteps=1, burnInSteps = 1000, verbose=TRUE, rnd.seed=NULL, parallel=NULL)
This function generates an MCMC sample from the posterior distribution.
y1, y2 the data vectors; y2=NULL if only one group.
priors is a list specifying priors to use.
verbose=FALSE suppresses output to the R Console.
rnd.seed is passed to each of the chains, with a different pseudorandom
number generator for each.
Returns a data frame, not a matrix, with class 'BEST',
with attributes Rhat, n.eff, a list with the original data, and the priors.
------------------------------------------------------------------------------
BEST 4 - Plot posterior results & show detailed summary info on console
postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) )
show( postInfo )
Save plots and data for future use:
Save the plots:
saveGraph( file="BEST.DRG.2013" , type="eps" )
saveGraph( file="BEST.DRG.2013" , type="jpeg" )
Save the data and results:
save( y1, y2, postInfo, mcmcChain, file="BEST.DRG.2013.MCMC.Rdata" )
To re-load the saved data and MCMC chain, type: load( "BEST.DRG.2013.MCMC.Rdata" )
Michael E Meredith Wildlife Conservation Society (WCS) Malaysia Program 7 Jalan Ridgeway, 93250 Kuching, Malaysia Mobile: +60-19 888 1533 email: mmeredith@wcs.org Program website: http://www.wcsmalaysia.org