mikemeredith / BEST

Bayesian Estimation Supersedes t-Test
31 stars 4 forks source link

BESTobj is not a valid BEST object #4

Closed kschulzke closed 8 years ago

kschulzke commented 8 years ago

Mike,

My compilation of the various functions in your BEST build, MODIFIED 6/29/15 MM BestMCMC w/parallel processing, appears below. Below that is my own BEST script. When I run these under the latest R build, it crashes, right after completing BESTmcmc, with this message:

postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) ) Error in plotAll(y1, y2, mcmcChain, ROPEeff = c(-0.1, 0.1)) : BESTobj is not a valid BEST object <<< huge amount of code deleted by MM>>>

mikemeredith commented 8 years ago

Hi jedijd,

Sorry, I'm not going to plough thro all the code you post to look for your problem. I hope you didn't expect me to.

You appear to be sourcing code from John Kruschke's web site. The code on my github repo is NOT the same and mixing the two will yield grief.

If you want to use my code on github, please just install the BEST package from CRAN and use that. Do check the help pages, if you do ?plotAll, you'll see that the first argument should be the output from BESTmcmc. Your line postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) ) is wrong.

HTH, Mike

On 15.11.15 05:49, jedijd wrote:

Mike,

My compilation of the various functions in your BEST build, MODIFIED 6/29/15 MM BestMCMC w/parallel processing, appears below. Below that is my own BEST script. When I run these under the latest R build, it crashes, right after completing BESTmcmc, with this message:

postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) )
Error in plotAll(y1, y2, mcmcChain, ROPEeff = c(-0.1, 0.1)) :
BESTobj is not a valid BEST object

John K. Kruschke & Mike Meredith

johnkruschke@gmail.com mailto:johnkruschke@gmail.com

http://www.indiana.edu/~kruschke/BEST/ http://www.indiana.edu/%7Ekruschke/BEST/

#

This program is believed to be free of errors, but it comes with no guarantee!

The user bears all responsibility for interpreting the results.

Please check the webpage above for updates or corrections.

#

  ***************************************************************

  ******** SEE FILE BESTgammaExample.R FOR INSTRUCTIONS **************

  ***************************************************************

source("openGraphSaveGraph.R") # graphics functions for Windows, Mac OS, Linux

=================================================================================

Function for shape and rate parameters of gamma. From DBDA2E-utilities.R; see

p. 238 of "Doing Bayesian Data Analysis" Second Edition,

https://sites.google.com/site/doingbayesiandataanalysis/

Modified by MM to accept mode/mean and sd as vectors

Not exported.

gammaShRaFromModeSD = function( mode , sd ) {

if ( mode <=0 ) stop("mode must be > 0")

if ( sd <=0 ) stop("sd must be > 0")

if ( any(mode <= 0) ) stop("mode of gamma prior must be > 0") if ( any(sd <= 0) ) stop("sd of gamma prior must be > 0") rate = ( mode + sqrt( mode^2 + 4 * sd^2 ) ) / ( 2 * sd^2 ) shape = 1 + mode * rate return( list( shape=shape , rate=rate ) ) }

gammaShRaFromMeanSD = function( mean , sd ) {

if ( mean <=0 ) stop("mean must be > 0")

if ( sd <=0 ) stop("sd must be > 0")

if ( any(mean <= 0) ) stop("mean of gamma prior must be > 0") if ( any(sd <= 0) ) stop("sd of gamma prior must be > 0") shape = mean^2/sd^2 rate = mean/sd^2 return( list( shape=shape , rate=rate ) ) }

  Parallel processing version using jagsUI::jags.basic

BESTmcmc <- function( y1, y2=NULL, priors=NULL, doPriorsOnly=FALSE, numSavedSteps=1e5, thinSteps=1, burnInSteps = 1000, verbose=TRUE, rnd.seed=NULL, parallel=NULL) {

This function generates an MCMC sample from the posterior distribution.

y1, y2 the data vectors; y2=NULL if only one group.

priors is a list specifying priors to use.

verbose=FALSE suppresses output to the R Console.

rnd.seed is passed to each of the chains, with a different pseudorandom

number generator for each.

Returns a data frame, not a matrix, with class 'BEST',

with attributes Rhat, n.eff, a list with the original data, and the

priors.

------------------------------------------------------------------------------

|if(doPriorsOnly && verbose) cat("Warning: The output shows the prior distributions, NOT the posterior distributions for your data.\n") # Parallel processing check nCores <- detectCores() if(!is.null(parallel) && parallel && nCores < 4) { if(verbose) warning("Not enough cores for parallel processing, running chains sequentially.") parallel <- FALSE } if(is.null(parallel)) parallel <- nCores > 3 # Data checks y <- c( y1 , y2 ) # combine data into one vector if(!all(is.finite(y))) stop("The input data include NA or Inf.") if(length(unique(y)) < 2 && # sd(y) will be 0 or NA; ok if priors specified. (is.null(priors) || is.null(priors$muSD) || is.null(priors$sigmaMode) || is.null(priors$sigmaSD))) stop("If priors are not specified, data must include at least 2 (non-equal) values.")

Prior checks: if(!is.null(priors)) { if(!is.list(priors)) {

if(is.numeric(priors)) { stop("'priors' is now the 3rd argument; it must be a list (or NULL).") } else { stop("'priors' must be a list (or NULL).") } } nameOK <- names(priors) %in% c("muM", "muSD", "sigmaMode", "sigmaSD", "nuMean", "nuSD") if(!all(nameOK)) stop("Invalid items in prior specification: ", paste(sQuote(names(priors)[!nameOK]), collapse=", ")) if(!all(sapply(priors, is.numeric))) stop("All items in 'priors' must be numeric.") if(!is.null(priors$muSD) && priors$muSD <= 0) stop("muSD must be > 0") } if(is.null(rnd.seed)) rnd.seed <- floor(runif(1,1,10000)) # THE PRIORS if(is.null(priors)) { # use the old prior specification dataForJAGS <- list( muM = mean(y) , muP = 0.000001 * 1/sd(y)^2 , sigmaLow = sd(y) / 1000 , sigmaHigh = sd(y) * 1000 ) } else { # use gamma priors priors0 <- list( # default priors muM = mean(y) , muSD = sd(y)_5 , sigmaMode = sd(y), sigmaSD = sd(y)_5, nuMean = 29, nuSD = 29 ) priors0 <- modifyList(priors0, priors) # user's priors take prior-ity (duh!!) # Convert to Shape/Rate sigmaShRa <- gammaShRaFromModeSD(mode=priors0$sigmaMode, sd=priors0$sigmaSD) nuShRa <- gammaShRaFromMeanSD(mean=priors0$nuMean, sd=priors0$nuSD) dataForJAGS <- list( muM = priors0$muM, muP = 1/priors0$muSD^2, # convert SD to precision Sh = sigmaShRa$shape, Ra = sigmaShRa$rate) if(!is.null(y2)) { # all the above must be vectors of length 2 fixPrior <- function(x) { if(length(x) < 2) x <- rep(x, 2) return(x) } dataForJAGS <- lapply(dataForJAGS, fixPrior) } dataForJAGS$ShNu <- nuShRa$shape dataForJAGS$RaNu <- nuShRa$rate } # THE MODEL. modelFile <- file.path(tempdir(), "BESTmodel.txt") if(is.null(priors)) { # use old broad priors if(is.null(y2)) { modelString = " model { for ( i in 1:Ntotal ) { y[i] ~ dt( mu , tau , nu ) } mu ~ dnorm( muM , muP ) tau <- 1/pow( sigma , 2 ) sigma ~ dunif( sigmaLow , sigmaHigh ) nu <- nuMinusOne+1 nuMinusOne ~ dexp(1/29) } " # close quote for modelString } else { modelString <- " model { for ( i in 1:Ntotal ) { y[i] ~ dt( mu[x[i]] , tau[x[i]] , nu ) } for ( j in 1:2 ) { mu[j] ~ dnorm( muM , muP ) tau[j] <- 1/pow( sigma[j] , 2 ) sigma[j] ~ dunif( sigmaLow , sigmaHigh ) } nu <- nuMinusOne+1 nuMinusOne ~ dexp(1/29) } " # close quote for modelString } } else { # use gamma priors if(is.null(y2)) { modelString = " model { for ( i in 1:Ntotal ) { y[i] ~ dt( mu , tau , nu ) } mu ~ dnorm( muM[1] , muP[1] ) tau <- 1/pow( sigma , 2 ) sigma ~ dgamma( Sh[1] , Ra[1] ) nu <- nuMinusOne+1 nuMinusOne ~ dgamma( ShNu , RaNu ) # prior for nu } " # close quote for modelString } else { modelString <- " model { for ( i in 1:Ntotal ) { y[i] ~ dt( mu[x[i]] , tau[x[i]] , nu ) } for ( j in 1:2 ) { mu[j] ~ dnorm( muM[j] , muP[j] ) tau[j] <- 1/pow( sigma[j] , 2 ) sigma[j] ~ dgamma( Sh[j] , Ra[j] ) } nu <- nuMinusOne+1 nuMinusOne ~ dgamma( ShNu , RaNu ) # prior for nu } " # close quote for modelString } } # Write out modelString to a text file writeLines( modelString , con=modelFile )

------------------------------------------------------------------------------

THE DATA. # dataForJAGS already has the priors, add the data:

if(!doPriorsOnly) dataForJAGS$y <- y dataForJAGS$Ntotal <- length(y) if(!is.null(y2)) # create group membership code dataForJAGS$x <- c( rep(1,length(y1)) , rep(2,length(y2)) )

------------------------------------------------------------------------------

INTIALIZE THE CHAINS. # Initial values of MCMC chains based on data:

if(is.null(y2)) { mu = mean(y1) sigma = sd(y1) } else { mu = c( mean(y1) , mean(y2) ) sigma = c( sd(y1) , sd(y2) ) } # Regarding initial values in next line: (1) sigma will tend to be too big if # the data have outliers, and (2) nu starts at 5 as a moderate value. These # initial values keep the burn-in period moderate. initsList0 <- list(mu=mu, sigma=sigma, nuMinusOne=4, .RNG.seed=rnd.seed) initsList <- list( c(initsList0, .RNG.name="base::Wichmann-Hill"), c(initsList0, .RNG.name="base::Marsaglia-Multicarry"), c(initsList0, .RNG.name="base::Super-Duper") )

------------------------------------------------------------------------------

RUN THE CHAINS codaSamples <- jags.basic( data = dataForJAGS, inits

= initsList, parameters.to.save = c( "mu" , "sigma" , "nu" ), # The parameters to be monitored model.file = modelFile, n.chains = 3, # Do not change this without also changing inits. n.adapt = 500, n.iter = ceiling( ( numSavedSteps * thinSteps) / 3 + burnInSteps ), n.burnin = burnInSteps, n.thin = thinSteps, modules = NULL, parallel = parallel, DIC = FALSE, seed = rnd.seed, verbose = verbose)

------------------------------------------------------------------------------

mcmcChain = as.matrix( codaSamples ) if(dim(mcmcChain)[2] == 5 && all(colnames(mcmcChain) == c("mu[1]", "mu[2]", "nu", "sigma[1]", "sigma[2]"))) colnames(mcmcChain) <- c("mu1", "mu2", "nu", "sigma1", "sigma2") mcmcDF <- as.data.frame(mcmcChain) class(mcmcDF) <- c("BEST", class(mcmcDF)) attr(mcmcDF, "call") <- match.call() attr(mcmcDF, "Rhat") <- gelman.diag(codaSamples)$psrf[, 1] attr(mcmcDF, "n.eff") <- effectiveSize(codaSamples) attr(mcmcDF, "data") <- list(y1 = y1, y2 = y2) attr(mcmcDF, "doPriorsOnly") <- doPriorsOnly if(!is.null(priors)) attr(mcmcDF, "priors") <- priors0 return( mcmcDF ) |

} # end function BESTmcmc

==============================================================================

summary.BEST <- function(object, credMass=0.95, ROPEm=NULL, ROPEsd=NULL, ROPEeff=NULL, compValm=0, compValsd=NULL, compValeff=0, ...) {

Produces summary stats for a BEST object.

Should do the same set of stats as plotAll and use the same syntax

as far as possible.

|# Sanity checks: if(!inherits(object, "data.frame")) stop("object is not a valid BEST object") #mcmcChain <- as.matrix(object) if(ncol(object) == 3) { oneGrp <- TRUE nparam <- 5 } else { if(ncol(object) != 5) stop("object is not a valid BEST object.") oneGrp <- FALSE nparam <- 9 } # Define matrix for storing summary info: summaryInfo = matrix(NA, nrow=nparam, ncol=11) if(oneGrp) { rownames(summaryInfo) <- c("mu", "sigma", "nu", "log10nu", "effSz") } else { rownames(summaryInfo) <- c("mu1", "mu2", "muDiff", "sigma1", "sigma2", "sigmaDiff", "nu", "log10nu", "effSz") } colnames(summaryInfo) <- c("mean","median","mode", "HDI%","HDIlo","HDIup", "compVal","%>compVal", "ROPElow","ROPEhigh","%InROPE") if(oneGrp) { # Deal with 1-group case: summaryInfo["mu", ] <- sumPost(object$mu, credMass=credMass, compVal=compValm, ROPE=ROPEm) summaryInfo["sigma", ] = sumPost(object$sigma, credMass=credMass, compVal=compValsd, ROPE=ROPEsd) mu0 <- if(is.null(compValm)) 0 else compValm effectSize <- (object$mu - mu0) / object$sigma summaryInfo["effSz", ] = sumPost(effectSize, credMass=credMass, compVal=compValeff, ROPE=ROPEeff) } else { summaryInfo["mu1", ] <- sumPost(object$mu1, credMass=credMass) summaryInfo["mu2", ] <- sumPost(object$mu2, credMass=credMass) summaryInfo["muDiff", ] <- sumPost(object$mu1 - object$mu2, credMass=credMass, compVal=compValm, ROPE=ROPEm) summaryInfo["sigma1", ] <- sumPost(object$sigma1, credMass=credMass) summaryInfo["sigma2", ] <- sumPost(object$sigma2, credMass=credMass) if(is.null(compValsd)) compValsd <- 0 summaryInfo["sigmaDiff", ] <- sumPost(object$sigma1 - object$sigma2, credMass=credMass, compVal=compValsd, ROPE=ROPEsd) effSzChain = ((object$mu1 - object$mu2) / sqrt((object$sigma1^2 + object$sigma2^2) / 2)) summaryInfo["effSz", ] = sumPost(effSzChain, credMass=credMass, compVal=compValeff, ROPE=ROPEeff) # This does not use sample-size weighted version of effect size: # N1 = length(y1) # N2 = length(y2) # effSz = (mu1 - mu2) / sqrt((sigma1^2 (N1-1) + sigma2^2 (N2-1)) # / (N1+N2-2)) } # Deal with nu: summaryInfo["nu", ] = sumPost(object$nu, credMass=credMass) summaryInfo["log10nu", ] = sumPost(log10(object$nu), credMass=credMass) class(summaryInfo) <- c("summary.BEST", class(summaryInfo)) return(summaryInfo) |

}

print.summary.BEST <- function(x, digits=3, ...) {

print method for summary.BEST

Remove all-NA columns:

ok <- apply(x, 2, function(y) !all(is.na(y))) class(x) <- NULL print.default(x[,ok], digits=digits, na.print="", ...) }

===========================

pairs.BEST <- function(x, nPtToPlot = 1000, col="skyblue", ...) {

Plot the parameters pairwise, to see correlations

|# Sanity checks: if(!inherits(x, "data.frame")) stop("x is not a valid BEST object") if(ncol(x) == 3 && all(colnames(x) == c("mu","nu","sigma"))) { oneGrp <- TRUE } else if (ncol(x) == 5 && all(colnames(x) == c("mu1", "mu2","nu","sigma1","sigma2"))) { oneGrp <- FALSE } else { stop("x is not a valid BEST object") } nuCol <- which(colnames(x) == "nu") mcmcChain <- cbind(x[, -nuCol], log10(x$nu)) #plotIdx = floor(seq(1, nrow(mcmcChain),by=nrow(mcmcChain)/nPtToPlot)) #TODO Use length.out plotIdx = floor(seq(1, nrow(mcmcChain), length.out=nPtToPlot)) #TODO Use length.out panel.cor = function(x, y, digits=2, prefix="", cex.cor, ...) { usr = par("usr"); on.exit(par(usr)) par(usr = c(0, 1, 0, 1)) r = (cor(x, y)) txt = format(c(r, 0.123456789), digits=digits)[1] txt = paste(prefix, txt, sep="") if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt) # cex.cor is now cruft? text(0.5, 0.5, txt, cex=1.25 ) # was cex=cex.cor*r } if(oneGrp) { labels <- c( expression(mu), expression(sigma), expression(log10(nu)) ) } else { labels <- c( expression(mu[1]) , expression(mu[2]) , expression(sigma[1]) , expression(sigma[2]) , expression(log10(nu)) ) } pairs( mcmcChain[plotIdx, ] , labels=labels, lower.panel=panel.cor, col=col, ... ) |

}

===Plot Posteriors

plotAll <- function(BESTobj, credMass=0.95, ROPEm=NULL, ROPEsd=NULL, ROPEeff=NULL, compValm=0, compValsd=NULL, compValeff=0, showCurve=FALSE, ...) {

This function plots the posterior distribution (and data). It does not

produce a scatterplot matrix; use pairs(...) for that.

Description of arguments:

BESTobj is BEST object of the type returned by function BESTmcmc.

ROPEm is a two element vector, such as c(-1,1), specifying the limit

of the ROPE on the difference of means.

ROPEsd is a two element vector, such as c(-1,1), specifying the limit

of the ROPE on the difference of standard deviations.

ROPEeff is a two element vector, such as c(-1,1), specifying the limit

of the ROPE on the effect size.

showCurve if TRUE the posterior should be displayed as a fitted

density curve

instead of a histogram (default).

|# Sanity checks: if(!inherits(BESTobj, "data.frame")) stop("BESTobj is not a valid BEST object") if(ncol(BESTobj) == 3 && all(colnames(BESTobj) == c("mu","nu","sigma"))) { oneGrp <- TRUE } else if (ncol(BESTobj) == 5 && all(colnames(BESTobj) == c("mu1", "mu2","nu","sigma1","sigma2"))) { oneGrp <- FALSE } else { stop("BESTobj is not a valid BEST object") } y1 <- attr(BESTobj, "data")$y1 y2 <- attr(BESTobj, "data")$y2 # Select thinned steps in chain for plotting of posterior predictive curves: chainLength <- NROW( BESTobj ) nCurvesToPlot <- 30 # stepIdxVec <- seq( 1 , chainLength , floor(chainLength/nCurvesToPlot) ) stepIdxVec <- seq(1, chainLength, length.out=nCurvesToPlot) if(oneGrp) { mu1 <- BESTobj$mu sigma1 <- BESTobj$sigma y2 <- mu2 <- sigma2 <- NULL } else { mu1 <- BESTobj$mu1 mu2 <- BESTobj$mu2 sigma1 <- BESTobj$sigma1 sigma2 <- BESTobj$sigma2 } nu <- BESTobj$nu # Set up window and layout: # windows(width=6.0,height=8.0) # Better to use default plot window. oldpar <- par(mar=c(3.5,3.5,2.5,0.5), mgp=c(2.25,0.7,0), "mfrow") on.exit(par(oldpar)) if(oneGrp) { layout( matrix( c(3,3,4,4,5,5, 1,1,1,1,2,2) , nrow=6, ncol=2 , byrow=FALSE ) ) } else { layout( matrix( c(4,5,7,8,3,1,2,6,9,10) , nrow=5, byrow=FALSE ) ) } if(is.null(y1) && is.null(y2)) { xLim <- range(mu1, mu2) } else { xRange <- range(y1, y2) xLim <- c( xRange[1]-0.1_diff(xRange) , xRange[2]+0.1_diff(xRange) ) } xVec <- seq( xLim[1] , xLim[2] , length=200 ) maxY <- max( dt( 0 , df=max(nu) ) / min(sigma1,sigma2) )

Plot data and smattering of posterior predictive curves:

plotDataPPC(y=y1, mu=mu1[stepIdxVec], sigma=sigma1[stepIdxVec], nu=nu[stepIdxVec], xVec=xVec, maxY=maxY) if(oneGrp) { title(main="Data w. Post. Pred.") if(!is.null(y1)) text( max(xVec) , maxY , bquote(N ==.(length(y1))) , adj=c(1.1,1.1) ) } else { title(main="Data Group 1 w. Post. Pred.") if(!is.null(y1)) text( max(xVec) , maxY , bquote(N[1]==.(length(y1))) , adj=c(1.1,1.1) ) } if(!oneGrp) { plotDataPPC(y=y2, mu=mu2[stepIdxVec], sigma=sigma2[stepIdxVec], nu=nu[stepIdxVec], xVec=xVec, maxY=maxY) title(main="Data Group 2 w. Post. Pred.") if(!is.null(y2)) text( max(xVec) , maxY , bquote(N[2]==.(length(y2))) , adj=c(1.1,1.1) ) } # Plot posterior distribution of parameter nu: plotPost( log10(nu) , col="skyblue" , credMass=credMass, showCurve=showCurve , xlab=bquote("log10("nu")") , cex.lab = 1.75 , showMode=TRUE , main="Normality" ) # (<0.7 suggests kurtosis) # Plot posterior distribution of parameters mu1, mu2, and their difference: xlim <- range( c( mu1 , mu2 ) ) if(oneGrp) { plotPost( mu1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass, showCurve=showCurve , ROPE=ROPEm, compVal=compValm, xlab=bquote(mu) , main="Mean" , col="skyblue" ) } else { plotPost( mu1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass, showCurve=showCurve , xlab=bquote(mu[1]) , main=paste("Group",1,"Mean") , col="skyblue" ) plotPost( mu2 , xlim=xlim , cex.lab = 1.75 , credMass=credMass, showCurve=showCurve , xlab=bquote(mu[2]) , main=paste("Group",2,"Mean") , col="skyblue" ) plotPost( mu1-mu2 , compVal=0 , showCurve=showCurve , credMass=credMass, xlab=bquote(mu[1]

  • mu[2]) , cex.lab = 1.75 , ROPE=ROPEm , main="Difference of Means" , col="skyblue" ) } # Plot posterior distribution of param's sigma1, sigma2, and their difference: xlim <- range( c( sigma1 , sigma2 ) ) if(oneGrp) { plotPost(sigma1, xlim=xlim, cex.lab = 1.75, credMass=credMass, showCurve=showCurve, ROPE=ROPEsd, compVal=compValsd, xlab=bquote(sigma) , main="Std. Dev." , col="skyblue" , showMode=TRUE ) } else { plotPost( sigma1 , xlim=xlim , cex.lab = 1.75 , credMass=credMass, showCurve=showCurve , xlab=bquote(sigma[1]) , main=paste("Group",1,"Std. Dev.") , col="skyblue" , showMode=TRUE ) plotPost( sigma2 , xlim=xlim , cex.lab = 1.75 , credMass=credMass, showCurve=showCurve , xlab=bquote(sigma[2]) , main=paste("Group",2,"Std. Dev.") , col="skyblue" , showMode=TRUE ) plotPost( sigma1-sigma2 , credMass=credMass, compVal=compValsd , showCurve=showCurve , xlab=bquote(sigma[1] - sigma[2]) , cex.lab = 1.75 , ROPE=ROPEsd , main="Difference of Std. Dev.s" , col="skyblue" , showMode=TRUE ) } # Effect size for 1 group: if(oneGrp) { effectSize = ( mu1 - compValm ) / sigma1 plotPost( effectSize , compVal=compValeff , ROPE=ROPEeff , credMass=credMass, showCurve=showCurve , xlab=bquote( (mu-.(compValm)) / sigma ), showMode=TRUE , cex.lab=1.75 , main="Effect Size" , col="skyblue" ) } else { # Plot of estimated effect size. Effect size is d-sub-a from # Macmillan & Creelman, 1991; Simpson & Fitter, 1973; Swets, 1986a, 1986b. effectSize <- ( mu1 - mu2 ) / sqrt( ( sigma1^2 + sigma2^2 ) / 2 ) plotPost( effectSize , compVal=compValeff , ROPE=ROPEeff , credMass=credMass, showCurve=showCurve , xlab=bquote( (mu[1]-mu[2]) /sqrt((sigma[1]^2 +sigma[2]^2 )/2 ) ), showMode=TRUE , cex.lab=1.0 , main="Effect Size" , col="skyblue" ) } # Or use sample-size weighted version: # Hedges 1981; Wetzels, Raaijmakers, Jakab & Wagenmakers 2009. # N1 = length(y1) # N2 = length(y2) # effectSize = ( mu1 - mu2 ) / sqrt( ( sigma1^2 (N1-1) + sigma2^2 (N2-1) ) # / (N1+N2-2) ) # Be sure also to change BESTsummary function, above. # histInfo = plotPost( effectSize , compVal=0 , ROPE=ROPEeff , # showCurve=showCurve , # xlab=bquote( (mu[1]-mu[2]) # /sqrt((sigma[1]^2 (N[1]-1)+sigma[2]^2 (N[2]-1))/(N[1]+N[2]-2)) ), # showMode=TRUE , cex.lab=1.0 , main="Effect Size" , col="skyblue" ) return(invisible(NULL)) |

}

-------------------------------------------------------------------------------

plotDataPPC <- function(y, mu, sigma, nu, xVec, maxY) {

Does the plots of posterior predictive curves for one sample; called

by plotAll;

not exported.

Does not do title or sample size: those are added later.

y = original data for this sample; can be NULL

mu, sigma, nu = vectors of parameters to use for the t-curves and

histogram breaks.

xVec = vector of values to use for the x-axis values

maxY = height of the y-axis

|plot(xVec[1], 0, xlim=range(xVec), ylim=c(0, maxY), cex.lab=1.75, type="n", xlab="y", ylab="p(y)") for ( i in seq_along(mu)) { lines(xVec, dt( (xVec-mu[i])/sigma[i], df=nu[i] )/sigma[i], col="skyblue") } if(!is.null(y)) { histBinWd <- median(sigma)/2 histCenter <- mean(mu) histBreaks <- try(sort( c( seq( histCenter-histBinWd/2 , min(xVec)-histBinWd,#/2 , -histBinWd ), seq( histCenter+histBinWd/2 , max(xVec)+histBinWd,#/2 , histBinWd ) ) ), silent=TRUE) if(inherits(histBreaks, "try-error")) { text(min(xVec), maxY, "Cannot plot data.", pos=4, col='red') } else { histInfo <- hist( y, plot=FALSE , breaks=histBreaks ) PlotMat <- cbind(histInfo$mids, histInfo$density) PlotMat[histInfo$density == 0] <- NA points( PlotMat, type="h" , lwd=3 , col="red" ) } } |

}

================================================================================

postPriorOverlap <- function( paramSampleVec, prior, ..., yaxt="n", ylab="", xlab="Parameter", main="", cex.lab=1.5, cex=1.4, xlim=range(paramSampleVec), breaks=NULL) {

|# Does a posterior histogram for a single parameter, adds the prior, # displays and calculates the overlap. # Returns the overlap. oldpar <- par(xpd=NA) ; on.exit(par(oldpar)) # get breaks: a sensible number over the hdi; cover the full range (and no more); # equal spacing. if (is.null(breaks)) { nbreaks <- ceiling(diff(range(paramSampleVec)) / as.numeric(diff(hdi(paramSampleVec))/18)) breaks <- seq(from=min(paramSampleVec), to=max(paramSampleVec), length.out=nbreaks) } # plot posterior histogram. histinfo <- hist(paramSampleVec, xlab=xlab, yaxt=yaxt, ylab=ylab, freq=FALSE, border='white', col='skyblue', xlim=xlim, main=main, cex=cex, cex.lab=cex.lab, breaks=breaks) if (is.numeric(prior)) { # plot the prior if it's numeric priorInfo <- hist(prior, breaks=c(-Inf, breaks, Inf), add=TRUE, freq=FALSE, col='yellow', border='white')$density[2:length(breaks)] } else if (is.function(prior)) { if(class(try(prior(0.5, ...), TRUE)) == "try-error") stop(paste("Incorrect arguments for the density function", substitute(prior))) priorInfo <- prior(histinfo$mids, ...) } # get (and plot) the overlap minHt <- pmin(priorInfo, histinfo$density) rect(breaks[-length(breaks)], rep(0, length(breaks)-1), breaks[-1], minHt, col='green', border='white') overlap <- sum(minHt * diff(histinfo$breaks)) # Add curve if prior is a function if (is.function(prior)) lines(histinfo$mids, priorInfo, lwd=2, col='brown') # Add text text(mean(breaks), 0, paste0("overlap = ", round(overlap*100), "%"), pos=3, cex=cex) return(overlap) |

}

===================================================================================

sumPost <- function(paramSampleVec, credMass=0.95, compVal=NULL, ROPE=NULL) {

Gets summary information for a single parameter;

called by summary.BEST; not exported.

postSummary <- rep(NA, 11) names(postSummary) <- c("mean","median","mode", "hdiMass","hdiLow","hdiHigh", "compVal","pcGTcompVal", "ROPElow","ROPEhigh","pcInROPE")

postSummary["mean"] <- mean(paramSampleVec) postSummary["median"] <- median(paramSampleVec) mcmcDensity <- density(paramSampleVec) postSummary["mode"] <- mcmcDensity$x[which.max(mcmcDensity$y)]

|HDI <- hdi(paramSampleVec, credMass) postSummary["hdiMass"] <- credMass * 100 postSummary["hdiLow"] <- HDI[1] postSummary["hdiHigh"] <- HDI[2] if (!is.null(compVal)) { postSummary["compVal"] <- compVal postSummary["pcGTcompVal"] <- mean(paramSampleVec > compVal) * 100 } if (!is.null(ROPE)) { postSummary["ROPElow"] <- ROPE[1] postSummary["ROPEhigh"] <- ROPE[2] postSummary["pcInROPE"] <- mean(paramSampleVec > ROPE[1] & paramSampleVec < ROPE[2]) * 100 } return(postSummary) |

}

======================================================================

BESTpower <- function( BESTobj, N1, N2, credMass=0.95, ROPEm, ROPEsd, ROPEeff, maxHDIWm, maxHDIWsd, maxHDIWeff, compValm=0, nRep=200, mcmcLength=10000, saveName=NULL, showFirstNrep=0, verbose=2, rnd.seed=NULL) {

This function estimates power.

|# Sanity checks: if(!inherits(BESTobj, "data.frame")) stop("BESTobj is not a valid BEST object") if(ncol(BESTobj) == 3 && all(colnames(BESTobj) == c("mu","nu","sigma"))) { oneGrp <- TRUE } else if (ncol(BESTobj) == 5 && all(colnames(BESTobj) == c("mu1", "mu2","nu","sigma1","sigma2"))) { oneGrp <- FALSE } else { stop("BESTobj is not a valid BEST object") } chainLength = NROW( BESTobj ) if(chainLength < nRep) stop(paste("BESTobj does not have enough values; needs", nRep)) if(credMass <= 0 || credMass >= 1) stop("credMass must lie between 0 and 1.") if(missing(N1)) N1 <- length(attr(BESTobj, "data")$y1) N1 <- rep(N1, length.out=nRep)

if(!oneGrp && length(N2) < nRep) if(!oneGrp) { if(missing(N2)) N2 <-

length(attr(BESTobj, "data")$y2) N2 <- rep(N2, length.out=nRep) } # Deal with missing or invalid arguments for criteria: wanted <- rep(TRUE, 12) if(missing(ROPEm) || length(ROPEm) != 2) { wanted[1:3] <- FALSE ROPEm <- c(NAreal, NAreal) } if(missing(ROPEsd) || length(ROPEsd) != 2) { wanted[5:7] <- FALSE ROPEsd <- c(NAreal, NAreal) } if(missing(ROPEeff) || length(ROPEeff) != 2) { wanted[9:11] <- FALSE ROPEeff <- c(NAreal, NAreal) } if(missing(maxHDIWm) || maxHDIWm <= 0) { wanted[4] <- FALSE maxHDIWm <- NAreal } if(missing(maxHDIWsd) || maxHDIWsd <= 0) { wanted[8] <- FALSE maxHDIWsd <- NAreal } if(missing(maxHDIWeff) || maxHDIWeff <= 0) { wanted[12] <- FALSE maxHDIWeff <- NAreal } if(!any(wanted)) stop("No valid criteria set.") # Deal with random number seeds if(!is.null(rnd.seed)) { set.seed(rnd.seed[1], "Mersenne-Twister") if(length(rnd.seed) != nRep) rnd.seed <- sample.int(1e6, nRep) on.exit(set.seed(NULL, "default")) } # Select thinned steps in chain for posterior predictions: stepIdxVec = seq( 1 , chainLength , floor(chainLength/nRep) )[1:nRep] paramDF <- BESTobj[stepIdxVec, ] goalTally <- numeric(12) power <- matrix(NA, 12, 3) colnames(power) <- c("mean", "CrIlo", "CrIhi") # "CrI", cos too many HDIs already! rownames(power) <- c( " mean: HDI > ROPE", " mean: HDI < ROPE", " mean: HDI in ROPE", " mean: HDI width ok", " sd: HDI > ROPE", " sd: HDI < ROPE", " sd: HDI in ROPE", " sd: HDI width ok", "effect: HDI > ROPE", "effect: HDI < ROPE", "effect: HDI in ROPE", "effect: HDI width ok") if(verbose == 1) { pb <- txtProgressBar(style = 3) on.exit(close(pb)) } for (i in 1:nRep) { if(verbose > 1) { cat( "\n:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::\n" ) cat( paste( "Power computation: Simulated Experiment" , i , "of" , nRep , ":\n\n" ) ) flush.console() } # Get parameter values for this simulation: if(oneGrp) { mu1Val = paramDF[i,"mu"] sigma1Val = paramDF[i,"sigma"] } else { mu1Val = paramDF[i,"mu1"] mu2Val = paramDF[i,"mu2"] sigma1Val = paramDF[i,"sigma1"] sigma2Val = paramDF[i,"sigma2"] } nuVal = paramDF[i,"nu"] # Generate simulated data: y1 <- rt(N1[i], df=nuVal) * sigma1Val + mu1Val y2 <- if(oneGrp) NULL else rt(N2[i], df=nuVal) * sigma2Val + mu2Val # Get posterior for simulated data: simChain <- BESTmcmc( y1, y2, numSavedSteps=mcmcLength, thinSteps=1, verbose=verbose > 1, rnd.seed=rnd.seed[i]) if (i <= showFirstNrep ) { # x11() # Deprecated as "platform specific" in R 3.1 dev.new() # Doesn't work properly in Rstudio: gives warning and plots to old device. plotAll(simChain, ROPEm=ROPEm, ROPEsd=ROPEsd, ROPEeff=ROPEeff, compValm=compValm) mtext(paste("Simulation", i), outer=TRUE, line=-1, font=4) } #simChain <- as.matrix(Bout) # Get the HDIs for each parameter: if(oneGrp) { HDIm <- hdi(simChain$mu, credMass=credMass) HDIsd <- hdi(simChain$sigma, credMass=credMass) mu0 <- if(is.null(compValm)) 0 else compValm HDIeff <- hdi((simChain$mu - mu0) / simChain$sigma, credMass=credMass) } else { HDIm <- hdi(simChain$mu1 - simChain$mu2, credMass=credMass) HDIsd = hdi(simChain$sigma1 - simChain$sigma2, credMass=credMass) HDIeff = hdi(( simChain$mu1 - simChain$mu2 ) / sqrt( ( simChain$sigma1^2 + simChain$sigma2^2 ) / 2 ), credMass=credMass) } # Assess which goals were achieved and tally them: goalTally <- goalTally + c( HDIm[1] > ROPEm[2], HDIm[2] < ROPEm[1], HDIm[1] > ROPEm[1] & HDIm[2] < ROPEm[2], HDIm[2] - HDIm[1] < maxHDIWm, HDIsd[1] > ROPEsd[2], HDIsd[2] < ROPEsd[1], HDIsd[1] > ROPEsd[1] & HDIsd[2] < ROPEsd[2], HDIsd[2] - HDIsd[1] < maxHDIWsd, HDIeff[1] > ROPEeff[2], HDIeff[2] < ROPEeff[1], HDIeff[1] > ROPEeff[1] & HDIeff[2] < ROPEeff[2], HDIeff[2] - HDIeff[1] < maxHDIWeff ) s1 = 1

  • goalTally s2 = 1 + i - goalTally power[,1] = s1/(s1+s2) for ( j in which(wanted)) { power[j, 2:3] = hdi( qbeta , shape1=s1[j] , shape2=s2[j] ) } if(verbose > 1) { cat( "\nAfter", i, "Simulated Experiments, Posterior Probability of meeting each criterion is (mean and 95% CrI):\n" ) print(round(power[wanted, ], 3)) flush.console() } if(verbose == 1) setTxtProgressBar(pb, i/nRep) if(!is.null(saveName)) save( i , power , file=saveName ) } return(invisible(power)) |

}

=================================================================================

This file has the S3 generic 'hdi' function and a series of methods.

hdi <- function(object, credMass=0.95, ...) UseMethod("hdi")

hdi.default <- function(object, credMass=0.95, ...) { if(!is.numeric(object)) stop(paste("No applicable method for class", class(object))) if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1) stop("credMass must be in 0 < credMass < 1") if(all(is.na(object))) return(c(lower = NAreal, upper = NAreal))

This is Mike's code from way back:

x <- sort(object) # also removes NAs n <- length(x)

exclude <- ceiling(n * (1 - credMass)) # Not always the same as...

exclude <- n - floor(n * credMass) # Number of values to exclude low.poss <- x[1:exclude] # Possible lower limits... upp.poss <- x[(n - exclude + 1):n] # ... and corresponding upper limits best <- which.min(upp.poss - low.poss) # Combination giving the narrowest interval result <- c(lower = low.poss[best], upper = upp.poss[best])

attr(result, "credMass") <- credMass return(result) }

hdi.matrix <- function(object, credMass=0.95, ...) { result <- apply(object, 2, hdi.default, credMass=credMass, ...) attr(result, "credMass") <- credMass return(result) }

hdi.data.frame <- function(object, credMass=0.95, ...) hdi.matrix(as.matrix(object), credMass=credMass, ...)

hdi.mcmc.list <- function(object, credMass=0.95, ...) hdi.matrix(as.matrix(object), credMass=credMass, ...)

hdi.bugs <- function(object, credMass=0.95, ...)

hdi.matrix(object$sims.matrix, credMass=credMass, ...)

hdi.rjags <- function(object, credMass=0.95, ...)

hdi.matrix(object$BUGSoutput$sims.matrix, credMass=credMass, ...)

hdi.function <- function(object, credMass=0.95, tol, ...) { if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1) stop("credMass must be in 0 < credMass < 1") if(missing(tol)) tol <- 1e-8 if(class(try(object(0.5, ...), TRUE)) == "try-error") stop(paste("Incorrect arguments for the inverse cumulative density function", substitute(object)))

cf. code in Kruschke 2011 p630

intervalWidth <- function( lowTailPr , ICDF , credMass , ... ) { ICDF( credMass + lowTailPr , ... ) - ICDF( lowTailPr , ... ) } optInfo <- optimize( intervalWidth , c( 0 , 1.0 - credMass) , ICDF=object , credMass=credMass , tol=tol , ... ) HDIlowTailPr <- optInfo$minimum result <- c(lower = object( HDIlowTailPr , ... ) , upper = object( credMass + HDIlowTailPr , ... ) ) attr(result, "credMass") <- credMass return(result) }

hdi.density <- function(object, credMass=0.95, allowSplit=FALSE, ...) { if(is.na(credMass) || length(credMass) != 1 || credMass <= 0 || credMass >= 1) stop("credMass must be in 0 < credMass < 1") sorted = sort( object$y , decreasing=TRUE ) heightIdx = min( which( cumsum( sorted) >= sum(object$y) * credMass ) ) height = sorted[heightIdx] indices = which( object$y >= height )

HDImass = sum( object$y[indices] ) / sum(object$y)

gaps <- which(diff(indices) > 1) if(length(gaps) > 0 && !allowSplit) {

In this case, return shortest 95% CrI

warning("The HDI is discontinuous but allowSplit = FALSE; the result is a valid CrI but not HDI.") cumul <- cumsum(object$y) / sum(object$y) upp.poss <- low.poss <- which(cumul < 1 - credMass) for (i in low.poss) upp.poss[i] <- min(which(cumul > cumul[i] + credMass))

all(cumul[upp.poss] - cumul[low.poss] > credMass) # check

width <- upp.poss - low.poss best <- which(width == min(width)) # usually > 1 value due to ties result <- c(lower = mean(object$x[low.poss[best]]), upper = mean(object$x[upp.poss[best]])) } else { begs <- indices[c(1, gaps + 1)] ends <- indices[c(gaps, length(indices))] result <- cbind(begin = object$x[begs], end = object$x[ends]) if(!allowSplit) names(result) <- c("lower", "upper") } attr(result, "credMass") <- credMass attr(result, "height") <- height return(result) }

==========================================================================

plot.BEST <- function(x, which=c("mean", "sd", "effect", "nu"), credMass=0.95, ROPE=NULL, compVal=0, showCurve=FALSE, ...) {

This function plots the posterior distribution for one selected item.

Description of arguments:

x is mcmc.list object of the type returned by function BESTmcmc.

which indicates which item should be displayed; possible values are

"mean", "sd",

"effect" or "nu".

ROPE is a two element vector, such as c(-1,1), specifying the limit

of the ROPE.

compVal is a scalar specifying the value for comparison.

showCurve if TRUE the posterior should be displayed as a fitted

density curve

instead of a histogram (default).

|# TODO additional sanity checks. # Sanity checks: if(!inherits(x, "data.frame")) stop("x is not a valid BEST object") if(ncol(x) == 3 && all(colnames(x) == c("mu","nu","sigma"))) { oneGrp <- TRUE } else if (ncol(x) == 5 && all(colnames(x) == c("mu1", "mu2","nu","sigma1","sigma2"))) { oneGrp <- FALSE } else { stop("x is not a valid BEST object") } # Deal with ... argument dots <- list(...) if(length(dots) == 1 && class(dots[[1]]) == "list") dots <- dots[[1]] whichID <- match.arg(which) toPlot <- switch(whichID, mean = if(oneGrp) x$mu else x$mu1 - x$mu2, sd = if(oneGrp) x$sigma else x$sigma1 - x$sigma2, effect = if(oneGrp) (x$mu - compVal) / x$sigma else (x$mu1 - x$mu2) / sqrt( ( x$sigma1^2 + x$sigma2^2 ) / 2 ), nu = log10(x$nu) ) if(is.null(dots$main)) dots$main <- switch(whichID, mean = if(oneGrp) "Mean" else "Difference of Means", sd = if(oneGrp) "Standard Deviation" else "Difference of Std. Dev.s", effect = "Effect Size", nu = "Normality") if(is.null(dots$xlab)) dots$xlab <- switch(whichID, mean = if(oneGrp) bquote(mu) else bquote(mu[1] - mu[2]), sd = if(oneGrp) bquote(sigma) else bquote(sigma[1] - sigma[2]), effect = if(oneGrp) bquote( (mu-.(compVal)) / sigma ) else bquote( (mu[1]-mu[2]) / sqrt((sigma[1]^2 +sigma[2]^2 )/2 ) ), nu = bquote("log10("nu")")) if(whichID=="nu" && !is.null(compVal) && compVal == 0) compVal <- NULL if(whichID=="sd" && oneGrp && !is.null(compVal) && compVal == 0) compVal <- NULL # Plot posterior distribution of selected item: histinfo <- plotPost(toPlot, credMass=credMass, ROPE=ROPE, showCurve=showCurve, showMode=whichID != "mean", compVal=compVal, graphPars=dots) return(invisible(histinfo)) |

}

=======END of MM's Code===========================================

=======Kurt's Adaptation of BEST code==================================

Clear R memory and graphics

rm(list=ls())

graphics.off()

==============Prepare Data =================================

Import 2013 Medicare Top 100 DRG data

drg.2013=read.csv("MCare_Inpatient_DRG100_FY13.csv")

Assign Red, Blue indicator variable with ifelse (Thanks to Dr. Brad

Barney!)

drg.2013$Red <- ifelse(drg.2013$Provider.State %in% c("ID", "MT", "UT", "AZ", "AK", "OK", "WY", "ND", "SD", "NE", "KS", "OK", "TX", "MO", "AR", "LA", "IN", "TN", "KY", "MS", "AL", "GA", "WV", "NC", "SC" ),1,0)

Summarize entire population

summary(drg.2013)

Create separate data sets for red and blue states based on variable Red

Red states

red.drg.2013 <- drg.2013[which(drg.2013$Red==1),]

Blue states

blue.drg.2013 <- drg.2013[which(drg.2013$Red==0),]

Sample Red and Blue states to develop parameter priors

red.drg.2013.sample <- red.drg.2013[sample(1:nrow(red.drg.2013), 50, replace=FALSE),]

blue.drg.2013.sample <- blue.drg.2013[sample(1:nrow(blue.drg.2013), 50, replace=FALSE),]

Summarize Red and Blue samples

summary(red.drg.2013.sample) summary(blue.drg.2013.sample)

===================Execute BEST process steps 1, 2, 3, 4

See John K. Kruschke, 2013, Bayesian Estimation Supersedes the t Test

J. Experimental Psych: General

BEST Version of January 19, 2013

johnkruschke@gmail.com mailto:johnkruschke@gmail.com

http://www.indiana.edu/~kruschke/BEST/ http://www.indiana.edu/%7Ekruschke/BEST/

BEST 1 - Invoke BESTparallel & parallel libraries

& load BESTgamma.R functions into working memory

library(BEST) library(parallel) library(lattice) library(jagsUI) library(runjags) library(R.devices) source("BESTparallel.R")

BEST 2 - Load Red and Blue sample data into BEST vectors y1 and y2

Red states

y1 <- c(red.drg.2013.sample$Avg.Mcare.Pmnts)

Blue states

y2 <- c(blue.drg.2013.sample$Avg.Mcare.Pmnts)

BEST 3 - Generate MCMC chain

mcmcChain=BESTmcmc( y1, y2, priors=NULL, doPriorsOnly=FALSE, numSavedSteps=1e5, thinSteps=1, burnInSteps = 1000, verbose=TRUE, rnd.seed=NULL, parallel=NULL)

This function generates an MCMC sample from the posterior distribution.

y1, y2 the data vectors; y2=NULL if only one group.

priors is a list specifying priors to use.

verbose=FALSE suppresses output to the R Console.

rnd.seed is passed to each of the chains, with a different pseudorandom

number generator for each.

Returns a data frame, not a matrix, with class 'BEST',

with attributes Rhat, n.eff, a list with the original data, and the priors.

------------------------------------------------------------------------------

BEST 4 - Plot posterior results & show detailed summary info on console

postInfo = plotAll( y1 , y2 , mcmcChain , ROPEeff=c(-0.1,0.1) ) show( postInfo )

Save plots and data for future use:

Save the plots:

saveGraph( file="BEST.DRG.2013" , type="eps" ) saveGraph( file="BEST.DRG.2013" , type="jpeg" )

Save the data and results:

save( y1, y2, postInfo, mcmcChain, file="BEST.DRG.2013.MCMC.Rdata" )

To re-load the saved data and MCMC chain, type: load( "BEST.DRG.2013.MCMC.Rdata" )

— Reply to this email directly or view it on GitHub https://github.com/mikemeredith/BEST/issues/4.

Click here https://www.mailcontrol.com/sr/JL+lwS53Jd%21GX2PQPOmvUhBG2SIKse0TZfXFS8icgidWGmG8wj8%21jZ7ZkSVS%21X24+UNZSiU1HTn36NGKki70Ug== to report this email as spam.

Michael E Meredith Wildlife Conservation Society (WCS) Malaysia Program 7 Jalan Ridgeway, 93250 Kuching, Malaysia Mobile: +60-19 888 1533 email: mmeredith@wcs.org Program website: http://www.wcsmalaysia.org