richarddmorey / BayesFactor

BayesFactor R package for Bayesian data analysis with common statistical models.
https://richarddmorey.github.io/BayesFactor/

practical questions related to meta.ttestBF #106

Closed AWKruijt closed 6 years ago

AWKruijt commented 7 years ago

Hi!

Though I'm a tad unsure whether this is the right place, I'd like to solicit some input on the correct use of meta.ttestBF. Specifically, I'd like to follow up a NHST meta-analysis testing a 'one sample against mu = 0' contrast with a Bayesian analysis, but I am still quite a novice at this Bayes magic.

meta.ttestBF takes the following form: meta.ttestBF(t, n1, n2 = NULL, nullInterval = NULL, rscale = "medium", posterior = FALSE, callback = function(...) as.integer(0), ...)

For each of my included records I have the n as well as the mean and SD for a score on an interval scale on which both positive and negative values are possible. I've converted the latter to t-values for input into meta.ttestBF. So far so good: the point-nil test runs smoothly. But I'd like to test an interval around null (extending on either side), and this is where the question marks start popping up. For instance, I'd like to test the c(-1, 1) interval, except that I can't just input that: the results would be nonsensical, testing whether the mean is in that interval versus whether it is point nil, which is itself part of the interval. Once I realised that, I found this incredibly useful paper: https://www.ncbi.nlm.nih.gov/pubmed/21787084, which had me all happy and trying to figure out the Bayes magic to construct the described BFs. I'll have a follow-up question on verifying that I do so correctly, but there are two other issues I'd like to ask input on first:

I'd be much obliged if you could get me back on track here, as I've definitely started to do the running in circles thing.

All best,

Anne-Wil

richarddmorey commented 6 years ago

From a post answered by Richard Morey elsewhere, I now gather that the interval needs to be defined in standard scores...

Yes, this is correct. The boundaries are on standardised effect sizes (so, think: by how many population standard deviations does this effect shift the mean?)

Do I need to worry/do something about these?

No, not generally.

AWKruijt commented 6 years ago

Dear Richard,

Thanks for this!

Here was another pretty long post explaining my core problem: is it possible to define the null-interval boundary to represent an absolute value on the original measurement scale? (My original scale is in ms; say I want to test whether the mean is > 1 ms from zero: for this I'd have to convert '1' to a standardised value representing 1 ms, but I seemed utterly unable to figure out how.) But I think I've just now (finally) seen the light! \o/

In order to test an interval starting at value 1 on the original scale, I was doing:

meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval = c(((1 - wmean)/wsd),Inf)) 

where wmean and wsd are the weighted mean and SD of the records' values on the original scale,

whereas I should have been doing:

meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval = c( (1 /wsd),Inf)) 

as illustrated by:

plot(MaDF$mean_sample, MaDF$mean_sample / wsd)
abline(h = 1 / wsd, v = 1)

I am all happy; now on to constructing notched-type BFs.

As said, I may have follow-up questions on that, but for now I'll first just try to see whether my attempts so far start making more sense now that I understand how to standardise >.<

Thanks again! It really was your rephrasing of the approach ("so, think: by how many population standard deviations does this effect shift the mean?") that switched the light on.

AWKruijt commented 6 years ago

Ok... back again. After clearly making things way too difficult for myself in the first step, the next step now seems deceptively simple, so I'm forwarding these for checking. My aim is to compute both the NOH (non-overlapping hypotheses) and the hybrid (pi = 1) model BFs as described in "Bayes Factor Approaches for Testing Interval Null Hypotheses" by Morey & Rouder (DOI: 10.1037/a0024377).

Am I correct to think that the hybrid model is as simple as looking at the BF for the complement of a (-1, 1) interval (versus d = 0)? I.e.:

BFhybridpi_1 <- meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval=c( (-1/wsd), (1/wsd)))[2]

Then, for the NOH model I currently have the following code down:

# H0: m = point nil
# H1: m in interval c(-1,1)
# H2: m < -1 OR m > 1 , i.e the complement of H1 

meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval=c(-1,1))

BF10 <- as.vector(meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval=c(-1,1)))[1]
BF02 <- 1/as.vector(meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval=c(-1,1)))[2]

BF12 <- BF10 * BF02
BF12

Now... there's a part in the paper that would have me think that this is the overlapping-hypotheses model, yet based on whatever little I understand of transitivity I can't stop thinking that the 'd = 0' hypothesis gets dropped in BF10 * BF02, so that the resulting BF12 is comparing the probability of the interval versus not-the-interval, as intended. It seems too simple, but I am stuck again, so I'm asking for input once more :) (And of course I take the paper's message to heart that the hybrid model may well be preferred in my situation, but at this point I just really want to understand this.)

All best,

Anne

richarddmorey commented 6 years ago

It's even simpler than that. For the NOH BF, you can just do:

B <- meta.ttestBF(MaDF$t_0, MaDF$n_total, nullInterval=c(-1,1))
B[1] / B[2]

There's no need for the rest of the code.
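The division works because both elements of B share the same d = 0 denominator, which cancels in the ratio. Below is a minimal sketch with made-up t and n values (not the real data from this thread) checking that B[1] / B[2] matches the longer BF10 * BF02 route from the previous comment:

# Sketch with made-up per-study t statistics and sample sizes (purely illustrative)
library(BayesFactor)

t_sim <- c(2.1, 1.4, 2.8)   # hypothetical per-study t statistics
n_sim <- c(25, 40, 30)      # hypothetical per-study sample sizes

B <- meta.ttestBF(t = t_sim, n1 = n_sim, nullInterval = c(-1, 1))

# dividing the two BayesFactor objects cancels their common d = 0 denominator
noh_direct  <- extractBF(B[1] / B[2])$bf
noh_by_hand <- extractBF(B)$bf[1] * (1 / extractBF(B)$bf[2])

all.equal(noh_direct, noh_by_hand)  # should be TRUE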

AWKruijt commented 6 years ago

Thanks! Yes, I realised that just this morning >.< Baby steps - there's something inherently confusing to me about quite a lot of this being way simpler and more intuitive than I had expected. Yet I'm enjoying it. By now I am completely sidetracked in an effort to 'get' the rules for various manipulations of interval-based BFs. Is there a book or paper that you would recommend for this? For example: for intervals centred around 0 it is possible to 'average' the BFs for the two half-intervals on either side to arrive at the same BF as for the entire interval, but this logic doesn't seem to hold when averaging the two halves of an interval that does not cross 0. The fractions cheat sheets are not entirely getting me there. :)

Thanks again!

richarddmorey commented 6 years ago

You can still average such Bayes factors; you just have to use a weighted average (where the weighting is according to the prior probabilities of the intervals). The simple averaging works in the symmetric case because each half has the same prior probability.

This works because the Bayes factor is based on marginal likelihoods, and the marginal likelihoods are just integrals.
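A small numeric illustration of that rule, using ttestBF on simulated data (the data, interval endpoints and variable names below are made up purely for illustration); the weights are the Cauchy prior probabilities of the two sub-intervals:

# Weighted-average rule for interval BFs, sketched with simulated data
library(BayesFactor)

set.seed(1)
x <- rnorm(50, mean = 3, sd = 10)   # hypothetical raw data
r <- sqrt(2) / 2                    # the default "medium" rscale

# BF (vs. d = 0) for a full interval and for its two halves
bf_full  <- extractBF(ttestBF(x, nullInterval = c(0.1, 0.5), rscale = r))$bf[1]
bf_left  <- extractBF(ttestBF(x, nullInterval = c(0.1, 0.3), rscale = r))$bf[1]
bf_right <- extractBF(ttestBF(x, nullInterval = c(0.3, 0.5), rscale = r))$bf[1]

# prior probabilities of the two halves under the Cauchy(0, r) prior
p_left  <- pcauchy(0.3, scale = r) - pcauchy(0.1, scale = r)
p_right <- pcauchy(0.5, scale = r) - pcauchy(0.3, scale = r)

# the prior-weighted average of the half-interval BFs recovers the full-interval BF
(p_left * bf_left + p_right * bf_right) / (p_left + p_right)
bf_full   # should agree, up to small numerical error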

AWKruijt commented 6 years ago

As a final update: thanks again!

It took a small aeon for me to finally figure this out (as I was once again making things way too complicated), so it felt rather good when it finally clicked \o/ The initial reason I wanted to do this was to assess the probability for the sum of two equidistant intervals on either side of zero but not touching zero (a 'limited hybrid model BF', if you like). I ended up working my solution into a function as follows:

sumIntervalsEitherSideOfZero <- function(t, n, BFtype = "meta.ttestBF", bv1, bv2, rscale){

  # BF (vs. d = 0) for the interval on the negative side of zero
  bf.mbv2.mbv1 <- as.vector(match.fun(BFtype)(t = t, n1 = n, rscale = rscale,
                                  nullInterval = c(-bv2, -bv1))[1])

  # BF (vs. d = 0) for the interval on the positive side of zero
  bf.pbv1.pbv2 <- as.vector(match.fun(BFtype)(t = t, n1 = n, rscale = rscale,
                                  nullInterval = c(bv1, bv2))[1])

  # prior probabilities of the two intervals under the Cauchy(0, rscale) prior
  cp.mbv2.mbv1 <- abs(pcauchy(q = -bv2, location = 0, scale = rscale) -
                pcauchy(q = -bv1, location = 0, scale = rscale))

  cp.pbv1.pbv2 <- abs(pcauchy(q = bv1, location = 0, scale = rscale) -
              pcauchy(q = bv2, location = 0, scale = rscale))

  # prior-weighted average of the two interval BFs - think this should be right
  BFsumInterval <- ((cp.mbv2.mbv1 * bf.mbv2.mbv1) + (cp.pbv1.pbv2 * bf.pbv1.pbv2)) / (cp.mbv2.mbv1 + cp.pbv1.pbv2)

  # label the result by hand, since the 'name' of the named number doesn't carry over
  # (note the r = 0.707 in the label is hard-coded, unlike the rscale argument)
  cat("Alt., r=0.707 -", bv2, "< d < -", bv1, " OR ", bv1, "< d <", bv2, "\n", "Against denominator: Null, d = 0", "\n")

  print(BFsumInterval)
}

# BF for intervals of 1-10 ms ( in standardised values!) on either side of zero: 

sumIntervalsEitherSideOfZero(t= MaDF$t_0, n= MaDF$n_total, bv1= (1/wsd), bv2= (10/wsd), rscale = .707)

# a note on interpretation of the sum of the !interval BFs (BF[2]): I think this represents the summed intervals of -1:1, < -10, and > 10 (!!) vs point nil. Probably never really useful, so I'm not calculating it at all (by requesting [1] in the calls for the BFs)

output: Alt., r=0.707 - 1.353225 < d < - 0.1353225 OR 0.1353225 < d < 1.353225 Against denominator: Null, d = 0 [1] 0.01393067

(for comparison: in my data the BF for the -10:10 interval including the -1:1 part is .31)

This was fun. Thanks :) Not quite sure if I'll end up actually using it; I thought I'd assess such intervals in a post-hoc analysis, but the results of the main analysis are such that a post-hoc doesn't really add much. Ah well - I've learned :)

AWKruijt commented 6 years ago

Hi all,

Considered opening a separate thread for this, but I might as well do it here.

I am still struggling with this (I did take a break in between ;) )... and I think I've now got precisely (hopefully) one issue of confusion left, but it's a big one: how do I standardise the boundary values for the null interval?

Without talking you through all of the many different things that I've tried: does anyone happen to just know the exact formula to use? Pooled SD? SD of the study means? Something else? Bessel's or other corrections, yes/no?

Would love to hear from you!

Thank you in advance,

AW

richarddmorey commented 6 years ago

The standardisation is just on delta, where \delta = \delta_i = (\mu_i - \mu_{0i}) / \sigma_i. The model used by meta.ttestBF assumes that all the effect sizes are equal (so caveat emptor).
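As a rough illustration of that assumption (all numbers below are made up): every simulated study shares a single standardised effect delta, even though the raw-scale means and SDs differ, and meta.ttestBF pools the evidence under that common-delta model.

# Illustration of the "all effect sizes equal" assumption with simulated studies
library(BayesFactor)

set.seed(2)
delta  <- 0.3              # common standardised effect size shared by all studies
sigmas <- c(8, 10, 12)     # study-specific raw-scale SDs
ns     <- c(30, 25, 40)    # study-specific sample sizes

tvals <- sapply(seq_along(ns), function(i) {
  y <- rnorm(ns[i], mean = delta * sigmas[i], sd = sigmas[i])  # raw-scale data
  t.test(y, mu = 0)$statistic                                  # one-sample t
})

meta.ttestBF(t = tvals, n1 = ns)   # combines the evidence under the common-delta model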

AWKruijt commented 6 years ago

I'll take the caveat warning ^.^

Not sure how to understand what you wrote, however. To me the quoted line seems to refer to a mu = x scenario rather than a nullInterval one? What I've been thinking all this time is that what I need to figure out is some sort of sigma_m (for the collection of studies, not sigma_i)?

In the following line: meta.ttestBF(t=ti, n1=ni, nullInterval = c(LB, UB)), how can I enter meaningful values (other than 0, -Inf, and Inf) for LB and UB?

all best,

AW

richarddmorey commented 6 years ago

The effect size delta is the answer to "How many population standard deviations is my effect?" (equal variance is assumed). Suppose you're studying IQ, and "negligible" effects would be 2 IQ points. The SD of IQ is 15, so a negligible effect would be anything less than 2/15 = .13 standardised units. You might therefore choose as an interval (.13, Inf) (non-negligible), (0, .13) (negligible and positive), or (-.13, .13) (negligible, either sign), depending on what you were testing.
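The same example in code form, with simulated IQ scores standing in for real data (sample size, mean and seed are made up):

# The IQ example above, sketched with simulated data
library(BayesFactor)

set.seed(3)
iq <- rnorm(80, mean = 103, sd = 15)   # hypothetical sample of IQ scores
negligible <- 2 / 15                   # 2 IQ points in standardised units (about .13)

ttestBF(iq, nullInterval = c(negligible, Inf))          # non-negligible positive effect
ttestBF(iq, nullInterval = c(0, negligible))            # negligible and positive
ttestBF(iq, nullInterval = c(-negligible, negligible))  # negligible, either sign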

AWKruijt commented 6 years ago

That part is all clear, and for a 'normal ttestBF' scenario the question of a nullInterval in absolute scores is easily solved. To go absolutely back to scratch: say that I'd want to assess the ranges of values 47:49, 49:51, and 51:53:

 x <- rnorm(1000, 50, 10) # 1000 points drawn from a distribution with mean = 50, sd = 10
 m <- mean(x)
 sd <- sd(x)

ttestBF(x, nullInterval = c( (47/sd), (49/sd)))
ttestBF(x, nullInterval = c( (49/sd), (51/sd)))
ttestBF(x, nullInterval = c( (51/sd), (53/sd)))
m

# most of the time the middle one returns the largest BF10 (though sometimes the observed 
# mean falls lower/higher, into the 1st or 3rd assessed range, and the BFs reflect this)

We can also define mu instead of nullInterval as value/sd:

# a series to check:
ttestBF(x, mu= (m-1/sd))
ttestBF(x, mu= (m/sd)) # we'd expect this line to return the highest BF10 - which it does
ttestBF(x, mu= (m+1/sd))

But now move to a meta.ttestBF scenario... for starters we need an sd for each datapoint.

 # 3 studies with near identical mean and sd:
 mv <- c(51,50,49)
 sdv <- c(10,8,12)
 nv <- c(20,20,20)

tv <- (mv - 0) / sqrt(sdv^2/nv)
tv  

edit: it turns out I made a pretty grave error in the above computation of the t's - I must have messed up something in my scribble scripts. I've corrected the calculation in the line above and will edit the remainder of this post to fit the values found when t is computed as t and not as d ;)

meta.ttestBF(tv, nv, mu=0) #that works.
# yet setting mu to any other value than 0 (or Inf or -Inf, though those return 
# 'essentially constant') does not work:

 meta.ttestBF(tv, nv, mu=0)
 meta.ttestBF(tv, nv, mu=1)
 meta.ttestBF(tv, nv, mu=10)
# each of the three lines above returns the same BF for d=0 (which I think I may have partially      
# noticed a long long time ago but sillily interpreted as the text/message not being updated :/)

edit: as commented below, this is as documented. My bad for being in such haste to post that I did not go back to the documentation to check. It does not alter my question, though - it was more of a sidetrack.

nullInterval values, however, do seem to work:

 # yet:
meta.ttestBF(tv, nv, nullInterval = c(0,1))
meta.ttestBF(tv, nv, nullInterval = c(1,10))
meta.ttestBF(tv, nv, nullInterval = c(10,100))

Now what if I'd want to define these boundaries by a meaningful value on the scale of my original data? It would seem to require some analogue to (value-0)/sigma. For the 'three study' data generated for this example, the sigma would take a value close to 9.7:

meta.ttestBF(tv, n1=nv,  nullInterval = c( (49/9.6), (51/9.6)))
meta.ttestBF(tv, n1=nv,  nullInterval = c( (49/9.7), (51/9.7)))
meta.ttestBF(tv, n1=nv,  nullInterval = c( (49/9.8), (51/9.8)))

We know that the mean of the study means is 50, and the middle line above returns the highest BF for the 49:51 interval. However, for the life of me I can't figure out how to derive this value of 9.7 in any other way than by trial and error. There must be a way to compute it from the available data (m, sd, & n for each record). Sidenote: the middle line's interval translates to c(5.05, 5.25), which given our pooled SD of 10 would seem to reflect something close to 'x SDs from 0', except that it would translate to the values 51 and 53 rather than 49 and 51 - so something is still off.

edit: the above chunk and explanation were heavily edited following the adjustment of the t-value computation (from the first mention of 9.7 onwards).

So... that's where I am (back) at. The ttestBF situation is fully clear and understood; it's the extrapolation to meta.ttestBF that throws me completely (and perhaps it just isn't possible at all, though it seems to me that there must be some rule governing the nullInterval values). For what it's worth: trying to work through the actual scripts underlying meta.ttestBF, I ended up stuck at a point where the function meta.t.like required a value for log.const, while the line defining log.const relies on running meta.t.like. Snippets:

meta.t.like <- Vectorize(function(delta,t,N,df,rscale=1,log.const=0,log=FALSE,shift=0){
  ans = suppressWarnings(
    sum(dt(t,df,ncp=(delta + shift)*sqrt(N),log=TRUE)) +
      dcauchy(delta+shift,scale=rscale,log=TRUE) - log.const
  )
  if(log){
    return(ans)
  }else{
    return(exp(ans))
  }
},"delta")

and

log.const = meta.t.like(mean.delta,t,N,df,rscale,log=TRUE)

Help?! Thanks :)

richarddmorey commented 6 years ago

I'm not sure what you mean; there is no "mu" argument to meta.ttestBF. That would be confusing, because the input is the t statistic so you don't know the underlying means.

AWKruijt commented 6 years ago

Good point regarding mu in meta.ttestBF - apologies for causing confusion. I 'discovered' that mu does not work for meta.ttestBF only this morning (like... right before you answered, while I was trying all options in normal ttestBF so I could then extrapolate to meta.ttestBF and figure out, ultimately, how to define nullInterval as a function of an absolute value - I have no use for defining mu; it was just part of me trying to make sense of things). I clearly didn't think to go back to the documentation to remember that mu isn't an option in meta.ttestBF.

The thing is: I do know the underlying means... for my 'real data' the raw means and SDs are the info that I have extracted from the studies (while the t will typically not be reported). I compute the t's for the sake of inputting them into meta.ttestBF. Which is also why I would want to be able to define nullInterval in raw mean values... it's easier to think about this data and the effect sizes in 'units' than in 't' form.

richarddmorey commented 6 years ago

It sounds like meta.ttestBF is not the function for you, then; I'd recommend defining the model in JAGS or Stan and analysing it that way. meta.ttestBF is really meant for when you only have the test statistics from a paper and can't work with the raw data.
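For what it's worth, a random-effects meta-analysis of the raw study means could look roughly like the sketch below in JAGS (via rjags). Everything here (the priors, the variable names m/s/n/K, the random-effects structure) is an illustrative assumption rather than a recipe, and it only gives posterior estimates of the pooled mean on the raw scale; turning it into an interval-null Bayes factor would need extra machinery (e.g. the Savage-Dickey trick or bridge sampling).

# Rough sketch of the JAGS route, working directly with raw means, SDs and n's
library(rjags)

meta_model <- "
model {
  for (i in 1:K) {
    m[i] ~ dnorm(theta[i], n[i] / pow(s[i], 2))  # sampling error of each study mean
    theta[i] ~ dnorm(mu, prec.theta)             # study-level (random) effects
  }
  mu ~ dnorm(0, 0.001)           # vague prior on the pooled raw-scale mean
  prec.theta <- 1 / pow(tau, 2)
  tau ~ dunif(0, 100)            # between-study SD
}"

# the same three simulated studies used elsewhere in this thread
dat <- list(m = c(51, 50, 49), s = c(10, 8, 12), n = c(20, 20, 20), K = 3)

jm <- jags.model(textConnection(meta_model), data = dat, n.chains = 3)
update(jm, 2000)                                          # burn-in
post <- coda.samples(jm, c("mu", "tau"), n.iter = 10000)  # posterior draws
summary(post)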

AWKruijt commented 6 years ago

Well... one would start to think so, given the ridiculous hours I've spent on it by now and how close to insanity the whole thing has pushed me a few more times than I would have wished ;)

But... I think I cracked it :)

It will need some cleaning up, I guess, but the answer I was looking for turns out to be:

 d <- sum(((tv/sqrt(nv)) * nv)/sum(nv))  # value of d
 bvd <- sum(mv*nv/sum(nv) ) /d # boundary value denominator. 

To show it with the same simulated data (which had the mystery value of circa 9.7) as used above:

# set of 3 studies with near identical mean and sd:
mv <- c(51,50,49)
sdv <- c(10,8,12)
nv <- c(20,20,20)

tv <- (mv - 0) / sqrt(sdv^2/nv)
tv  

d <- sum(((tv/sqrt(nv)) * nv)/sum(nv))
d # 5.1445 
# notice that our weighted mean divided by d, 50/d == 9.719, is the mystery value from before

bvd <- sum(mv*nv/sum(nv) ) /d  # boundary value denominator is observed n-weighted mean/d

# series to check: 
meta.ttestBF(tv, n1=nv, nullInterval = c( (47/bvd), (49/bvd)))
meta.ttestBF(tv, n1=nv, nullInterval = c( (49/bvd), (51/bvd)))
meta.ttestBF(tv, n1=nv, nullInterval = c( (51/bvd), (53/bvd)))

# notice that the d boundaries in the output for the middle line are precisely centred around d, 
# while we inputted raw values (49 and 51) that were centred around the observed 
# mean of 50
 d - 5.04155555555556 # =  0.1028889
 d - 5.24733333333333 # = -0.1028889

The nut was cracked by going back into meta.ttestBF's code - where I should have returned long ago instead of trying to zoom in on my answer through systematic simulations. The magic lines are these:

## Savage-Dickey using t approximation to posterior of delta
delta.est = t/sqrt(N)
mean.delta = sum((delta.est * N)/sum(N))
var.delta = 1/sum(N)

logPriorProbs = pcauchy(c(upper,lower),scale=rscale,log.p=TRUE)
logPostProbs = pt((c(upper,lower) - mean.delta)/sqrt(var.delta),sum(df),log.p=TRUE)
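A quick numeric check (using the same simulated mv, sdv and nv as above) that the d defined earlier is exactly the package's mean.delta, which, as far as I can tell, is why dividing raw boundaries by bvd lines them up on the delta scale:

# check: for these one-sample t's, t_i/sqrt(n_i) equals m_i/s_i, so mean.delta == d
delta.est  <- tv / sqrt(nv)                  # equals mv / sdv
mean.delta <- sum((delta.est * nv) / sum(nv))

all.equal(mean.delta, d)                              # TRUE
all.equal(sum(mv * nv / sum(nv)) / bvd, mean.delta)   # the weighted raw mean maps onto mean.delta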

Comments on and double-checks of this solution are very welcome. I think/hope it's watertight, as it is a direct back-construction from the BayesFactor code. I hope it is, because I have a manuscript relying on this that has been (nearly) ready to send out for a couple of weeks now. An hour ago I was seriously contemplating throwing this part out and/or replacing it with a TOST-based procedure, but I think this will work sufficiently well after all (provided the checks that I'll do tomorrow all pan out). This makes me happy \o/

So.. and now I think I'll have a drink ;)

Cheers!