jhmaindonald closed this issue 1 year ago
The following demonstrates the discontinuity that appears when ttestBF() invokes its large-t approximation:
d <- c(1.6,2.75,1.68,1.65,0.37,1.36,2.16,1.16,4.97,1.74)
ttestBF(d-0.001, n1=10, nullInterval=c(-.1,.1))
# . . .
# [1] Alt., r=0.707 -0.1<d<0.1 : 1.107262 ±0%
# [2] Alt., r=0.707 !(-0.1<d<0.1) : 57.53165 ±0%
# . . .
ttestBF(d, n1=10, nullInterval=c(-.1,.1))
# t is large; approximation invoked.
# t is large; approximation invoked.
# . . .
# [1] Alt., r=0.707 -0.1<d<0.1 : 0.1948791 ±NA%
# [2] Alt., r=0.707 !(-0.1<d<0.1) : 57.79087 ±NA%
# . . .
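To locate the jump, one can sweep the shift applied to d and record the interval-null Bayes factor. A minimal sketch (the grid of shifts is arbitrary):

library(BayesFactor)
d <- c(1.6,2.75,1.68,1.65,0.37,1.36,2.16,1.16,4.97,1.74)
shifts <- seq(0, 0.002, by=0.0005)
bfIn <- sapply(shifts, function(eps) {
  bf <- ttestBF(d - eps, nullInterval=c(-.1,.1))
  extractBF(bf)$bf[1]   # model [1]: -0.1 < d < 0.1
})
cbind(shift=shifts, bfIn)  # the discontinuity shows up between adjacent shifts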
I accidentally closed this.
Thanks for the report, I'll check it out.
This is due to the approximation being particularly terrible with low N. I can introduce a better way of deciding when to invoke the approximation; it may not even be needed for small n when the t statistic is as small as 5.
I do think there should be agreement at the point of cross-over to the approximation. Even for n = 300 the approximation is not great, whether the null interval is (-0.01, 0.01) or (-0.1, 0.1):
                  Null = (-0.01, 0.01)               Null = (-0.1, 0.1)
   t     n  null.01     alt.01   bfInt.01     null.1      alt.1   bfInt.1
5.00     3    1.000      3.039      3.039      1.009      3.218     3.190
5.01     3    0.083      3.054     36.987      0.083      3.317    39.823
5.00    30    1.006    909.768    903.934      1.738    989.973   569.451
5.01    30    0.387    933.132   2409.182      0.555   1015.512  1830.829
5.00   300    1.114   9565.781   8588.488     82.147  10402.732   126.635
5.01   300    0.944  10013.835  10603.656     64.850  10892.063   167.957
5.00  3000    2.582   5228.233   2024.726  39701.764   1790.458     0.045
5.01  3000    2.533   5494.020   2168.572  41383.758   1914.511     0.046
It would be straightforward to apply an ad hoc adjustment that makes no change asymptotically. It would make most sense if the adjustment could somehow be tied to, say, the third and fourth moments of the relevant distributions, or to other such statistics.
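A crude version of such an adjustment, purely for illustration (exactBF and approxBF are placeholders for the quadrature and approximation routines, not BayesFactor functions): calibrate the approximation against the exact value at a reference point t0 where the quadrature still succeeds, so the two agree at the hand-over.

## Illustrative sketch only: force agreement at the cross-over point t0,
## then apply the same multiplicative correction beyond it; as the
## approximation improves with n, k -> 1 and the adjustment vanishes
adjustApprox <- function(t, t0, exactBF, approxBF) {
  k <- exactBF(t0)/approxBF(t0)   # calibration at the hand-over point
  k * approxBF(t)
}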
I notice that the BayesTools package, which has its own implementation of the Savage-Dickey approximation, includes in its ComparisonR vignette a comparison with the BayesFactor package. It is not obvious to me how to do an equivalent comparison for the kind of null-interval analysis I have done here, but I do wonder whether there may be useful ideas there; a rough cross-check along those lines is sketched after the code below.
Code for the main part of the above output is:
library(BayesFactor)
## Interval null (null0), its complement (alt0), and their ratio
t2bfInterval <- function(t, n=10, rscale="medium", mu=c(-.1,.1)){
  null0 <- BayesFactor::ttest.tstat(t=t, n1=n, nullInterval=mu,
                                    rscale=rscale, simple=TRUE)
  alt0 <- BayesFactor::ttest.tstat(t=t, n1=n, nullInterval=mu,
                                   rscale=rscale, complement=TRUE, simple=TRUE)
  c(null0, alt0, alt0/null0)
}
##
tval <- c(5, 5.01)
nval <- c(3, 30, 300, 3000)
bfDF <- expand.grid(t=tval, n=nval)
tcol <- 1; ncol <- 2   # column positions of t and n in bfDF
other <- apply(bfDF, 1, function(x)
  c(t2bfInterval(t=x[tcol], n=x[ncol], mu=c(-0.01, 0.01)),
    t2bfInterval(t=x[tcol], n=x[ncol], mu=c(-0.1, 0.1))))
setNames(cbind(bfDF, t(other)),
         c('t', 'n', 'null.01', 'alt.01', 'bfInt.01', 'null.1', 'alt.1', 'bfInt.1'))
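As a rough independent check on these interval Bayes factors, in the spirit of the Savage-Dickey/encompassing-prior computations that BayesTools relies on: the Bayes factor for the interval-restricted alternative against the unrestricted alternative is the ratio of posterior to prior mass inside the interval. The following is only a sketch, using BayesFactor's posterior() sampler and its delta column; Monte Carlo error makes it a sanity check rather than a reference value:

library(BayesFactor)
d <- c(1.6,2.75,1.68,1.65,0.37,1.36,2.16,1.16,4.97,1.74)
bfUn <- ttestBF(d)                        # unrestricted alternative, r = sqrt(2)/2
post <- posterior(bfUn, iterations=1e5)   # MCMC samples; 'delta' is the effect size
r <- sqrt(2)/2
postIn <- mean(post[,"delta"] > -0.1 & post[,"delta"] < 0.1)
priorIn <- pcauchy(0.1, scale=r) - pcauchy(-0.1, scale=r)
postIn/priorIn             # interval-restricted alternative vs unrestricted
(1 - postIn)/(1 - priorIn) # complement vs unrestricted

For very large t the posterior mass inside the interval is essentially zero, so this is informative only in milder cases such as t around 5.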
I've pushed the commit that makes invocation of the approximation much rarer. The new code only uses the approximation as a fallback when the quadrature fails. I agree in principle that a smoother transition to the approximation would be preferable, but that would be a bit tricky: you'd need to work out, e.g., the closest t statistic at which the quadrature doesn't fail, which differs across Ns and r scales. Perhaps the answer is just a better approximation.
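For readers following the thread, the fallback behaviour described above amounts to a pattern like the following (a sketch only; intervalQuadrature and largeTApprox are placeholder names, not the package's internals):

## Sketch of the fallback pattern, not BayesFactor's actual code
bfWithFallback <- function(t, n, nullInterval) {
  tryCatch(
    intervalQuadrature(t, n, nullInterval),               # exact route first
    error = function(e) largeTApprox(t, n, nullInterval)  # only if quadrature fails
  )
}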
The approximation does indeed now seem to be invoked only in quite extreme cases.
Note, however, that there can now be an issue for mu=0.
BFInfo()
# Package BayesFactor
# 0.9.12-4.5
t2bfInterval <- function(t, n=10, rscale="medium", mu=c(-.1,.1)){
  null0 <- BayesFactor::ttest.tstat(t=t, n1=n, nullInterval=mu,
                                    rscale=rscale, simple=TRUE)
  alt0 <- BayesFactor::ttest.tstat(t=t, n1=n, nullInterval=mu, rscale=rscale,
                                   complement=TRUE, simple=TRUE)
  alt0/null0   # this version returns just the interval Bayes factor ratio
}
##
## Calculate Bayes factors for the point null (mu = 0)
pval <- c(1e-7, 1e-8, 1e-12); nval <- c(3, 4)
bfDF <- expand.grid(p=pval, n=nval)
pcol <- 1; ncol <- 2; tcol <- 3   # column positions in bfDF
## Convert two-sided p-values to t statistics
bfDF[,'t'] <- apply(bfDF, 1, function(x)
  qt(x[pcol]/2, df=x[ncol]-1, lower.tail=FALSE))
other <- apply(bfDF, 1, function(x)
  BayesFactor::ttest.tstat(t=x[tcol], n1=x[ncol], rscale="medium",
                           simple=TRUE))
bfDF <- cbind(bfDF, BF=other)
# Error in integrate(meta.t.like, lower = (lower - mean.delta)/scale.delta, :
#   roundoff error is detected in the extrapolation table
# Error in integrate(meta.t.like, lower = (lower - mean.delta)/scale.delta, :
#   the integral is probably divergent
# Error in integrate(meta.t.like, lower = (lower - mean.delta)/scale.delta, :
#   roundoff error is detected in the extrapolation table
round(bfDF,1)
#   p n         t       BF
# 1 0 3    3162.3  16139.8
# 2 0 3   10000.0       NA
# 3 0 3 1000000.0       NA
# 4 0 4     280.4  18153.0
# 5 0 4     604.2 114927.2
# 6 0 4   13016.4       NA
# (the p column shows 0 because 1e-7 etc. round to 0 at one decimal place)
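An aside on reproducing the table above: the NA entries suggest the failing calls returned NA in this session. If the errors instead abort apply() in your session, a hypothetical wrapper such as safeBF below gives the same NA-on-failure behaviour:

## Hypothetical helper, not part of BayesFactor: return NA when the
## underlying integrate() call errors, so the loop over rows completes
safeBF <- function(...) tryCatch(BayesFactor::ttest.tstat(...),
                                 error=function(e) NA_real_)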
Certainly, there is a question of how meaningful a Bayes factor is for such large values of t!
The code is not easy to find one's way around. A note on entry points into the code, for anyone who wants to experiment with the settings, would be helpful; the integrate(meta.t.like, ...) calls in the error messages above are one obvious place to start.
Attachment: BayesFactor Null Interval Large t Approximation.pdf