Closed bwiernik closed 3 years ago
library(boot)
library(parameters)
model <- lm(Sepal.Length ~ Species * Petal.Width, data = iris)
set.seed(2)
bootstrap_parameters(model)
#> # Fixed Effects
#>
#> Parameter | Coefficient | 95% CI | p
#> --------------------------------------------------------------------
#> (Intercept) | 4.78 | [ 4.50, 5.00] | < .001
#> Speciesversicolor | -0.72 | [-1.62, 0.08] | 0.094
#> Speciesvirginica | 0.50 | [-0.67, 1.65] | 0.415
#> Petal.Width | 0.91 | [ 0.22, 1.97] | 0.039
#> Speciesversicolor:Petal.Width | 0.50 | [-0.66, 1.52] | 0.393
#> Speciesvirginica:Petal.Width | -0.27 | [-1.36, 0.67] | 0.545
set.seed(2)
bootstrap_parameters(model, ci_method = "bci")
#> # Fixed Effects
#>
#> Parameter | Coefficient | 95% CI | p
#> --------------------------------------------------------------------
#> (Intercept) | 4.78 | [ 4.49, 4.99] | < .001
#> Speciesversicolor | -0.72 | [-1.66, 0.06] | 0.094
#> Speciesvirginica | 0.50 | [-0.71, 1.61] | 0.415
#> Petal.Width | 0.91 | [ 0.29, 2.07] | 0.039
#> Speciesversicolor:Petal.Width | 0.50 | [-0.78, 1.46] | 0.393
#> Speciesvirginica:Petal.Width | -0.27 | [-1.39, 0.58] | 0.545
Created on 2021-05-27 by the reprex package (v2.0.0)
😮
Daniel is the Flash of supporting feature requests! âš¡
Not sure how to adjust p-values, though.
The current implantation is here:
https://github.com/easystats/parameters/blob/main/R/methods_base.R#L143-L157
But this method assumes translation equivariance: that shape of the sampling distribution is unaffected by the null being true or not.... Which is probably never really true anyway? A true p-value would require a permutation test... Otherwise, I would suggest dropping p-values entirely.
The approach I would use is to compute a wide range of confidence limits and interpolate at which limit the interval crosses zero. See e.g., https://www.stat.umn.edu/geyer/old/5601/examp/tests.html. This could be combined with optim()
to find the precise limit if desired. This approach is consistent with the confidence distribution interpretation of p values.
Edit: example:
library(bootstrap)
set.seed(1234)
x <- mtcars$am
y <- mtcars$carb
print(theta.hat <- cor(x, y))
#> [1] 0.05753435
my.cor <- function(k) cor(x[k], y[k])
bca.out <- bcanon(seq(along = x), nboot = 10000,
theta = my.cor, alpha = seq(0.001, 0.999, 0.001))
plot(bca.out$confpoints[ , 1], bca.out$confpoints[ , 2],
xlab = "alpha", ylab = "bca point")
abline(h = 0)
## lower-tailed test of theta = 0
ltpv <- approx(bca.out$confpoints[ , 2],
bca.out$confpoints[ , 1], xout = 0)$y
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values
ltpv
#> [1] 0.378
## upper-tailed test of theta = 0
1 - ltpv
#> [1] 0.622
## upper-tailed test of theta = 0
2 * min(ltpv, 1 - ltpv)
#> [1] 0.756
Created on 2021-05-27 by the reprex package (v2.0.0)
@strengejacke what's your thinking on calling the method "bci"
instead of "bcai"
?
@strengejacke what's your thinking on calling the method
"bci"
instead of"bcai"
?
Just the three letters. ;-) We may also rename it into bcai()
. @DominiqueMakowski any preferences here?
Interesting...
Here's my version (that uses the bootstrapped distribution we already have in the output):
p_value_distrib <- function(x, null = 0){
get_ci <- function(alpha, d) {
alpha <- ifelse(alpha > 1, 1,
ifelse(alpha < 0, 0,
alpha))
q <- quantile(c(d, null), probs = c(alpha/2, 1 - alpha/2))
min(abs(q - null))
}
optim(0.5, fn = get_ci, d = x)[["par"]]
}
set.seed(444)
dat <- data.frame(x = rnorm(20),
y = rnorm(20))
m <- lm(y ~ x, dat)
b <- parameters::bootstrap_parameters(m)
parameters::model_parameters(m)$p[2]
#> [1] 0.1572755
b$p[2]
#> [1] 0.1308691
attr(b, "boot_samples")[[2]] |> p_value_distrib()
#> [1] 0.17
Created on 2021-05-27 by the reprex package (v2.0.0)
Does this make sense?
How would be support BCa and other methods this way?
@strengejacke what's your thinking on calling the method
"bci"
instead of"bcai"
?
you can now use both "bci" and "bcai", and bcai()
was added as alias.
For models that support them, it would be nice to include BCa intervals in addition to the currently supported percentile intervals. For completeness, we might also want Studentized or Basic bootstrap, but I see less value in those.