Open SebKrantz opened 1 year ago
Oh, that's super cool! Thanks for pointing me to it. I will move to qtab()
with the next release. Thanks! =)
Yes, it does indeed give a good speedup:
library(microbenchmark)
N <- 100000
a <- sample(1:20, N, replace = TRUE)
b <- sample(1:300, N, replace = TRUE)
y <- matrix(rnorm(N), N, 1)
data <- y
var1 <- data.frame(a = a)
var2 <- data.frame(b = b)
microbenchmark(
ct1 <- fwildclusterboot:::crosstab(data = data, var1 = var1, var2 = var2),
ct4 <- fwildclusterboot:::crosstab4(data = data, var1 = var1, var2 = var2),
ct5 <- fwildclusterboot:::crosstab_qtab(data = data, var1 = var1, var2 = var2),
times = 1
)
# min lq mean median uq max
# 5.353802 5.353802 5.353802 5.353802 5.353802
# 225.850701 225.850701 225.850701 225.850701 225.850701
# 3.371901 3.371901 3.371901 3.371901 3.371901
Nice! :)
qtab(), introduced in collapse 1.8.0, should be more efficient than the workaround with fsum(). You need to pass the data to the weights argument w of qtab(). Even if that turns out not to be the case, fsum() now has an argument fill = TRUE, which you can set to avoid the res[is.na(res)] <- 0 line.
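For anyone landing here later, a minimal sketch of the two suggestions, assuming collapse >= 1.8.0 is installed. The toy vectors a, b and y are made up for illustration and are not the package's internal data; this is not how fwildclusterboot implements its crosstab functions.

```r
library(collapse)  # assumes collapse >= 1.8.0

set.seed(1)
N <- 1000
a <- sample(1:5, N, replace = TRUE)   # first grouping variable (toy data)
b <- sample(1:10, N, replace = TRUE)  # second grouping variable (toy data)
y <- rnorm(N)                         # values to be summed per cell

# Weighted crosstab: passing y as the weights `w` sums y within each (a, b) cell
ct_qtab <- qtab(a = a, b = b, w = y)
dim(ct_qtab)

# Base R reference for comparison
ct_base <- xtabs(y ~ a + b)

# fill = TRUE initializes the result with 0 instead of NA,
# so no res[is.na(res)] <- 0 step is needed afterwards
all_missing <- c(NA_real_, NA_real_)
fsum(all_missing)               # NA by default
fsum(all_missing, fill = TRUE)  # 0
```

The cell sums from qtab() should match the xtabs() result up to floating-point tolerance, which makes it easy to check the switch before benchmarking it.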