roaldarbol closed this issue 2 months ago.
The variance components for the beta-model don't look that bad, actually.
library(glmmTMB)
library(performance)
df_ratio_episode <- data.frame(
animal_id = factor(
rep(
c(
"208", "209", "210", "214", "223", "228", "211", "213", "217", "222", "234",
"241", "216", "230", "231", "240", "242", "244", "218", "220", "225", "237",
"239", "219", "251", "252", "253", "254"
),
each = 2L
),
levels = c(
"200", "204", "205", "206", "215", "224", "208", "209", "210", "214", "223",
"228", "211", "213", "217", "222", "234", "241", "216", "230", "231", "240",
"242", "244", "218", "220", "225", "237", "239", "245", "219", "236", "251",
"252", "253", "254"
)
),
trial = rep(c(1, 2), 28),
activity_ratio = c(
0.1313027016689785, 0.08387917431645128, 0.1395420340967623,
0.09844057594710427, 0.19414443290359096, 0.16304581176275632,
0.17274983272168504, 0.17357956037939837, 0.09729583968716982,
0.05138063319955499, 0.14298075594540044, 0.10179701101266003,
0.09168390375802275, 0.11591243874797318, 0.2521345405747349,
0.16335726666875724, 0.13436311090275369, 0.12012636336085161,
0.13868852567209072, 0.12008249718946021, 0.27708418835127824,
0.22042035159734397, 0.2649703945513039, 0.22158610629846917,
0.2001770607989554, 0.2238562351804714, 0.1105503693420828,
0.08255349183783911, 0.21927303214082697, 0.22211274055043914,
0.10446530203550744, 0.11336175801811256, 0.0826812722435201,
0.09328851878674252, 0.13701773797551595, 0.1297098120849381,
0.05986226055235673, 0.14423247009476106, 0.19474645802355026,
0.1713563584485577, 0.25663498351317365, 0.30249307043720924,
0.09082761877930186, 0.10402396536249521, 0.21941679494558652,
0.28459112981037343, 0.11218161441362348, 0.12449715062493952,
0.18427917423975973, 0.14845015830783756, 0.19444224064643065,
0.13471565660441723, 0.11247341287367296, 0.08660523675310272,
0.1763980204528711, 0.1049572229068965
)
)
glmm_ratio_beta <- glmmTMB::glmmTMB(activity_ratio ~ trial + (1 | animal_id),
data = df_ratio_episode,
family = glmmTMB::beta_family
)
insight::get_variance(glmm_ratio_beta)
#> $var.fixed
#> [1] 0.003054337
#>
#> $var.random
#> [1] 0.160573
#>
#> $var.residual
#> [1] 1.92015
#>
#> $var.distribution
#> [1] 1.92015
#>
#> $var.dispersion
#> [1] 0
#>
#> $var.intercept
#> animal_id
#> 0.160573
glmm_ratio_gauss <- glmmTMB::glmmTMB(activity_ratio ~ trial + (1 | animal_id),
data = df_ratio_episode,
family = gaussian
)
insight::get_variance(glmm_ratio_gauss)
#> $var.fixed
#> [1] 4.886653e-05
#>
#> $var.random
#> [1] 0.00282474
#>
#> $var.residual
#> [1] 0.0007858734
#>
#> $var.distribution
#> [1] 0.0007858734
#>
#> $var.dispersion
#> [1] 0
#>
#> $var.intercept
#> animal_id
#> 0.00282474
Created on 2024-07-05 with reprex v2.1.0
The residual variance seems suspiciously high to me though. 🤔
We had this discussion before, but I can't find it right now. There seems to be a mismatch between the variance function from glmmTMB::beta_family():

> family(mod1)$variance
function (mu)
{
    mu * (1 - mu)
}

and what the docs suggest:

V = μ(1 − μ)/(ϕ + 1)

(which is mu * (1 - mu) / (1 + phi))
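As a quick numeric sanity check (a Python sketch, purely illustrative and not part of any of these packages): under the mean/precision parameterization Beta(μφ, (1 − μ)φ), the textbook beta variance simplifies exactly to the documented mu * (1 - mu) / (1 + phi):

```python
# Illustrative check: for Beta(a, b) with a = mu * phi and b = (1 - mu) * phi,
# the textbook variance a*b / ((a + b)^2 * (a + b + 1)) simplifies to
# mu * (1 - mu) / (1 + phi) -- the formula from the glmmTMB docs.
def beta_variance(mu, phi):
    a, b = mu * phi, (1 - mu) * phi
    return a * b / ((a + b) ** 2 * (a + b + 1))

def docs_variance(mu, phi):
    return mu * (1 - mu) / (1 + phi)

for mu, phi in [(0.2, 2.0), (0.5, 5.0), (0.9, 0.7)]:
    assert abs(beta_variance(mu, phi) - docs_variance(mu, phi)) < 1e-12
```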
The code base in insight is inconclusive as well:
# Get distributional variance for beta-family
# ----------------------------------------------
.variance_family_beta <- function(model, mu, phi) {
  stats::family(model)$variance(mu)
  # was:
  # mu * (1 - mu) / (1 + phi)
  # but that code is not what "glmmTMB" uses for the beta family
  # mu * (1 - mu)
}
Tagging @bbolker
It's hard to find something to validate against. The current implementation in insight and performance yields results similar to betareg for your example in https://github.com/easystats/insight/issues/664:
library(glmmTMB)
set.seed(123)
a <- seq(from = 5, to = 95)
b1 <- jitter(a, factor = 20)
b2 <- jitter(a, factor = 20)
b2 <- b2 + 30
b3 <- jitter(a, factor = 20)
b3 <- b3 + 30
t_a <- rep('a', length(a))
t_b <- rep('b', length(a))
c <- as.factor(a)
d <- data.frame(id = c(c,c,c,c),
value = c(a,b1,b2,b3),
treatment = c(t_a, t_a, t_b, t_b))
d$value <- d$value / (max(d$value) + 0.1)
m <- glmmTMB::glmmTMB(value ~ treatment,
data = d,
family=beta_family)
performance::r2_nakagawa(m)
#> Random effect variances not available. Returned R2 does not account for random effects.
#> # R2 for Mixed Models
#>
#> Conditional R2: NA
#> Marginal R2: 0.289
m2 <- betareg::betareg(value ~ treatment, data = d)
summary(m2)$pseudo.r.squared
#> [1] 0.2434683
The next results (including id) differ - the value seems to be too high for betareg?
m <- glmmTMB::glmmTMB(value ~ treatment + (1 | id),
data = d,
family=beta_family)
performance::r2_nakagawa(m) # 0.685
m2 <- betareg::betareg(value ~ treatment + id, data = d)
summary(m2)$pseudo.r.squared # 0.940
But... for the example in this issue, using mu * (1 - mu) / (1 + phi) seems to fit better than the current implementation in insight (which is mu * (1 - mu)).
I actually think the betareg version is on point - it helps to plot it out, and here you can really see that the ICC should be super high for the first example!
library(glmmTMB)
#> Warning in checkDepPackageVersion(dep_pkg = "TMB"): Package version inconsistency detected.
#> glmmTMB was built with TMB version 1.9.11
#> Current TMB version is 1.9.14
#> Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
library(performance)
library(ggplot2)
library(paletteer)
set.seed(123)
a <- seq(from = 5, to = 95)
b1 <- jitter(a, factor = 20)
b2 <- jitter(a, factor = 20)
b2 <- b2 + 30
b3 <- jitter(a, factor = 20)
b3 <- b3 + 30
t_a <- rep('a', length(a))
t_b <- rep('b', length(a))
c <- as.factor(a)
d <- data.frame(id = c(c,c,c,c),
value = c(a,b1,b2,b3),
treatment = c(t_a, t_a, t_b, t_b))
d$value <- d$value / (max(d$value) + 0.1)
d |>
ggplot(aes(x = forcats::fct_reorder(id, value, .desc = TRUE),
y = value)) +
geom_line(aes(group = id), alpha = 0.5) +
geom_point() +
theme(panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.x = element_blank()
) +
facet_grid(~ treatment)
Created on 2024-07-05 with reprex v2.1.0
Did some more digging. :-) It seems that for the Gaussian family, the conditional R2 of a model without random effects pretty much equals the Adjusted ICC for a random effects model. For the beta family, that's not the case.
library(glmmTMB)
#> Warning in checkDepPackageVersion(dep_pkg = "TMB"): Package version inconsistency detected.
#> glmmTMB was built with TMB version 1.9.11
#> Current TMB version is 1.9.14
#> Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
library(performance)
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.21.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#>
#> Attaching package: 'brms'
#> The following object is masked from 'package:glmmTMB':
#>
#> lognormal
#> The following object is masked from 'package:stats':
#>
#> ar
library(betareg)
set.seed(123)
a <- seq(from = 5, to = 95)
b1 <- jitter(a, factor = 20)
b2 <- jitter(a, factor = 20)
b2 <- b2 + 30
b3 <- jitter(a, factor = 20)
b3 <- b3 + 30
t_a <- rep('a', length(a))
t_b <- rep('b', length(a))
c <- as.factor(a)
d <- data.frame(id = c(c,c,c,c),
value = c(a,b1,b2,b3),
treatment = c(t_a, t_a, t_b, t_b))
d$value <- d$value / (max(d$value) + 0.1)
# GLMMs and GLMs with beta and Gaussian families
glmm_beta <- glmmTMB::glmmTMB(value ~ treatment + (1|id),
data = d,
family=beta_family)
glm_beta <- betareg(value ~ treatment + id,
data = d)
glmm_gauss <- glmmTMB::glmmTMB(value ~ treatment + (1|id),
data = d,
family = gaussian)
glm_gauss <- glm(value ~ treatment + id,
data = d,
family = gaussian)
icc(glmm_beta)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.619
#> Unadjusted ICC: 0.511
r2(glmm_beta)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.685
#> Marginal R2: 0.174
r2(glm_beta)
#> # R2 for Beta Regression
#> Pseudo R2: 0.941
icc(glmm_gauss)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.995
#> Unadjusted ICC: 0.749
r2(glmm_gauss)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.996
#> Marginal R2: 0.247
r2(glm_gauss)
#> R2: 0.997
Created on 2024-07-05 with reprex v2.1.0
A side note, might be for a different issue - it seems that variance_decomposition() resembles the Unadjusted ICC rather than the Adjusted ICC (compare the Gaussian examples above and below). It would be great to have both Adjusted and Unadjusted.
bayes_beta <- brms::brm(value ~ treatment + (1|id),
data = d,
family=Beta)
#> Compiling Stan program...
#> I just removed all the compiling for convenience....
bayes_beta_fe <- brms::brm(value ~ treatment + id,
data = d,
family = Beta)
#> Compiling Stan program...
#> I just removed all the compiling for convenience....
variance_decomposition(bayes_beta)
#> # Random Effect Variances and ICC
#>
#> Conditioned on: all random effects
#>
#> ## Variance Ratio (comparable to ICC)
#> Ratio: 0.61 CI 95%: [0.58 0.65]
#>
#> ## Variances of Posterior Predicted Distribution
#> Conditioned on fixed effects: 0.02 CI 95%: [0.02 0.03]
#> Conditioned on rand. effects: 0.06 CI 95%: [0.06 0.06]
#>
#> ## Difference in Variances
#> Difference: 0.04 CI 95%: [0.03 0.04]
r2(bayes_beta)
#> # Bayesian R2 with Compatibility Interval
#>
#> Conditional R2: 0.979 (95% CI [0.976, 0.981])
#> Marginal R2: 0.326 (95% CI [0.311, 0.341])
r2(bayes_beta_fe)
#> # Bayesian R2 with Compatibility Interval
#>
#> Conditional R2: 0.979 (95% CI [0.976, 0.981])
variance_decomposition(bayes_gauss)
#> # Random Effect Variances and ICC
#>
#> Conditioned on: all random effects
#>
#> ## Variance Ratio (comparable to ICC)
#> Ratio: 0.75 CI 95%: [0.74 0.76]
#>
#> ## Variances of Posterior Predicted Distribution
#> Conditioned on fixed effects: 0.01 CI 95%: [0.01 0.01]
#> Conditioned on rand. effects: 0.06 CI 95%: [0.06 0.06]
#>
#> ## Difference in Variances
#> Difference: 0.04 CI 95%: [0.04 0.04]
r2(bayes_gauss)
#> # Bayesian R2 with Compatibility Interval
#>
#> Conditional R2: 0.996 (95% CI [0.996, 0.996])
#> Marginal R2: 0.247 (95% CI [0.242, 0.252])
r2(bayes_gauss_fe)
#> # Bayesian R2 with Compatibility Interval
#>
#> Conditional R2: 0.996 (95% CI [0.996, 0.996])
betareg computes the R2 as the squared correlation between eta and linkfun(y) (cor(eta, linkfun(y))^2). I'm not sure this always translates perfectly to the mixed models scenario.
This is definitely a definition of R^2, closest to Efron's pseudo-R^2 (although Efron computes the correlation on the response scale before squaring it ...). It would be interesting to see where betareg got that variant from ... I haven't found it in any of my pseudo-R^2 refs (Wikipedia, UCLA, my class notes - although I'm not sure what source I used for the notes; it doesn't seem to be in the textbook I used ...).
A quick search in the related publication (https://www.jstatsoft.org/article/view/v034i02) has only one match for "R2". Maybe elsewhere it's described in more detail.
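For the uninitiated, that betareg pseudo-R² can be sketched in a few lines (Python here, purely illustrative; the toy data and helper names are mine, not betareg's): it is just the squared Pearson correlation between the linear predictor eta and the link-transformed response.

```python
import math

def logit(p):
    # link function for the mean (logit link)
    return math.log(p / (1 - p))

def pearson_r(x, y):
    # plain Pearson correlation coefficient
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pseudo_r2(eta, y):
    # cor(eta, linkfun(y))^2, the quantity betareg reports
    return pearson_r(eta, [logit(v) for v in y]) ** 2

# toy example: eta is logit(y) plus a little noise, so the R^2 is high
y = [0.10, 0.20, 0.35, 0.50, 0.65, 0.80]
noise = [0.05, -0.03, 0.02, 0.00, -0.04, 0.01]
eta = [logit(v) + e for v, e in zip(y, noise)]
assert 0.95 < pseudo_r2(eta, y) <= 1.0
```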
May I also point to this comment: https://github.com/easystats/performance/issues/742#issuecomment-2210936164
What's your opinion on this?
But what would be "the linear predictor for the mean"? Maybe that would help improve insight::get_variance().
Edit: found this
@strengejacke For the uninitiated, could you just write a line or two about why we talk about R2 almost exclusively when the issue is on ICC - I imagine there's a mathematical connection is implicit, but it's not entirely clear. :-) Also, feel free to change the issue title to reflect that it's an issue that affects both ICC and R2 for the beta family if they're two parts of the same puzzle. 😊
A bit out of order in the thread, but responding to the comment above about the mismatch between the variance function (mu * (1 - mu)) and the conditional variance of the distribution (mu * (1 - mu) / (1 + phi))? glmmTMB::beta_family says:

variance: a function of either 1 (mean) or 2 (mean and dispersion parameter) arguments giving a value proportional to the predicted variance (scaled by ‘sigma(.)’)

This is for consistency with the way that $variance is implemented/interpreted in stats::glm(). I think I've raised this point before ...
Thanks for clarification!
I think I've raised this point before ...
Yes, definitely, it's just I couldn't find the thread/post. :-/
@strengejacke For the uninitiated, could you just write a line or two about why we talk about R2 almost exclusively when the issue is on ICC - I imagine there's a mathematical connection is implicit, but it's not entirely clear. :-) Also, feel free to change the issue title to reflect that it's an issue that affects both ICC and R2 for the beta family if they're two parts of the same puzzle. 😊
Sure. The ICC is relevant for mixed models only. It is calculated by dividing the random effect variance, σ²_i, by the total variance, i.e. the sum of the random effect variance and the residual variance, σ²_ε:

ICC = σ²_i / (σ²_i + σ²_ε)

The R2 for mixed models is calculated from the same components (with σ²_f for the fixed effects variance; following Nakagawa et al.):

Marginal R2 = σ²_f / (σ²_f + σ²_i + σ²_ε)
Conditional R2 = (σ²_f + σ²_i) / (σ²_f + σ²_i + σ²_ε)

The main point is that for both ICC and R2 in mixed models, we need the different variance components. That's why both are related.
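The shared plumbing can be sketched numerically (a Python sketch with made-up variance components; the function names are mine, not insight's):

```python
# Illustrative only: adjusted/unadjusted ICC and the marginal/conditional R^2
# for mixed models are all ratios of the same variance components
# (fixed, random, residual).
def icc_adjusted(var_random, var_residual):
    return var_random / (var_random + var_residual)

def icc_unadjusted(var_fixed, var_random, var_residual):
    return var_random / (var_fixed + var_random + var_residual)

def r2_mixed(var_fixed, var_random, var_residual):
    total = var_fixed + var_random + var_residual
    return {"marginal": var_fixed / total,
            "conditional": (var_fixed + var_random) / total}

# hypothetical components, e.g. as returned by insight::get_variance()
vf, vr, ve = 0.30, 0.45, 0.25
print(icc_adjusted(vr, ve))   # 0.45 / 0.70
print(r2_mixed(vf, vr, ve))   # marginal 0.30, conditional 0.75
```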
Thanks for the clarification, that helps.
Just tested out the latest changes - Ferrari's R2 looks more in line with what I'd expect. Will this eventually change the ICC too? A few notes on consistency as of the latest commit:

- Currently the default R2 for a glmmTMB and glm with beta_family is not Ferrari (IDK what is default for the glmm, and it's Nagelkerke's R2 for the glm - which gives NaN)
- The Ferrari R2 is slightly different for a glmmTMB with beta_family compared to betareg
- Bootstrapped ICC and R2 for glmmTMB with beta_family is wrong
- Bootstrapping for the betareg R2 does nothing
- Bootstrapping for a glm with beta_family fails

library(glmmTMB)
#> Warning in checkDepPackageVersion(dep_pkg = "TMB"): Package version inconsistency detected.
#> glmmTMB was built with TMB version 1.9.11
#> Current TMB version is 1.9.14
#> Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
library(performance)
library(betareg)
set.seed(123)
a <- seq(from = 5, to = 95)
b1 <- jitter(a, factor = 20)
b2 <- jitter(a, factor = 20)
b2 <- b2 + 30
b3 <- jitter(a, factor = 20)
b3 <- b3 + 30
t_a <- rep('a', length(a))
t_b <- rep('b', length(a))
c <- as.factor(a)
d <- data.frame(id = c(c,c,c,c),
value = c(a,b1,b2,b3),
treatment = c(t_a, t_a, t_b, t_b))
d$value <- d$value / (max(d$value) + 0.1)
# Specify models
glmm_beta <- glmmTMB::glmmTMB(value ~ treatment + (1|id),
data = d,
family=beta_family)
glm_betareg <- betareg(value ~ treatment + id,
data = d)
glm_betafamily <- glm(value ~ treatment + id,
data = d,
family = beta_family)
glmm_gauss <- glmmTMB::glmmTMB(value ~ treatment + (1|id),
data = d,
family = gaussian)
glm_gauss <- glm(value ~ treatment + id,
data = d,
family = gaussian)
# ICC
icc(glmm_beta)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.619
#> Unadjusted ICC: 0.511
icc(glmm_gauss)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.995
#> Unadjusted ICC: 0.749
# ICC Bootstrapped
icc(glmm_beta, method = "boot", ci = 0.95)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.619 [1.000, 1.000]
#> Unadjusted ICC: 0.511 [0.660, 0.785]
icc(glmm_gauss, method = "boot", ci = 0.95)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.995 [0.993, 0.996]
#> Unadjusted ICC: 0.749 [0.693, 0.795]
# R2 - default
r2(glmm_beta)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.685
#> Marginal R2: 0.174
r2(glm_betareg)
#> # R2 for Beta Regression
#> Pseudo R2: 0.941
r2(glm_betafamily)
#> # R2 for Generalized Linear Regression
#> Nagelkerke's R2: NaN
r2(glmm_gauss)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.996
#> Marginal R2: 0.247
r2(glm_gauss)
#> R2: 0.997
# R2 - Ferrari
r2_ferrari(glmm_beta)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.941
r2_ferrari(glm_betareg)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.941
r2_ferrari(glm_betafamily)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.929
# R2 Bootstrapped
r2(glmm_beta, method = "boot", ci = 0.95)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.685 [1.000, 1.000]
#> Marginal R2: 0.174 [0.214, 0.326]
r2(glm_betareg, method = "boot", ci = 0.95)
#> # R2 for Beta Regression
#> Pseudo R2: 0.941
r2(glm_betafamily, method = "boot", ci = 0.95)
#> Error in stats::integrate(.dRsq, i, j, R2_pop = dots$R2_pop, R2_obs = dots$R2_obs, : non-finite function value
r2(glmm_gauss, method = "boot", ci = 0.95)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.996 [0.995, 0.997]
#> Marginal R2: 0.247 [0.198, 0.302]
r2(glm_gauss, method = "boot", ci = 0.95)
#> R2: 0.997 [0.995, 0.997]
Created on 2024-07-08 with reprex v2.1.0
and the conditional variance of the distribution (mu * (1 - mu) / (1 + phi))

@bbolker: That means, in insight::get_variance(), the conditional variance of the distribution, mu * (1 - mu) / (1 + phi), should be used?
Yes, the conditional variance should definitely be divided by (1 + phi).
library(glmmTMB)
set.seed(101)
dd <- data.frame(x=rnorm(100))
## (I was originally planning to use an x covariate, now ignored)
dd$y <- simulate_new(~ 1, family = beta_family, newdata = dd,
newparams = list(beta = c(1), betadisp = 2))[[1]]
m <- glmmTMB(y~1, family = beta_family, data = dd)
mu <- predict(m, type = "response")
var1 <- family(m)$var(mu)/(1+sigma(m)) ## 0.0187
var(dd$y) ## 0.0193
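The same sanity check can be mirrored outside R (a Python sketch, illustrative only; note that random.betavariate takes the two shape parameters, so a = μφ and b = (1 − μ)φ):

```python
import random

# Monte Carlo check that the conditional variance of the beta family is
# mu * (1 - mu) / (1 + phi), not mu * (1 - mu).
def empirical_vs_theoretical(mu, phi, n=200_000, seed=101):
    rng = random.Random(seed)
    a, b = mu * phi, (1 - mu) * phi
    draws = [rng.betavariate(a, b) for _ in range(n)]
    m = sum(draws) / n
    emp = sum((x - m) ** 2 for x in draws) / (n - 1)
    return emp, mu * (1 - mu) / (1 + phi)

emp, theo = empirical_vs_theoretical(0.7, 7.0)
# emp and theo agree closely (~0.026); mu * (1 - mu) alone would be 0.21,
# off by roughly the factor (1 + phi) = 8
assert abs(emp - theo) < 5e-4
```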
What about the ordered beta (ordbeta) family? mu * (1 - mu), or also dividing by (1 + phi)?
Currently the default R2 for a glmmTMB and glm with beta_family is not Ferrari (IDK what is default for glmm and it's Nagelkerke's R2 for glm - which gives NaN)

It's now implemented for glm (see https://github.com/easystats/insight/commit/887bb59acfd3521131db418ea508838b600b340d). For glmmTMB, r2_ferrari() is only used for non-mixed models. Mixed models always use r2_nakagawa() by default when r2() is called.
The Ferrari R2 is slightly different for a glmmTMB and beta_family compared to betareg
Only if you fit a mixed model for glmmTMB. For non-mixed models, results are identical.
Bootstrapped ICC and R2 for glmmTMB with beta_family is wrong
Yeah, not sure about this one. Will look into it.
Bootstrapping for betareg R2 does nothing. Bootstrapping for glm with beta_family fails
Bootstrapping is not implemented for R2 for non-mixed models.
You should be able to update insight, which now finally is supposed to return the correct ICC/R2 for mixed models from the beta-family.
The Ferrari R2 is slightly different for a glmmTMB and beta_family compared to betareg
Only if you fit a mixed model for glmmTMB. For non-mixed models, results are identical.
Actually the other way around - the mixed model matches the betareg model, the GLM is different. See the above (quoted below).
# R2 - Ferrari
r2_ferrari(glmm_beta)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.941
r2_ferrari(glm_betareg)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.941
r2_ferrari(glm_betafamily)
#> # R2 for Generalized Linear Regression
#> Ferrari's R2: 0.929
You should be able to update insight, which now finally is supposed to return the correct ICC/R2 for mixed models from the beta-family.
Works really well now!!! 🥳🥳🥳
library(glmmTMB)
set.seed(101)
dd <- data.frame(x = rnorm(100))
## (I was originally planning to use an x covariate, now ignored)
dd$y <- simulate_new(~ 1, family = beta_family, newdata = dd,
                     newparams = list(beta = c(1), betadisp = 2))[[1]]
m <- glmmTMB(y ~ 1, family = beta_family, data = dd)
mu <- predict(m, type = "response")
var1 <- family(m)$var(mu) / (1 + sigma(m)) ## 0.0187
var(dd$y) ## 0.0193
Two questions: does this work for you? I get:
library(glmmTMB)
set.seed(101)
dd <- data.frame(x=rnorm(100))
## (I was originally planning to use an x covariate, now ignored)
dd$y <- simulate_new(~ 1, family = beta_family, newdata = dd,
newparams = list(beta = c(1), betadisp = 2))[[1]]
#> Error in eval(family$initialize): y values must be 0 < y < 1
Second is https://github.com/easystats/performance/issues/742#issuecomment-2214719960.
For the confidence intervals, it seems that none of the ci_methods work (neither the default, boot, nor analytical).
icc(glmm_beta, ci = 0.95)
icc(glmm_beta, ci = 0.95, method = "boot")
icc(glmm_beta, ci = 0.95, method = "analytical")
(Installed both {performance} and {insight} from the latest commits, https://github.com/easystats/performance/commit/06ed6f11f54c6fd35f952c30885b453f6cc89843 and https://github.com/easystats/insight/commit/d346f9527c3b3eb45283118f44b902806b9b7cd7)
1. I think this should work with the devel version - the simulation bug is fixed in "Simulate new init fix" (glmmTMB/glmmTMB#1008).
2. Hmm. ordbeta uses the same $variance and definition of sigma() as beta (I just added a note to ?sigma.glmmTMB): see https://github.com/glmmTMB/glmmTMB/blob/master/glmmTMB/src/glmmTMB.cpp#L733-L737, but I think the actual conditional variances will not be the same, because the parameterization is relative to the underlying/baseline beta distribution, not counting the modifications created by moving to the ordered-beta specification. You might be able to work this out by hand (by analogy with the variances of ZI models, hurdle models, etc.), or maybe look in Kubinec's paper / ask him if he knows the answer?
Thanks! Will open a new issue and close this as resolved for now.
In case anyone tries the ICC for the beta family and thinks the values are weird - I think so too; they are quite high, even when you draw two completely unrelated samples from a beta distribution (like 0.4-0.5). I would love to test it thoroughly, but don't have the time currently. But I posted an example that could be used to dig in here.
The ICC generated from icc() for a glmmTMB model with a beta_family() family seems suspiciously low now, compared to either the variance decomposition of a similar Bayesian model, or the same model specified with a gaussian() family (the Bayesian and Gaussian seem quite in agreement).

So this is kind of a continuation of https://github.com/easystats/insight/issues/664, but I decided to create a new issue rather than re-open, as there seems to be an extra issue with the bootstrapping in icc(). Maybe the new {insight} developments have not made their way into the bootstrapping, IDK. Currently, the estimate is very low, but the CI is quite high, so the estimate falls well outside the CI.

The reprex is run on the latest commit into main in {performance}.
Here's a reprex:
Created on 2024-07-05 with reprex v2.1.0