vincentarelbundock / marginaleffects

R package to compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and ML models. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference.
https://marginaleffects.com

`vcov = "satterthwaite" / "kenward-roger"` fails when aggregating #1252

Closed mattansb closed 2 weeks ago

mattansb commented 2 weeks ago
```r
library(marginaleffects)
library(lmerTest)

data("sleepstudy", package="lme4")
sleepstudy$G <- as.numeric(sleepstudy$Subject) %% 2

m <- lmer(Reaction ~ G + Days + (Days | Subject), sleepstudy)

slopes(m, variables = "Days",
       vcov = "satterthwaite")
#> 
#>  Estimate Std. Error     t Pr(>|t|)    S 2.5 % 97.5 %   Df
#>      19.8       1.55 12.78   <0.001 30.4 16.48   23.0 16.2
#>      19.8       1.55 12.78   <0.001 30.2 16.48   23.0 16.0
#>      19.8       1.55 12.78   <0.001 30.9 16.49   23.0 16.7
#>      19.8       1.55 12.78   <0.001 32.2 16.51   23.0 17.9
#>      19.8       1.55 12.78   <0.001 33.7 16.52   23.0 19.3
#> --- 170 rows omitted. See ?avg_slopes and ?print.marginaleffects --- 
#>      11.6       1.55  7.53   <0.001 22.0  8.43   14.9 20.5
#>      11.6       1.55  7.54   <0.001 22.3  8.44   14.9 21.4
#>      11.6       1.55  7.53   <0.001 22.5  8.44   14.9 21.7
#>      11.6       1.55  7.54   <0.001 22.5  8.44   14.9 21.7
#>      11.6       1.55  7.54   <0.001 22.4  8.44   14.9 21.5
#> Term: Days
#> Type:  response 
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, Reaction, G, Days, Subject, df

slopes(m, variables = "Days", by = "G",
       vcov = "satterthwaite")
#> Error: Satterthwaite and Kenward-Roger corrections are not supported in this
#>   command.

avg_slopes(m, variables = "Days",
           vcov = "satterthwaite")
#> Error: Satterthwaite and Kenward-Roger corrections are not supported in this
#>   command.
```

Created on 2024-10-27 with reprex v2.1.1

The same is true for `predictions()` and `comparisons()`.

mattansb commented 2 weeks ago

Okay, I see this is because these DFs are difficult (impossible?) to compute.

However, I think having this result in an error is harsh (especially considering this error might occur after long computation times). I think a warning + setting df=Inf would be better here.
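Until something like this exists in the package, the "warning + `df = Inf`" behavior can be approximated on the user side. The following is only a minimal sketch: `with_df_fallback()` is a hypothetical helper, not part of marginaleffects, and it assumes the wrapped function accepts `vcov` and `df` arguments and that the error text matches the message shown in the reprex above.

```r
# Hypothetical user-side helper (not part of marginaleffects).
# Tries the call with the requested small-sample correction; if that fails
# with the "not supported" error, warns and retries with df = Inf.
with_df_fallback <- function(fun, ..., vcov = "satterthwaite") {
  tryCatch(
    fun(..., vcov = vcov),
    error = function(e) {
      msg <- conditionMessage(e)
      # Assumption: the unsupported-correction error mentions "Satterthwaite".
      if (!grepl("Satterthwaite", msg, fixed = TRUE)) {
        stop(e)  # unrelated error: re-signal it unchanged
      }
      warning(
        "vcov = \"", vcov, "\" is not available here; ",
        "falling back to the default variance with df = Inf.",
        call. = FALSE
      )
      fun(..., df = Inf)
    }
  )
}
```

With the model from the reprex, this would be called as `with_df_fallback(avg_slopes, m, variables = "Days")`. The catch is that the long computation still runs twice when the first attempt fails late.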

vincentarelbundock commented 2 weeks ago

Yeah, right, I couldn't find a way to do this for aggregated values.

Error vs. Warning is always a judgment call, and I'm never 100% sure what to do. But in this case, it feels like an error, because the user is explicitly requesting something, and we are giving them something different.

Consider a different case: the warning on aggregation for frequentist mixed effects models. There, the user doesn't request anything specific for uncertainty, and we issue a warning about `re.form`. This seems appropriate because the user didn't say anything, but what we supply is probably not what they want.

Here, marginaleffects knows for sure that what we supply is not what the user wants, so it's an error.

Anyway, that's the kind of rule I have in mind...
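Spelled out as code, the rule might look roughly like the sketch below. This is purely illustrative: `signal_unavailable()` and its `explicit` argument are hypothetical names, not how marginaleffects actually implements its checks.

```r
# Hypothetical sketch of the rule; not marginaleffects' actual internals.
# `explicit` says whether the user asked for the unavailable feature by name.
signal_unavailable <- function(explicit, msg) {
  if (isTRUE(explicit)) {
    # The user asked for X and we cannot deliver X: stop.
    stop(msg, call. = FALSE)
  }
  # The user expressed no preference: proceed, but flag the default.
  warning(msg, call. = FALSE)
  invisible(NULL)
}
```

Under this sketch, `vcov = "satterthwaite"` with aggregation would go through the `explicit = TRUE` branch, while the `re.form` case would go through the warning branch.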

mattansb commented 2 weeks ago

Personally, I would prefer a result close to what I asked for than none at all (AFAIK the functions don't fail when the requested quantities are NAs, which is much more "severe" than wrong dfs).

But it's your call (:

vincentarelbundock commented 2 weeks ago

Hmm, I don't think it's as severe. Nobody can publish a table with just NAs in it! But they could publish incorrect numerical results without noticing the error.