JuliaStats / MixedModels.jl

A Julia package for fitting (statistical) mixed-effects models
http://juliastats.org/MixedModels.jl/stable
MIT License
407 stars 48 forks source link

Satterthwaite Degrees of Freedom Approximation #305

Open Nosferican opened 4 years ago

Nosferican commented 4 years ago

Satterthwaite Degrees of Freedom Approximation.

Nosferican commented 4 years ago

Related. The containment degrees of freedom which is the default on SAS for

proc mixed data=data;
class sequence subject period formula;
model ln_auc= sequence period formula/outp=out;
random subject(sequence);

Is computed by

n, rank_fe, rank_re, _ = size(model)
ddf = rank(hcat(modelmatrix(model), model.reterms[1])) - rank_fe
TarandeepKang commented 3 years ago

It may also be advantageous to include the Kenward-Roger approximation 4° of freedom. Both of these can be calculated using the LmerTest Package in R.

palday commented 3 years ago

The problem with Kenward-Roger is that it calls for the inverse of an n_obs x n_obs matrix. There is a literal matrix inverse in pbkrtest which is what lmerTest uses to compute the KR degrees of freedom. This is horrible from a computational perspective and also why the examples in the vignettes for those packages use small datasets. For many datasets, the (approximated) degrees of freedom are often much larger than 30, so the z for t (i.e. "infinite" degrees of freedom) approximation we currently use is very close -- essentially indistinguishable.There are other problems with the KR approach, but @dmbates and I have enough skepticism that we probably won't ever implement it ourselves. (That said, I'm currently working on a MixedModelsExtras package where I would be willing to accept a KR implementation as a pull request, but would be unwilling to write it myself.)

Instead, I would strongly look at using bootstrapped confidence intervals. Confidence intervals are generally more informative than p-values and they sidestep the degrees of freedom issue. Moreover, CIs can still be used for significance testing. The bootstrap implementation in MixedModels is good enough that it should in most cases be faster than doing the KR approximation in R.

(Sorry for being such a downer -- we do appreciate feedback on what features the users want! It's just that the KR approach is computational expensive for essentially zero gain for datasets with more than a few dozen observations,)

TarandeepKang commented 3 years ago

No worries, I see that it might not of been a particularly good idea! Still, thanks for being kind enough to give me a detailed answer! If I might ask, what kind of things will your in progress package include? I currently use are for most of my modelling, but I'm looking forward to trying to get stuck in with Julia in the future!