jepusto / lmeInfo

Information matrices for fitted nlme::lme and nlme::gls models
https://jepusto.github.io/lmeInfo/

Using `lmeInfo` for models without intercepts #55

Open b-steve opened 3 months ago

b-steve commented 3 months ago

Thanks for the great package!

In the document "Information matrices for fitted lme() and gls() models", it says "The cross-derivatives involving β ... have expectation 0 and thus might be ignored".

Models can behave slightly strangely if they don't have an intercept. For example, it's well known that the residuals from a linear model sum to zero, but that only holds because of the intercept; it fails in general for a model fitted without one.
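For what it's worth, this is easy to check in a couple of lines of base R (a minimal sketch using lm(); the simulated data and object names are just for illustration):

```r
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)

# With an intercept, the residuals are orthogonal to the column of ones,
# so they sum to zero (up to floating-point error)
fit_int <- lm(y ~ 1 + x)
sum(residuals(fit_int))

# Without an intercept there is no such constraint, and the sum is
# generally non-zero
fit_noint <- lm(y ~ 0 + x)
sum(residuals(fit_noint))
```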

Based on some simulations, I suspect the statement about cross-derivatives involving β having expectation 0 is not true for models without intercepts (i.e., those fitted with lme(fixed = y ~ 0 + x, ...)). Does that sound correct? If so, to what extent is output from lmeInfo correct for models without intercepts?

If lmeInfo shouldn't be used with models that have been fitted without intercepts, I'd suggest adding a warning/error when functions are called.

jepusto commented 3 months ago

Thanks for the comment. I can see that the claim would not necessarily hold for a model like y ~ 0 + x, where x is a continuous predictor. But can you offer any examples of when such models would ever be used?

I suspect there's a bit of nuance to determining if/when the cross-derivatives would be non-null. For instance, one common use-case for dropping an intercept is when you've got a categorical factor and want to have separate intercepts for each category, as in y ~ 0 + fac + x. The cross-derivatives shouldn't be a problem here because the model is just a reparameterization of y ~ 1 + fac + x.
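That reparameterization equivalence is easy to verify with plain lm() (a minimal sketch; the simulated data and variable names are just for illustration — the same logic applies to the fixed-effects design matrix in lme()):

```r
set.seed(1)
fac <- factor(rep(c("a", "b", "c"), each = 20))
x <- rnorm(60)
y <- as.numeric(fac) + 0.5 * x + rnorm(60)

# Cell-means coding: one intercept per level of fac
fit0 <- lm(y ~ 0 + fac + x)

# Reference-level coding: overall intercept plus contrasts
fit1 <- lm(y ~ 1 + fac + x)

# Different coefficients, but the two design matrices span the same
# column space, so the fits are identical
all.equal(fitted(fit0), fitted(fit1))                       # TRUE
all.equal(as.numeric(logLik(fit0)), as.numeric(logLik(fit1)))  # TRUE
```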

b-steve commented 3 months ago

Hmm, you've just described my exact scenario of fitting separate intercepts to each category. I diagnosed the missing intercept as the issue because simulations revealed covariance between the estimators of β and the covariance-structure parameters, which vanished when I reparameterised via y ~ 1 + fac.

It's possible the issue arises from other features of the model, since the same factor appears elsewhere:

fit <- lme(fixed = y ~ 0 + fac1,
           random = ~ 0 + fac1 | fac2,
           correlation = corSymm(form = ~ 1 | fac2 / fac3),
           weights = varIdent(form = ~ 1 | fac1))

So, for example, I wonder if the issue is caused by having no intercept in both the fixed and random arguments, which involve the same factor. Or maybe it's because the error variance is heterogeneous across the groups of that same factor.

I don't have good intuition for the theory. I am happy to puzzle over this a bit more to see if I can figure it out.

(This model might look a bit strange without further explanation, in which case you'll just have to trust that I am doing something sensible!)