[BUGZILLA #16975] lme fixed sigma - inconsistent REML estimation

MichaelChirico commented 4 years ago

I would like to estimate Fay-Herriot class models in nlme (small area models). Basically, these models have fixed random error which is assumed to be known from sample survey (sampling error). Hence, the model I would like to specify should have sigma = 1 (it is not estimated).

I have checked new version nlme package (3.1-128) however results are different from those from sae package and proc mixed when it comes to fitting a small area model (in particular a Fay-Herriot area model).

I am not sure why these results differ. They should be the same because sae::eblupFH fits s mixed model with one random effect and fixed residual variance).

Moreover, prof. Viechtbauer Wolfgang also shown that the problem is probably related to REML estimation because ML estimates are the same as in metafor package (the last part).

############################## ## preparation ##############################

library(sae) library(nlme) library(metafor) data(milk) milk$var <- milk$SD^2

############################## ### nlme results ##############################

m2 <- lme(fixed = yi ~ as.factor(MajorArea),

       random =&sim; 1 | SmallArea,
       control = lmeControl(sigma = 1, 
                            apVar = T),
       weights = varFixed&sim;var),
       data = milk)

## variance (not the same as in sae and proc mixed) 0.1332918^2 = 0.0177667

summary(m2)

Linear mixed-effects model fit by REML Data: milk AIC BIC logLik -10.69175 -2.373943 10.34588

Random effects: Formula:∼1 | SmallArea (Intercept) Residual StdDev: 0.1332918 1

Variance function: Structure: fixed weights Formula:∼var Fixed effects: yi∼ as.factor(MajorArea) Value Std.Error DF t-value p-value (Intercept) 0.9680768 0.06849017 39 14.134537 0.0000 as.factor(MajorArea)2 0.1316132 0.10183884 39 1.292367 0.2038 as.factor(MajorArea)3 0.2269008 0.09126952 39 2.486053 0.0173 as.factor(MajorArea)4 -0.2415905 0.08058755 39 -2.997863 0.0047

############################## ### results based on sae package ##############################

library(sae) va <- eblupFH(formula = yi∼ as.factor(MajorArea), vardir = var, data = milk, method = "REML")

va$fit

$method [1] "REML"

$convergence [1] TRUE

$iterations [1] 4

$estcoef beta std.error tvalue pvalue (Intercept) 0.9681890 0.06936208 13.958476 2.793443e-44 as.factor(MajorArea)2 0.1327801 0.10300072 1.289119 1.973569e-01 as.factor(MajorArea)3 0.2269462 0.09232981 2.457995 1.397151e-02 as.factor(MajorArea)4 -0.2413011 0.08161707 -2.956503 3.111496e-03

$refvar [1] 0.01855022

$goodness loglike AIC BIC KIC AICc AICb1 AICb2 KICc 12.677478 -15.354956 -6.548956 -10.354956 NA NA NA NA KICb1 KICb2 nBootstrap NA NA 0.000000

############################## ### Proc mixed results are consistent with sae ##############################

proc SmallArea data= milk order=data method=reml; class SmallArea; weight var; model y= MajorArea2 MajorArea3 MajorArea4 / cl solution outp=predicted; random SmallArea; parms (1) (1) / hold=2; run;

## Covariance Parameter Estimates Cov Parm Estimate SmallArea 0.01855 Residual 1.0000

### fixed effects Intercept 0.9682 majorarea2 0.1328 majorarea3 0.2269 majorarea3 -0.2413

############################## ### metafor package REML vs ML ##############################

milk$var <- milk$SD^2 res <- rma(yi∼ as.factor(MajorArea), var, data=milk) res res$tau2

Yields:

Mixed-Effects Model (k = 43; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity): 0.0185 (SE = 0.0079) tau (square root of estimated tau^2 value): 0.1362 I^2 (residual heterogeneity / unaccounted variability): 55.21% H^2 (unaccounted variability / sampling variability): 2.23 R^2 (amount of heterogeneity accounted for): 65.85%

Test for Residual Heterogeneity: QE(df = 39) = 86.1840, p-val < .0001

Test of Moderators (coefficient(s) 2,3,4): QM(df = 3) = 46.5699, p-val < .0001

Model Results:

                  estimate      se     zval    pval    ci.lb    ci.ub

intrcpt 0.9682 0.0694 13.9585 <.0001 0.8322 1.1041 as.factor(MajorArea)2 0.1328 0.1030 1.2891 0.1974 -0.0691 0.3347
as.factor(MajorArea)3 0.2269 0.0923 2.4580 0.0140 0.0460 0.4079 as.factor(MajorArea)4 -0.2413 0.0816 -2.9565 0.0031 -0.4013 -0.0813

--- Signif. codes: 0 '' 0.001 '' 0.01 '' 0.05 '.' 0.1 ' ' 1

res$tau2 [1] 0.01854996

Same as in 'sae' (rounded to 6 decimals) and SAS.

Not a huge difference to lme(), but larger than one would expect due to numerical differences.

If you switch to method="ML" (for both lme() and rma()), then you get 0.1245693^2 = 0.01551751 for lme() and 0.0155174 for rma(), so that's basically the same.

—————— Maciej Beręsewicz

METADATA

Bug author - Maciej
Creation time - 2016-06-29 05:06:23 UTC
Bugzilla link
Status - ASSIGNED
Alias - None
Component - Models
Version - R 3.3.*
Hardware - x86_64/x64/amd64 (64-bit) OS X El Capitan
Importance - P5 major
Assignee - R-core
URL -
Modification time - 2017-03-07 10:10 UTC

MichaelChirico commented 4 years ago

Please find below code provided by prof. from S-PLUS version. Results are the same as in sae, metafor and proc mixed.

Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 6.02487 14.34268 1.987565

Random effects: Formula: ∼ 1 | SmallArea (Intercept) Residual StdDev: 0.1361994 1

Variance function: Structure: fixed weights Formula: ∼ var Fixed effects: yi∼ as.factor(MajorArea) Value Std.Error DF t-value p-value (Intercept) 0.9977953 0.03179340 39 31.38373 <.0001 as.factor(MajorArea)1 0.0663901 0.05150041 39 1.28912 0.2050 as.factor(MajorArea)2 0.0535187 0.02659572 39 2.01230 0.0511 as.factor(MajorArea)3 -0.0903025 0.01466646 39 -6.15707 <.0001 Correlation: (Intr) a(MA)1 a(MA)2 as.factor(MajorArea)1 0.075 as.factor(MajorArea)2 -0.157 -0.060 as.factor(MajorArea)3 -0.392 -0.054 0.113

Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.702152 -0.2304439 0.2200099 0.3981538 1.22415

Number of Observations: 43 Number of Groups: 43

0.1361994^2

[1] 0.01855028

METADATA

Comment author - Maciej
Timestamp - 2016-06-29 11:47:23 UTC

MichaelChirico commented 4 years ago

Bug acknowledged.

Waiting for patches to the source (-> https://svn.r-project.org/R-packages/trunk/nlme/R/ notably VarCorr.R, maybe lme.R and possibly in more source files. I'm pretty sure you need another check along if(attr(*, "fixedSigma")) {...... } else

One problem I see: When we added the "fixedSigma" code, quite a few checks were also added, and these were said to give results compatible with S+ ("S-PLUS") according to people who have had access to S+.
I would expact that the VarCorr part of these results would also change if fixing this bug leads to (slightly) different variance estimates.

These are the tests I mean: https://svn.r-project.org/R-packages/trunk/nlme/tests/sigma-fixed-etc.R

I / we would be very grateful to get the result of running these in S+ nlme, i.e. for the resulting file of something like Splus BATCH sigma-fixed-etc.R

METADATA

Comment author - Martin Maechler
Timestamp - 2017-03-07 10:10:57 UTC

MichaelChirico / r-bugs