step halving failed on a contraception example

stevencarlislewalker commented 11 years ago

Here's the example,

> library(lme4pureR)
> glmerReproduce <- FALSE
> tol <- 1e-3
> form <- use ~ age + I(age^2) + ch + urban + (1|district)
> data(Contraception, package = 'mlmRev')
> Contraception <- within(Contraception, ch <- factor(as.numeric(as.integer(livch)>1L)))
> glm0 <- glm(lme4:::nobars(form),binomial,Contraception)
> ll <- plsform(form, data = Contraception, family = binomial)
> devf <- do.call(pirls, c(ll, list(family=binomial,eta=glm0$linear.predictor, tol=1e-6)))
> opt <- minqa:::bobyqa(c(ll$theta, coef(glm0)), devf)
Error in fn(x, ...) : Step-halving failed

If we lower the tolerance it gets close to the glmer answer,

> devf <- do.call(pirls, c(ll, list(family=binomial,eta=glm0$linear.predictor, tol=1e-3)))
> opt <- minqa:::bobyqa(c(ll$theta, coef(glm0)), devf)
> opt
parameter estimates: 0.479434041529781, -0.982787192490493, 0.00630218853107833, -0.00463583246031576, 0.859308838506257, 0.690783737758705 
objective: 2372.6399833919 
number of function evaluations: 3607 
> library(lme4)
Loading required package: lattice
Loading required package: Matrix
> glmer(form, data = Contraception, family = binomial)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
 Family: binomial ( logit )
Formula: use ~ age + I(age^2) + ch + urban + (1 | district) 
   Data: Contraception 

      AIC       BIC    logLik  deviance 
 2385.186  2418.590 -1186.593  2373.186 

Random effects:
 Groups   Name        Variance Std.Dev.
 district (Intercept) 0.2247   0.474   
Number of obs: 1934, groups: district, 60

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -1.0063737  0.1678918  -5.994 2.05e-09 ***
age          0.0062561  0.0078403   0.798    0.425    
I(age^2)    -0.0046353  0.0007163  -6.471 9.74e-11 ***
ch1          0.8603821  0.1473528   5.839 5.25e-09 ***
urbanY       0.6929220  0.1196676   5.790 7.02e-09 ***

stevencarlislewalker commented 11 years ago

With lower tolerance, convergence is very slow (~10-15s on my Macbook).

dmbates commented 11 years ago

The problem is the first step chosen for the fixed-effects coefficients. The magnitudes of the two coefficients for the age variable are so small that a step of, say, 0.1 sends the algorithm to never-never land. The solution is to use a more reasonable initial step.

stevencarlislewalker commented 11 years ago

Interesting. Why does glmer work fine? Is it choosing a more reasonable initial step somehow?

dmbates commented 11 years ago

This is a case where the initial iterations with nAGQ=0 are helpful in obtaining a refinement of the fixed-effects coefficients and in giving reasonable standard errors to determine the step size for the optimizer when using nAGQ>0.

stevencarlislewalker commented 11 years ago

Of course. Thanks Doug. But maybe we only need one or two nAGQ=0 iterations.

stevencarlislewalker commented 11 years ago

I'm a bit confused by this behaviour,

gmod <- glFormula(form, data = Contraception, family = binomial, nAGQ = 1)
devf <- do.call(mkGlmerDevfun, gmod)
# skip initial nAGQ=0 optimization
devf <- updateGlmerDevfun(devf, gmod$reTrms)
optimizeGlmer(devf, stage=2)$par
[1]  0.474010082 -1.006445615  0.006255540 -0.004635385  0.860439478
[6]  0.692959336

Which works fine, despite skipping the initial nAGQ=0 step. I could be wrong about the fact that the initial step is completely skipped. Perhaps the resolution is that in setting up devf, we get one round of nAGQ=0, which smooths everything over. I look into it.

stevencarlislewalker commented 11 years ago

More to the point, this also fails,

glmer0 <- glmer(form, data = Contraception, family = binomial, nAGQ = 0)
ll <- plsform(form, data = Contraception, family = binomial)
devf <- do.call(pirls, c(ll, list(family=binomial,eta=qlogis(getME(glmer0, 'mu')), tol=1e-6)))
opt <- minqa:::bobyqa(c(glmer0@theta,glmer0@beta), devf)
Error in fn(x, ...) : Step-halving failed

Which is surprising because I've started the optimizer at the nAGQ=0 optimum. Which seems to suggest that there might actually be something wrong with step-halving itself.

dmbates commented 11 years ago

Failure of step-halving is not the problem, it's the symptom. Switch from tol=1e-6 to verbose=2L and see where the algorithm is trying to evaluate the Laplace approximation. It is way out in the boondocks and the nature of the link function means that small changes in value of u do not change the penalized weighted residual sum-of-squares.

lme4 / lme4pureR

step halving failed on a contraception example #6