Profile failures on base vs devfun2 match test

bbolker commented 11 years ago

(Previously: "Lambdax size mismatch" when profiling on flexLambda branch)

This may already be known, but:

library(lme4)
fm1 <- lmer(R> fm1 <- lmer(Reaction~Days+(Days|Subject),sleepstudy)
pp <- profile(fm1)

I get

Error: Lambdax size mismatch

bbolker commented 11 years ago

I partly fixed this, but there is a small numerical glitch that probably deserves to be sorted out rather than papered over (most likely due to some failure to reset values in the environment properly): this is the effect of debugging through profile(fm1)

stopifnot(all.equal(unname(dd(opt[seq(npar1)])), base, tol = 1e-05))
Browse[2]> base
[1] 1751.939
Browse[2]> dd(opt[seq(npar1)])
  .sigma 
1751.997 
all.equal(unname(dd(opt[seq(npar1)])),base)
[1] "Mean relative difference: 3.294353e-05"

If we weren't worried (or wanted to move on to other tests) we could deal with this by increasing the tolerance to 1e-4 ...

mmaechler commented 11 years ago

@bbolker :+1: about deserves to be sorted out rather than papered over

bbolker commented 11 years ago

This is a more general issue, but the failure of the

all.equal(unname(dd(opt[seq(npar1)])),base)

test within profile() applies more broadly (i.e. on the master branch as well as the flexLambda branch) and is platform/(compiler?) specific. e.g. fm1 <- lmer(Reaction~Days+(Days|Subject), sleepstudy); profile(fm1) works on MacOS/R 3.0.0/gcc 4.2.1, fails on Ubuntu 12.04/R (2013-08-22 r63654/gcc 4.6.3-1ubuntu5 . I would be curious to know how widely this works or fails, if anyone feels like testing it ...

bbolker commented 11 years ago

I'm having my doubts about whether the variation I'm seeing in when this happens represents compiler/platform issues or whether this actually another, somewhat more subtle, failure to reset the environment of devfun properly, i.e. some stuff floating around within reference class objects and their environments (in which case the outcomes will be very hard to pin down).

bbolker commented 11 years ago

bump: I saw this again in another example (sorry, no reproducible example yet)

bbolker commented 10 years ago

maybe (probably?) related to https://github.com/lme4/lme4/issues/166. I don't have a great reproducible example to test with, but the profiling example above (base=1751.939, results of running devfun again=1751.997) appears to be fixed.

mmaechler commented 10 years ago

Ok, I now do have a nicely reproducible example. Here I show both the example and the debugging, including a first diagnosis:

Note that I'm using the very very latest master branch of lme4:

> packageDescription("lme4")
Package: lme4
Version: 1.1-2
Title: Linear mixed-effects models using Eigen and S4
Authors@R: c(person("Douglas","Bates", role="aut"),
        person("Martin","Maechler", role="aut"),
        person("Ben","Bolker",email="bbolker+lme4@gmail.com",
        role=c("aut","cre")), person("Steven","Walker",role="aut"))
Maintainer: Ben Bolker <bbolker+lme4@gmail.com>
Contact: LME4 Authors <lme4-authors@lists.r-forge.r-project.org>
Author: Douglas Bates [aut], Martin Maechler [aut], Ben Bolker [aut, cre],
        Steven Walker [aut]
Description: Fit linear and generalized linear mixed-effects models.  The
        models and their components are represented using S4 classes and
        methods.  The core computational algorithms are implemented using the
        Eigen C++ library for numerical linear algebra and RcppEigen "glue".
Depends: R (>= 2.14.0), Matrix (>= 1.1), methods, stats
LinkingTo: Rcpp, RcppEigen
Imports: graphics, grid, splines, parallel, MASS, nlme, lattice, minqa (>=
        1.1.15), numDeriv
Suggests: boot, PKPDmodels, MEMSS, testthat, ggplot2, mlmRev, optimx (>=
        2013.8.6), plyr, reshape, pbkrtest, Rcpp (>= 0.10.1), RcppEigen (>=
        0.3.1.2.3)
LazyData: yes
License: GPL (>=2)
URL: https://github.com/lme4/lme4/ http://lme4.r-forge.r-project.org/
BuildVignettes: yes
Built: R 3.0.2; x86_64-unknown-linux-gnu; 2013-12-28 16:12:26 UTC; unix

Now the simple reproducible example: Indeed, later we will see that this error will only happen when there are correlated random effects, or vector random effects which not only have variances but also covariances to be estimated (and profiled!) among the thetas:

> ## 1) profile function does not work for certain model, e.g.
> data("Dialyzer", package="nlme")
> Dform <- rate ~ (pressure + I(pressure^2) + I(pressure^3) + I(pressure^4)) * QB +
+     (pressure + I(pressure^2) | Subject)

> fm <- lmer(Dform, data=Dialyzer)
> summary(fm)
Linear mixed model fit by REML ['lmerMod']
Formula: rate ~ (pressure + I(pressure^2) + I(pressure^3) + I(pressure^4)) *      QB + (pressure + I(pressure^2) | Subject)
   Data: Dialyzer

REML criterion at convergence: 645.8

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.8542 -0.4454 -0.0531  0.5004  2.5245

Random effects:
 Groups   Name          Variance Std.Dev. Corr
 Subject  (Intercept)    2.246   1.499
          pressure      24.081   4.907    -0.51
          I(pressure^2)  2.172   1.474     0.31 -0.94
 Residual                3.318   1.821
Number of obs: 140, groups: Subject, 20

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         -15.9663     1.8866  -8.463
pressure             88.3629     7.8281  11.288
I(pressure^2)       -44.3062     9.2897  -4.769
I(pressure^3)         9.1656     4.2279   2.168
I(pressure^4)        -0.6897     0.6438  -1.071
QB300                -1.2634     2.8122  -0.449
pressure:QB300        0.6207    11.4900   0.054
I(pressure^2):QB300   3.1484    13.5086   0.233
I(pressure^3):QB300   0.1018     6.1064   0.017
I(pressure^4):QB300  -0.1719     0.9255  -0.186

Correlation of Fixed Effects:
            (Intr) pressr I(p^2) I(p^3) I(p^4) QB300  p:QB30 I(^2): I(^3):
pressure    -0.936
I(pressr^2)  0.882 -0.973
I(pressr^3) -0.833  0.936 -0.990
I(pressr^4)  0.793 -0.904  0.972 -0.995
QB300       -0.671  0.628 -0.592  0.559 -0.532
prssr:QB300  0.637 -0.681  0.663 -0.638  0.616 -0.941
I(^2):QB300 -0.606  0.669 -0.688  0.681 -0.669  0.891 -0.975
I(^3):QB300  0.577 -0.648  0.686 -0.692  0.689 -0.843  0.939 -0.990
I(^4):QB300 -0.552  0.629 -0.676  0.692 -0.696  0.804 -0.907  0.973 -0.995

Indeed, the above correlation matrix (of the fixed effects) show (what we hopefully all know: When using polynomial terms in linear models, you should rather ensure that they are basically uncorrelated, and for this exact reason, S and R have provided poly() to be used in the formula (the following verbose output is slightly "collapsed", manually. BTW, see how slow the convergence is in the end phase?):

> fm. <- lmer(rate ~ poly(pressure,4) * QB + (poly(pressure,2) | Subject), data=Dialyzer,
+             verbose=TRUE)
(NM) 20:  f = 660.719 at  1.00012  0.0485185 -0.0923457 1.02778 -0.0137037 1.02778
(NM) 40:  f = 654.577 at  1.01875  0.471713 -0.888333   1.1312  -0.308565  1.38787
(NM) 60:  f = 643.994 at  1.22094  2.3534  -4.206    1.51631 -1.6317   2.84634
(NM) 80:  f = 643.923 at  1.25797  2.76996 -4.9569   1.59875 -1.91438  3.17766
(NM) 100: f = 643.249 at  1.09593  2.35989 -4.20326  1.54685 -1.62152  2.86507
(NM) 120: f = 643.096 at  1.03017  2.50551 -4.40541  1.58868 -1.68132  2.93715
(NM) 140: f = 643.022 at  1.06923  2.54694 -4.44542  1.59068 -1.70892  2.93224
............
(NM) 640: f = 615.432 at  1.39031  6.00375 -4.49024  7.43391  3.59469  2.33549
(NM) 660: f = 615.226 at 1.37465 5.68965 -4.5889 7.98564 3.49232 2.45829
...........
(NM) 780:  f = 615.006 at  1.43294  5.35088 -4.89357  8.35899  3.60027  3.22784
(NM) 800:  f = 615.006 at  1.43551  5.36084 -4.89868  8.36099  3.57461  3.23276
...........
(NM) 1080: f = 615.006 at  1.43584  5.33139 -4.90124  8.34912  3.57013  3.24065
(NM) 1100: f = 615.006 at  1.43583  5.33148 -4.90119  8.3488   3.57025  3.24065
(NM) 1120: f = 615.006 at  1.43579  5.33132 -4.90111  8.34854  3.57035  3.24059
(NM) 1140: f = 615.006 at  1.43578  5.33134 -4.90098  8.34867  3.57027  3.24057
(NM) 1160: f = 615.006 at  1.43578  5.33141 -4.901    8.34856  3.57024  3.24063
(NM) 1180: f = 615.006 at  1.43577  5.33151 -4.90089  8.34854  3.57014  3.24069
(NM) 1200: f = 615.006 at  1.43578  5.33158 -4.90089  8.34853  3.57014  3.24068
(NM) 1220: f = 615.006 at  1.43577  5.33159 -4.90087  8.34853  3.57012  3.24068
> options(width=88)
> summary(fm.)# corr.matrix looks *much* better
Linear mixed model fit by REML ['lmerMod']
Formula: rate ~ poly(pressure, 4) * QB + (poly(pressure, 2) | Subject)
   Data: Dialyzer

REML criterion at convergence: 615

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.8542 -0.4455 -0.0531  0.5003  2.5245

Random effects:
 Groups   Name               Variance Std.Dev. Corr
 Subject  (Intercept)          6.839   2.615
          poly(pressure, 2)1 325.528  18.042    0.54
          poly(pressure, 2)2 156.807  12.522   -0.71  0.05
 Residual                      3.318   1.821
Number of obs: 140, groups: Subject, 20

Fixed effects:
                         Estimate Std. Error t value
(Intercept)               33.7668     0.8552   39.49
poly(pressure, 4)1       140.1663     6.2597   22.39
poly(pressure, 4)2       -99.6702     4.7184  -21.12
poly(pressure, 4)3        27.5150     2.5488   10.80
poly(pressure, 4)4        -2.7406     2.5585   -1.07
QB300                      7.2552     1.2094    6.00
poly(pressure, 4)1:QB300  80.0285     8.8551    9.04
poly(pressure, 4)2:QB300   4.6316     6.6849    0.69
poly(pressure, 4)3:QB300  -6.0820     3.6597   -1.66
poly(pressure, 4)4:QB300  -0.6832     3.6779   -0.19

Correlation of Fixed Effects:
            (Intr) pl(,4)1 pl(,4)2 pl(,4)3 pl(,4)4 QB300  p(,4)1: p(,4)2: p(,4)3:
ply(prs,4)1  0.475
ply(prs,4)2 -0.579  0.044
ply(prs,4)3  0.004 -0.005   0.015
ply(prs,4)4 -0.005  0.016  -0.015   0.040
QB300       -0.707 -0.336   0.409  -0.002   0.003
p(,4)1:QB30 -0.336 -0.707  -0.031   0.004  -0.011   0.474
p(,4)2:QB30  0.409 -0.031  -0.706  -0.011   0.010  -0.577  0.041
p(,4)3:QB30 -0.002  0.004  -0.010  -0.696  -0.028   0.000  0.002  -0.002
p(,4)4:QB30  0.003 -0.011   0.010  -0.028  -0.696   0.001 -0.001   0.004  -0.006

Now, indeed the fixed effects are not so much correlated anymore, so let's say the model seems reasonable. Ok, so lets do inference for the fixed effects, using profiled confidence intervals:

> ##-> only do fixed effects:
> pfm. <- profile(fm., which = "beta_", alphamax=.001, verbose=1)

From original which = "beta_": new which <- 8:17
fixed-effect parameter 1:
Error in (function (par, fn, lower = -Inf, upper = Inf, control = list(),  :
  Starting values violate bounds

Ok, lets look where the error happened:

> traceback() 
8: stop("Starting values violate bounds")
7: (function (par, fn, lower = -Inf, upper = Inf, control = list(),
       ...)
   {
     ................
   })(fn = function (theta)
   .Call(lmer_Deviance, pp$ptr(), resp$ptr(), as.double(theta)),
       par = c(1.39439470717883, 5.20070945380043, -4.7803971684443,
       8.04733323639214, 3.52842648384625, 2.90515868079532), lower = c(0,
       -1, -1, 0, -1, 0), upper = c(Inf, 1, 1, Inf, 1, Inf), control = list())
6: do.call(optfun, arglist)
5: withCallingHandlers(do.call(optfun, arglist), warning = function(w) {
       curWarnings <<- append(curWarnings, list(w$message))
   })
4: optwrap(optimizer, par = thopt, fn = mkdevfun(rho, 0L), lower = pmax(fitted@lower,
       -1), upper = 1/(fitted@lower != 0))
3: fe.zeta(est + delta * std)
2: profile.merMod(fm., which = "beta_", alphamax = 0.001, verbose = 1)
1: profile(fm., which = "beta_", alphamax = 0.001, verbose = 1)
>

Now, additional good (later confirmed) guessing says, it is in bobyqa() {yes, here: bobyqa is used even though lmer() used its (doubtful!) default "Nelder-Mead"} :

> debug(minqa:::bobyqa)
> pfm. <- profile(fm., which = "beta_", alphamax=.001, verbose=1)

debugging in: (function (par, fn, lower = -Inf, upper = Inf, control = list(),
    ...)
{
    nn <- names(par)

.................

Browse[2]> rbind(lower, par, upper)
          [,1]     [,2]      [,3]     [,4]    [,5]     [,6]
lower 0.000000     -Inf      -Inf 0.000000    -Inf 0.000000
par   1.435776 5.331601 -4.900866 8.348519 3.57011 3.240686
upper      Inf      Inf       Inf      Inf     Inf      Inf

Aha, the above is all fine, so the first call to bobyqa() had things all correct, so we type c (i.e. _c_ontinue) in the browser,

Browse[2]> c

the code continues, and we get into debugging bobyqa() a second time:

Browse[2]> rbind(lower, par, upper)
          [,1]      [,2]      [,3]     [,4]      [,5]     [,6]
lower 0.000000 -1.000000 -1.000000 0.000000 -1.000000 0.000000
par   1.394395  5.200709 -4.780397 8.047333  3.528426 2.905159
upper      Inf  1.000000  1.000000      Inf  1.000000      Inf
Browse[2]> c
Error in (function (par, fn, lower = -Inf, upper = Inf, control = list(),  :
  Starting values violate bounds
>

and this time, the (lower, upper) bounds are indeed clearly wrong, and the error is "obvious" : the covariance parameters are treated as if they were correlation parameters!

Now, I could dive further into debugging this behavior, it may be that someone else {who wrote that part of code ...} might rather have a look.

bbolker commented 10 years ago

I believe these are all fixed now ...

lme4 / lme4

Profile failures on base vs devfun2 match test #110