yrosseel / lavaan

an R package for structural equation modeling and more
http://lavaan.org

Dramatic speed differences between R 4.1.1 and R 4.2.1 for constraint models #239

Closed FlorianScharf closed 1 year ago

FlorianScharf commented 2 years ago

In a project where we fit many different models to a single data set, we noticed that some models with positivity constraints became unbearably slow all of a sudden. We have already narrowed it down to a reproducible example using the Holzinger data set.

I fit two models to the data set: one with positivity constraints on all loadings (model) and one without (model2). The constrained optimization takes longer under both R versions, but under R 4.2.1 the constrained estimation is slower by a factor of 10, whereas the unconstrained version is still as fast as ever.
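For readers without the attachment, a minimal sketch of what such a positivity-constrained specification could look like is shown below; the parameter labels (l1, l2, ...), the three-indicator version, and the HS_data object (the prepared Holzinger data set from the attached script) are illustrative assumptions, and the exact model is in the attached Archiv.zip.

library(lavaan)

# Illustrative sketch only; the exact model is in Archiv.zip.
# Each loading gets a label so that a '> 0' constraint can refer to it,
# which sends the model through lavaan's constrained optimizer.
model <- '
  g =~ l1*g_visual + l2*g_cubes + l3*g_lozenges
  l1 > 0
  l2 > 0
  l3 > 0
'

fit_model <- cfa(model = model, data = HS_data, estimator = "ML",
                 std.lv = TRUE, se = "none", orthogonal = TRUE)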

It seems this behavior is tied to a specific model specification, because other constrained models fit to the same data set do not show it. I attach code that reproduces the issue with a minimal example using the Holzinger data set. I knitted the output of both benchmarks under both versions of R to show that the issue occurs only in the combination of R 4.2.1 with the "bad" model.

I made sure the same version of lavaan was used and included the session info. Is there any known issue with the constraint optimizer in the newer version of R?

We would appreciate any help in further narrowing down the cause of this behavior; I am happy to assist with more fits/information.

Thanks!

Archiv.zip

sfcheung commented 2 years ago

This is an interesting case. I did some informal tests, and my guess is that the difference arises between R 4.1.3 and R 4.2.0: the slow optimization can also be found in R 4.2.0, but not in R 4.1.3. These tests are informal because they were conducted on old installations on my computer, and I cannot guarantee the packages are of the same version, except for lavaan.

I ran the script in Archiv.zip in an interactive session and traced lavaan:::nlminb.constr():

trace(lavaan:::nlminb.constr, browser, at = 39)
fit_model <- cfa(model = model,
                  data = HS_data,
                  estimator = "ML",
                  std.lv = T,
                  se = "none",
                  orthogonal = TRUE,
                  bounds = "pos.var")

Just before the call to nlminb():

optim.out <- nlminb(start = x.par, objective = auglag,
                    gradient = fgrad, control = control,
                    lower = lower, upper = upper,
                    scale = scale, ...)

I saved the arguments (x.par, auglag, fgrad, control, lower, upper, scale; the remaining ... arguments, verbose and debug, are set explicitly in the calls below) to an external file using save() while in browser mode (file name based on the version of R):

save(x.par, auglag, fgrad, control, lower, upper, scale,
     file = "nlminb_constr_4_2_1.RData")

I did this in both R 4.2.1 and R 4.1.1. Though the arguments are unlikely to differ, I did this to make sure they are indeed the same (except for the environments of the functions):

args421 <- load("nlminb_constr_4_2_1.RData")
for (xx in args421) {
  assign(paste0(xx, "_421"), get(xx))
}
args411 <- load("nlminb_constr_4_1_1.RData")
for (xx in args411) {
  assign(paste0(xx, "_411"), get(xx))
}
rm(list = args411)
identical(auglag_421, auglag_411, ignore.environment = TRUE)
identical(control_421, control_411)
identical(fgrad_421, fgrad_411, ignore.environment = TRUE)
identical(lower_421, lower_411)
identical(upper_421, upper_411)
identical(scale_421, scale_411)
identical(x.par_421, x.par_411)

It seems that they are the same:

> identical(auglag_421, auglag_411, ignore.environment = TRUE)
[1] TRUE
> identical(control_421, control_411)
[1] TRUE
> identical(fgrad_421, fgrad_411, ignore.environment = TRUE)
[1] TRUE
> identical(lower_421, lower_411)
[1] TRUE
> identical(upper_421, upper_411)
[1] TRUE
> identical(scale_421, scale_411)
[1] TRUE
> identical(x.par_421, x.par_411)
[1] TRUE

The environments could still matter (identical() was told to ignore them for the functions). Therefore, I ran the following in both R 4.2.1 and R 4.1.1, with verbose and debug set to FALSE as they were in the original call by lavaan:

system.time(optim.out.421 <- nlminb(start = x.par_421,
                                    objective = auglag_421,
                                    gradient = fgrad_421,
                                    control = control_421,
                                    lower = lower_421,
                                    upper = upper_421,
                                    scale = scale_421,
                                    verbose = FALSE,
                                    debug = FALSE))
system.time(optim.out.411 <- nlminb(start = x.par_411,
                                    objective = auglag_411,
                                    gradient = fgrad_411,
                                    control = control_411,
                                    lower = lower_411,
                                    upper = upper_411,
                                    scale = scale_411,
                                    verbose = FALSE,
                                    debug = FALSE))

Both were slow in R 4.2.1 and much faster in R 4.1.1. I used the following to capture the output of the first few iterations (I limited it to the first few iterations because a preliminary examination showed that the two versions diverge early in the optimization):

On R 4.2.1:

control_421$trace <- 1
control_421$iter.max <- 10
out_421 <- capture.output(optim.out.421 <- nlminb(start = x.par_421,
                                    objective = auglag_421,
                                    gradient = fgrad_421,
                                    control = control_421,
                                    lower = lower_421,
                                    upper = upper_421,
                                    scale = scale_421,
                                    verbose = FALSE,
                                    debug = FALSE))
write.csv(out_421, "out_421.csv")

On R 4.1.1:

control_411$trace <- 1
control_411$iter.max <- 10
out_411 <- capture.output(optim.out.411 <- nlminb(start = x.par_411,
                                    objective = auglag_411,
                                    gradient = fgrad_411,
                                    control = control_411,
                                    lower = lower_411,
                                    upper = upper_411,
                                    scale = scale_411,
                                    verbose = FALSE,
                                    debug = FALSE))
write.csv(out_411, "out_411.csv")

I compared the printouts:

out_421 <- read.csv("out_421.csv")
out_411 <- read.csv("out_411.csv")
which(out_421[, 2] != out_411[, 2])
# [1]  9 10 11 12

The estimates began to differ at Iteration 8:

> out_421[8:9, 2]
[1] "  7:  -0.042900961: 0.616745 0.373027 0.358264 0.479655 0.532596 0.561795 0.527983 0.541281 0.584299 0.354587 0.591715 0.456330 0.552417 0.308729 0.260983 0.542160 0.337538 0.406011 0.459443 0.527480 0.623140 0.658873 0.692090 0.608464 0.289050 0.356177 0.284537 0.164360 0.671438 0.653483 0.0111314 0.108644 0.346822 0.289949 0.273374 0.369622 0.539872 0.535415 0.0722433 0.0165970 0.0440946 0.622481 0.841266 0.854870 0.689356 0.335933 0.353941 0.339485 0.407726 0.541543 0.860285 0.682908 0.792966 0.757540 0.832674 0.910732 0.599816 0.876753 0.775482 0.791249 0.668848 0.663033 0.605500 0.538830 0.536810"
[2] "  8:  -0.048981945: 0.616405 0.372543 0.358087 0.479312 0.533543 0.564579 0.529772 0.543420 0.588016 0.354174 0.589492 0.454726 0.551155 0.309178 0.260130 0.541014 0.337140 0.404735 0.458662 0.528124 0.622700 0.657164 0.692581 0.607561 0.288727 0.355873 0.283571 0.159486 0.668365 0.651910 0.00804966 0.104452 0.342501 0.287914 0.270560 0.366815 0.539432 0.532953 0.0651732 0.0117954 0.0400397 0.622772 0.841880 0.855260 0.690057 0.329476 0.357874 0.334744 0.409841 0.540968 0.860661 0.682808 0.793560 0.755502 0.832652 0.911688 0.600545 0.876950 0.775582 0.791331 
0.668417 0.660371 0.604528 0.536809 0.536540"
> out_411[8:9, 2]
[1] "  7:  -0.042900961: 0.616745 0.373027 0.358264 0.479655 0.532596 0.561795 0.527983 0.541281 0.584299 0.354587 0.591715 0.456330 0.552417 0.308729 0.260983 0.542160 0.337538 0.406011 0.459443 0.527480 0.623140 0.658873 0.692090 0.608464 0.289050 0.356177 0.284537 0.164360 0.671438 0.653483 0.0111314 0.108644 0.346822 0.289949 0.273374 0.369622 0.539872 0.535415 0.0722433 0.0165970 0.0440946 0.622481 0.841266 0.854870 0.689356 0.335933 0.353941 0.339485 0.407726 0.541543 0.860285 0.682908 0.792966 0.757540 0.832674 0.910732 0.599816 0.876753 0.775482 0.791249 0.668848 0.663033 0.605500 0.538830 0.536810"
[2] "  8:  -0.048383419: 0.616397 0.372549 0.358074 0.479305 0.533456 0.564311 0.529600 0.543211 0.587652 0.354235 0.589731 0.454888 0.551271 0.309133 0.260212 0.541129 0.337188 0.404871 0.458743 0.528044 0.622747 0.657338 0.692522 0.607680 0.288686 0.355824 0.283605 0.159862 0.668678 0.652041 0.00805717 0.104654 0.342807 0.288056 0.270744 0.366994 0.539375 0.533121 0.0658361 0.0122946 0.0402884 0.622772 0.841859 0.855268 0.690032 0.330030 0.357417 0.335131 0.409588 0.541024 0.860634 0.682824 0.793520 0.755734 0.832692 0.911615 0.600493 0.876950 0.775610 0.791341 
0.668493 0.660656 0.604634 0.537010 0.536566"

I don't know the reason, but I found this interesting. Maybe it is something about nlminb()? But why would the difference occur there? The optimization processes appear to be identical, or at least nearly identical, in the first few iterations.
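One additional check could be to time nlminb() on a self-contained toy problem in both R versions, to see whether nlminb() itself behaves differently outside lavaan; the bound-constrained Rosenbrock-type objective below is only an illustration and is of course not the lavaan augmented-Lagrangian objective.

# Toy benchmark of nlminb() alone (illustrative; not the lavaan objective).
rosenbrock <- function(p) {
  n <- length(p)
  sum(100 * (p[-1] - p[-n]^2)^2 + (1 - p[-n])^2)
}
set.seed(1)
p0 <- runif(50)
system.time(
  out <- nlminb(start = p0, objective = rosenbrock,
                lower = rep(0, 50), upper = rep(2, 50),
                control = list(iter.max = 1000, eval.max = 2000))
)
out$iterations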

yrosseel commented 2 years ago

I have no explanation at the moment. I will investigate this later. But as an immediate solution, you can replace all '>0' constraints by

g =~ lower(0)*g_visual + lower(0)*g_cubes + ...

which will use the 'lower' argument of nlminb() directly, avoiding the nlminb.constr() function.
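For a complete example, such a bounded specification might look as follows (indicator names beyond g_visual and g_cubes, and the remaining cfa() arguments, are assumptions mirroring the call shown earlier in this thread):

# Sketch of the suggested workaround: lower(0)* places a lower bound of 0 on
# each loading, which is passed to the 'lower' argument of nlminb() directly,
# so nlminb.constr() is not needed.
model_bounded <- '
  g =~ lower(0)*g_visual + lower(0)*g_cubes + lower(0)*g_lozenges
'
fit_bounded <- cfa(model = model_bounded, data = HS_data, estimator = "ML",
                   std.lv = TRUE, se = "none", orthogonal = TRUE)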

yrosseel commented 2 years ago

This may be platform dependent!

I just installed R 4.1.1 on my Arch Linux system, and it ran as slowly as in R 4.2.1. The number of iterations was identical across the two R versions.

I also checked the R source code: there are no differences in the file /src/library/stats/R/nlminb.R between R 4.1.1 and R 4.2.1, and only trivial changes in the file /src/library/stats/src/port.c (two calls to Calloc() and Free() were replaced by R_Calloc() and R_Free()). So I don't think nlminb() is the culprit.

Perhaps this is related to the update of the LAPACK sources (version 3.10.0 in R 4.2.0). But that does not affect anything on Linux. @FlorianScharf: which platform are you on? Are you using the 'standard' (R) versions of BLAS/LAPACK?

The only 'solution' I can think of at the moment is that lavaan will transform all explicit simple 'bound' constraints (e.g. x > 5 or x < 10) into the lower/upper arguments of nlminb(), avoiding nlminb.constr() altogether.

FlorianScharf commented 2 years ago

I have no explanation at the moment. I will investigate this later. But as an immediate solution, you can replace all '>0' constraints by

g =~ lower(0)*g_visual + lower(0)*g_cubes + ...

which will use the 'lower' argument of nlminb() directly, avoiding the nlminb.constr() function.

Thank you very much, I can confirm that this works like a charm. In addition, this is much faster on any version of R. It is almost as fast as the unconstrained estimation.

This may be platform dependent!

I just installed R 4.1.1 on my Arch Linux system, and it ran as slowly as in R 4.2.1. The number of iterations was identical across the two R versions.

I also checked the R source code: there are no differences in the file /src/library/stats/R/nlminb.R between R 4.1.1 and R 4.2.1, and only trivial changes in the file /src/library/stats/src/port.c (two calls to Calloc() and Free() were replaced by R_Calloc() and R_Free()). So I don't think nlminb() is the culprit.

Perhaps this is related to the update of the LAPACK sources (version 3.10.0 in R 4.2.0). But that does not affect anything on Linux. @FlorianScharf: which platform are you on? Are you using the 'standard' (R) versions of BLAS/LAPACK?

The only 'solution' I can think of at the moment is that lavaan will transform all explicit simple 'bound' constraints (e.g. x > 5 or x < 10) into the lower/upper arguments of nlminb(), avoiding nlminb.constr() altogether.

My original report refers to plain R on Windows with the default BLAS/LAPACK. You are correct that it is platform dependent: the problem is much less pronounced on a Mac, both with default R and with Apple's custom BLAS.

However, we also had the problem on a Linux cluster to which I have no easy access at the moment. I can report back on the Linux part in a week or two.

yrosseel commented 1 year ago

Although still 'unresolved', this is most likely not lavaan-specific. Closing for now.