njhenry closed this issue 3 years ago
@njhenry Thanks for taking the time to file this accurate report.
Are you by any chance using Matrix package version 1.3-0, 1.3-1, or 1.3-2?
Hi @kaskr, thanks for your response. I am using Matrix package version 1.3-2, last updated on 6 Jan 2021.
You are probably seeing the same issue as https://github.com/kaskr/adcomp/issues/340 and https://github.com/glmmTMB/glmmTMB/issues/665
Solution: Upgrade TMB to version 1.7.20.
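For anyone checking whether their own setup matches the combination discussed in this thread, a minimal sketch in R follows. The helper name needs_tmb_upgrade is hypothetical, and the version range it tests (Matrix 1.3-0 through 1.3-2 together with TMB older than 1.7.20) is only what this thread mentions. Whether other Matrix versions are affected is not established here.

```r
# Hypothetical helper: TRUE when the installed versions match the
# combination discussed in this thread (Matrix 1.3-0 to 1.3-2 together
# with a TMB release older than 1.7.20). The range is an assumption
# taken from the thread, not from TMB documentation.
needs_tmb_upgrade <- function(tmb_version, matrix_version) {
  tmb <- as.package_version(tmb_version)
  mat <- as.package_version(matrix_version)
  matrix_in_range <- mat >= as.package_version("1.3-0") &&
    mat <= as.package_version("1.3-2")
  matrix_in_range && tmb < as.package_version("1.7.20")
}

# Usage on a machine with both packages installed:
# needs_tmb_upgrade(packageVersion("TMB"), packageVersion("Matrix"))
```

If this returns TRUE, installing a newer TMB release is the fix suggested above.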
Fantastic—after updating TMB to 1.7.20, both my toy example and my original model finished successfully. Thanks very much for your help.
Description:
For TMB models with large numbers of random effects, after calling the likelihood and gradient functions obj$fn() and obj$gr() many times, these functions start to return NaN values for starting parameters that originally yielded valid results. I first encountered this behavior in a TMB model with 148,000 random effects, where it consistently appeared after 142 iterations of my outer optimizer (both nlminb and L-BFGS-B, called using optimx), causing the optimization to fail. I found that when I saved the most recent fixed effects, obj$env$last.par.best[1:length(obj$par)], restarted my R session, assigned these values as the starting values for obj$par, and restarted the optimizer, the model would converge.
I have reproduced this behavior using a simple mixed-effects model below. My apologies if this is a problem with my implementation rather than a bug - I wasn't able to find any examples of this behavior on the Google Group or in previous issues, and I would appreciate any suggestions you might have for addressing it.
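The save-and-restart workaround described above might look roughly like the sketch below. The function names save_best_fixed and load_best_fixed and the file path are hypothetical. The indexing mirrors the expression in the report and therefore assumes the fixed effects occupy the first length(obj$par) positions of last.par.best.

```r
# Hypothetical helpers for checkpointing the best fixed effects seen so far.
# Assumption (from the report's indexing): the fixed effects are the first
# length(obj$par) entries of obj$env$last.par.best.
save_best_fixed <- function(obj, path) {
  n_fixed <- length(obj$par)
  best_fixed <- obj$env$last.par.best[seq_len(n_fixed)]
  saveRDS(best_fixed, path)
  invisible(best_fixed)
}

# Read the checkpoint back in a fresh R session.
load_best_fixed <- function(path) {
  readRDS(path)
}

# Usage, before restarting R:
#   save_best_fixed(obj, "best_fixed.rds")
# In the new session, after rebuilding obj:
#   start <- load_best_fixed("best_fixed.rds")
#   fit <- nlminb(start, obj$fn, obj$gr)
```

This only works around the failure. It does not address the underlying cause.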
Reproducible Steps:
I reproduced the issue using a mixed-effects extension of the linreg.cpp template from the TMB book, with 1E6 data observations and 2E5 random effects. The objective function is created with random.start = expression(rep(0, length(random))) so that the starting values for both fixed and random effects remain constant across calls to the likelihood and gradient functions. I call obj$fn(obj$par) and obj$gr(obj$par) 2000 times, recording the results and execution time for each iteration.
Expected Output:
For iterations 1-1340, the output looks as expected, with the same likelihood and gradient values returned across iterations:
For iterations 1-1340, the median execution time for each call to obj$fn(); obj$gr() was 2.23 seconds.
Current Output:
Starting at iteration 1341, calls to both obj$fn() and obj$gr() consistently return NaN:
For iterations 1341-2000, the median execution time for each loop iteration was 69.90 seconds.
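The repeated-evaluation check described under Reproducible Steps could be sketched along the following lines. This is not the original reproduction script. The names repeat_evaluations, first_failure, and make_fake_obj are hypothetical, and make_fake_obj is a stand-in object that mimics the failure pattern so the loop can run without TMB or the model template.

```r
# Evaluate fn and gr repeatedly at the same parameters, recording the
# objective value, whether the gradient contains NaN, and elapsed time.
repeat_evaluations <- function(obj, n_iter = 2000L) {
  results <- data.frame(
    iteration = seq_len(n_iter),
    fn_value = NA_real_,
    gr_has_nan = NA,
    seconds = NA_real_
  )
  for (i in seq_len(n_iter)) {
    start <- proc.time()[["elapsed"]]
    value <- obj$fn(obj$par)
    gradient <- obj$gr(obj$par)
    results$fn_value[i] <- as.numeric(value)[1]
    results$gr_has_nan[i] <- any(is.nan(gradient))
    results$seconds[i] <- proc.time()[["elapsed"]] - start
  }
  results
}

# First iteration at which either function returned NaN (NA if none did).
first_failure <- function(results) {
  bad <- which(is.nan(results$fn_value) | results$gr_has_nan)
  if (length(bad) == 0L) NA_integer_ else bad[1]
}

# Stand-in for a TMB object (assumption, for illustration only):
# fn and gr start returning NaN after `fail_after` calls to fn.
make_fake_obj <- function(fail_after) {
  calls <- 0L
  list(
    par = c(beta = 0),
    fn = function(par) {
      calls <<- calls + 1L
      if (calls > fail_after) NaN else sum(par^2)
    },
    gr = function(par) if (calls > fail_after) NaN * par else 2 * par
  )
}

results <- repeat_evaluations(make_fake_obj(fail_after = 3L), n_iter = 6L)
first_failure(results)
```

With a real object built by MakeADFun, the same two functions would report the iteration at which NaN first appears, and median(results$seconds) before and after that point gives the timing comparison quoted above.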
TMB Version:
1.7.19
R Version:
4.0.3, installed and managed using conda 4.9.2 (linux-64 version)
Operating System:
CentOS Linux 7