bachmannpatrick / CLVTools

R-Package for estimating CLV

Choice of standard optimization method for all models #119

Open pschil opened 4 years ago

pschil commented 4 years ago

I'd suggest changing the default optimization method to Nelder-Mead for all models. BFGS breaks if the LL returns NA/Inf due to numerical stability issues, which seem to arise rather frequently in practice. Users then receive the message "Estimation failed with NA coefs", from which it is not immediately clear what went wrong, and hence also not clear what has to be adapted. I also think this outweighs the benefit of BFGS, namely faster convergence. @bachmannpatrick what's your take on this?

mmeierer commented 4 years ago

In the future, we aim to first try optimizing with BFGS and then, if no global solution can be found, fall back to Nelder-Mead. We need to write a wrapper for this.
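A minimal sketch of such a wrapper, assuming base R's `optim` is the backend; `robust_optim` is a hypothetical name and the exact failure checks would need to match CLVTools' internals:

```r
# Hypothetical fallback wrapper: try BFGS first; if it errors out or
# returns non-finite coefficients, re-run with Nelder-Mead.
robust_optim <- function(par, fn, ...) {
  res <- tryCatch(
    optim(par, fn, method = "BFGS", ...),
    error = function(e) NULL
  )
  if (is.null(res) || any(!is.finite(res$par))) {
    res <- optim(par, fn, method = "Nelder-Mead", ...)
  }
  res
}

# Usage on a simple quadratic objective:
fit <- robust_optim(c(0, 0), function(x) sum((x - c(1, 2))^2))
```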

oblander commented 3 years ago

Usually the source of the issue is that Newton-type methods like BFGS and nlm perform a line search in the direction of the gradient, and the line-search step can visit extreme parameter values that cause NAs/Infs. The same problem arises when heterogeneity is rejected by the model, since the mixing distribution's scale parameters diverge to infinity during estimation.

A simple fix is to use tryCatch and is.na to replace errors and Infs with very large (but finite) numbers, such as -1E20, that are several orders of magnitude larger than we expect the LL to be for non-problematic parameter values.
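The fix described above could be sketched as a wrapper around the LL function; `safe_ll` and the `-1e20` floor are illustrative, not part of CLVTools (and if the optimizer minimizes the negative LL, the replacement value would be `+1e20` instead):

```r
# Wrap an LL function so that errors and NA/Inf values are replaced
# by a very bad, but finite, value the optimizer can back away from.
safe_ll <- function(ll_fun) {
  function(params) {
    val <- tryCatch(ll_fun(params), error = function(e) NA_real_)
    if (!is.finite(val)) val <- -1e20  # several orders worse than any plausible LL
    val
  }
}

# Usage: log() of a negative parameter yields NaN, which gets floored.
ll <- safe_ll(function(p) log(p))
```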

This fixes the vast majority of errors because the very bad (but finite) LL values for problematic parameter values prevent the optimization procedure from taking large steps, instead of causing the whole procedure to error out. When heterogeneity is rejected, this implicitly constrains the parameters from diverging to infinity, and optimization stops at the largest values that still result in a numerically stable LL (in this case, however, the computed SEs will be meaningless, as the procedure has not converged to a stationary point).

Another measure that can help prevent numerical errors is constraining the maximum allowable step size in the line search, such as with the stepmax argument in nlm.
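For illustration, the `stepmax` argument of base R's `nlm` caps the maximum line-search step, so even a distant starting point cannot trigger a single huge jump into a numerically unstable region (the objective and starting values below are toy assumptions):

```r
# Cap the line-search step size at 1 so the optimizer approaches the
# minimum in bounded moves rather than one large, potentially unstable jump.
f <- function(x) sum((x - c(1, 2))^2)
fit <- nlm(f, p = c(100, 100), stepmax = 1)
```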