Open annulen opened 12 months ago
For energy minimizations, RFO and QN are very similar and with a reasonable default parameter choice for RFO, they are equivalent (see attached paper).
However, it seems that in the ORCA implementation there is a difference between them. Here is a quote from the manual:
> In any case, each individual element of ∆q is restricted to magnitude MaxStep and the total length of the step is restricted to Trust. In the RFO case, this is achieved by minimizing the predicted energy on the hypersphere of radius Trust which also modifies the direction of the step while in the quasi-Newton step, the step vector is simply scaled down.
In other words, when the step is larger than the trust radius, the QN step is simply scaled down, while the RFO step is optimized within the trust radius to get the lowest predicted energy.
In QN it is also possible to minimize the energy on the hypersphere of radius Trust; that is what geomeTRIC does. I am not sure why the QN optimizer in ORCA decides to scale the step instead.
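To make the difference between the two restrictions concrete, here is a minimal sketch. It is not the ORCA or geomeTRIC code: it assumes a positive-definite Hessian that is already diagonal (i.e. we work in its eigenbasis), and the function names `scaled_step`, `sphere_step` and `predicted_energy` are made up for illustration. The sphere-constrained step is found with a level shift `lam` chosen by bisection so that the step length equals the trust radius.

```python
import math


def _norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def predicted_energy(grad, hess_diag, step):
    """Quadratic model g.s + 1/2 s.H.s for a diagonal Hessian."""
    return sum(g * s + 0.5 * h * s * s for g, h, s in zip(grad, hess_diag, step))


def scaled_step(grad, hess_diag, trust):
    """Newton step -g_i/h_i, scaled down uniformly if it exceeds the trust radius."""
    step = [-g / h for g, h in zip(grad, hess_diag)]
    length = _norm(step)
    if length > trust:
        step = [s * trust / length for s in step]
    return step


def sphere_step(grad, hess_diag, trust):
    """Minimize the quadratic model on the sphere |s| = trust.

    Uses s_i = -g_i / (h_i + lam) with lam >= 0; the step length decreases
    monotonically with lam when all h_i > 0, so bisection is sufficient.
    """
    def step_for(lam):
        return [-g / (h + lam) for g, h in zip(grad, hess_diag)]

    if _norm(step_for(0.0)) <= trust:
        return step_for(0.0)  # unconstrained Newton step already fits

    lo, hi = 0.0, 1.0
    while _norm(step_for(hi)) > trust:  # bracket the level shift
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if _norm(step_for(mid)) > trust:
            lo = mid
        else:
            hi = mid
    return step_for(hi)


if __name__ == "__main__":
    grad, hess_diag, trust = [1.0, 1.0], [0.1, 10.0], 0.5
    for name, fn in (("scaled", scaled_step), ("sphere", sphere_step)):
        s = fn(grad, hess_diag, trust)
        print(name, s, predicted_energy(grad, hess_diag, s))
```

Both steps have length `trust`, but the sphere-constrained one changes direction (it puts relatively more of the step into the stiff coordinate) and reaches a lower predicted energy, whereas scaling keeps the Newton direction.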
hey folks, is there an accepted solution to these situations? reading through this, I'm not 100% sure what to do (I'm in a similar situation here)
I'm using `drms 6e-4 dmax 1e-3` criteria for optimization (which were taken from ORCA's `TightOpt`), and I've got into a situation where the optimization got stuck in an endless loop with the energy slowly climbing up. Quality was always < 0, but no steps were rejected, with the reasoning `Not rejecting step - trust below tmin = 6.000e-05`.

The unreasonably low `tmin` caught my attention (btw, I'm using the tip of the master branch). The corresponding code in `params.py` says that

1. `tmin` should actually be not smaller than `1.0e-4` because of gradient errors.
2. `tmin` should be smaller than `Convergence_drms` to avoid rejection of valid steps.

Both criteria look sensible to me. However, the code that computes `tmin` is written in a way that explicitly contradicts criterion (1), as `tmin` can easily drop below `1e-4`.

I think the proper default value of `tmin` should be chosen such that `tmin` may only become smaller than `1.0e-4` when `Convergence_drms < 1.0e-4`, but is still limited by `Convergence_drms*0.1` from above for large values of `Convergence_drms`. For my case, it gives `tmin = 1.0e-4`.

I can make a PR if you think this is the right way.
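For anyone following along, here is a small sketch of the arithmetic. Both functions are assumptions, not code copied from `params.py`: `tmin_reported` is merely a rule consistent with the logged `tmin = 6.000e-05` for `drms 6e-4`, and `tmin_floored` is one hypothetical rule that keeps `tmin` at `1.0e-4` or above unless `Convergence_drms` itself is smaller. It may differ from the exact expression proposed above.

```python
def tmin_reported(convergence_drms):
    # ASSUMPTION: inferred only from the log line "tmin = 6.000e-05"
    # obtained with drms = 6e-4; not taken from the geomeTRIC source.
    return min(1.0e-4, 0.1 * convergence_drms)


def tmin_floored(convergence_drms):
    # HYPOTHETICAL alternative: floor at 1.0e-4 (gradient errors),
    # but never let tmin exceed the drms convergence criterion.
    return min(max(1.0e-4, 0.1 * convergence_drms), convergence_drms)


if __name__ == "__main__":
    for drms in (5.0e-5, 6.0e-4, 1.0e-2):
        print(f"drms={drms:.1e}  reported-like={tmin_reported(drms):.3e}  "
              f"floored={tmin_floored(drms):.3e}")
```

With `drms = 6e-4` the first rule drops below the `1.0e-4` floor, while the second stays at `1.0e-4`; the second only goes below the floor when `Convergence_drms` is itself below `1.0e-4`.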