leeping / geomeTRIC

Geometry optimization code that includes the TRIC coordinate system
https://geometric.readthedocs.io/

Automatically assigned tmin may become too small #180

Open annulen opened 12 months ago

annulen commented 12 months ago

I'm using the convergence criteria drms 6e-4 and dmax 1e-3 (taken from ORCA's TightOpt), and I ran into a situation where the optimization got stuck in an endless loop with the energy slowly climbing. Quality was always < 0, but no steps were rejected; the reason given was Not rejecting step - trust below tmin = 6.000e-05.

The unreasonably low tmin caught my attention (btw, I'm using the tip of the master branch). The corresponding code in params.py says that

  1. For DFT, tmin should not be smaller than 1.0e-4 because of gradient errors.
  2. tmin should be smaller than Convergence_drms to avoid rejection of valid steps.

Both criteria look sensible to me. However, the code that computes tmin is written as

self.tmin = kwargs.get('tmin', min(1.0e-4, self.Convergence_drms*0.1))

which explicitly contradicts criterion (1), as tmin can easily drop below 1e-4.
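To make the contradiction concrete, here is a minimal sketch that evaluates the quoted default expression for the drms value from this report. The function name `current_default_tmin` is made up for illustration; only the `min(...)` expression comes from the line quoted above.

```python
def current_default_tmin(convergence_drms):
    """Default tmin, using the expression quoted above from params.py."""
    return min(1.0e-4, convergence_drms * 0.1)


# drms = 6e-4 is the value used in this report (ORCA TightOpt-like criteria).
drms = 6.0e-4
tmin = current_default_tmin(drms)
print(f"drms = {drms:.1e} -> tmin = {tmin:.3e}")  # tmin = 6.000e-05

# The result falls below the 1.0e-4 floor mentioned in criterion (1).
print("below 1.0e-4 floor:", tmin < 1.0e-4)
```

Because `min` picks the smaller of the two values, any Convergence_drms below 1.0e-3 pushes the default under 1.0e-4.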

I think a better default value of tmin would be

min(max(1.0e-4, self.Convergence_drms*0.1), self.Convergence_drms)

With this logic, tmin can only become smaller than 1.0e-4 when Convergence_drms < 1.0e-4, and for large values of Convergence_drms it is still capped at Convergence_drms*0.1. For my case, it gives tmin = 1.0e-4.
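A small sketch of how the proposed expression behaves in three regimes of Convergence_drms. The function name `proposed_default_tmin` and the sample values other than 6e-4 are illustrative assumptions; this is not code from geomeTRIC.

```python
def proposed_default_tmin(convergence_drms):
    """Proposed default: keep the 1.0e-4 floor unless drms itself is smaller."""
    floor = max(1.0e-4, convergence_drms * 0.1)
    return min(floor, convergence_drms)


# Three regimes: very tight drms, the drms from this report, and a loose drms.
for drms in (5.0e-5, 6.0e-4, 1.0e-2):
    print(f"drms = {drms:.1e} -> tmin = {proposed_default_tmin(drms):.3e}")
# drms = 5.0e-05 -> tmin = 5.000e-05   (clamped to drms)
# drms = 6.0e-04 -> tmin = 1.000e-04   (the 1.0e-4 floor applies)
# drms = 1.0e-02 -> tmin = 1.000e-03   (0.1 * drms)
```

Note that in the first regime tmin equals Convergence_drms rather than being strictly smaller than it, so criterion (2) is only met as an upper bound there.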

I can make a PR if you think this is the right way.

annulen commented 11 months ago

For energy minimizations, RFO and QN are very similar and with a reasonable default parameter choice for RFO, they are equivalent (see attached paper).

However, it seems that in the ORCA implementation there is a difference between them. Here is a quote from the manual:

In any case, each individual element of ∆q is restricted to magnitude MaxStep and the total length of the step is restricted to Trust. In the RFO case, this is achieved by minimizing the predicted energy on the hypersphere of radius Trust which also modifies the direction of the step while in the quasi-Newton step, the step vector is simply scaled down.

In other words, when the step is larger than the trust radius, the QN step is simply scaled down, while the RFO step is optimized within the radius to get the lowest predicted energy.
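The difference between the two strategies can be illustrated with a toy quadratic model. The sketch below is neither ORCA's nor geomeTRIC's code: it assumes a diagonal, positive-definite Hessian so that everything fits in plain Python, and all function names are hypothetical. It compares a Newton step that is simply scaled down to the trust radius with a step that minimizes the predicted energy on the sphere of that radius (found here by bisection on a level shift).

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def predicted_energy(g, h, dq):
    """Quadratic model g.dq + 0.5*dq.H.dq for a diagonal Hessian h."""
    return sum(gi * di + 0.5 * hii * di * di for gi, hii, di in zip(g, h, dq))


def scaled_newton_step(g, h, trust):
    """Newton step, shrunk along its own direction if it exceeds trust."""
    dq = [-gi / hii for gi, hii in zip(g, h)]
    n = norm(dq)
    return [d * trust / n for d in dq] if n > trust else dq


def constrained_step(g, h, trust, iters=200):
    """Minimize the model within the trust radius using a shift lam >= 0."""
    def step(lam):
        return [-gi / (hii + lam) for gi, hii in zip(g, h)]

    if norm(step(0.0)) <= trust:
        return step(0.0)          # plain Newton step already fits
    lam_lo, lam_hi = 0.0, 1.0
    while norm(step(lam_hi)) > trust:
        lam_hi *= 2.0             # bracket the shift that gives |dq| = trust
    for _ in range(iters):        # bisection: step length decreases with lam
        lam_mid = 0.5 * (lam_lo + lam_hi)
        if norm(step(lam_mid)) > trust:
            lam_lo = lam_mid
        else:
            lam_hi = lam_mid
    return step(lam_hi)


# Toy data (assumed): one soft and one stiff coordinate.
g, h, trust = [1.0, 1.0], [1.0, 10.0], 0.1
dq_scaled = scaled_newton_step(g, h, trust)
dq_con = constrained_step(g, h, trust)
print("scaled     :", dq_scaled, predicted_energy(g, h, dq_scaled))
print("constrained:", dq_con, predicted_energy(g, h, dq_con))
```

Both steps have the same length, but the constrained step changes direction and reaches a lower predicted energy, which is the distinction the manual excerpt describes.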

leeping commented 11 months ago

In QN it is also possible to minimize the energy on the hypersphere of radius Trust; that is what geomeTRIC does. I am not sure why the QN optimizer in ORCA decides to scale the step instead.

corinwagen commented 1 month ago

hey folks, is there an accepted solution to these situations? reading through this, I'm not 100% sure what to do (I'm in a similar situation here)