leeping / geomeTRIC

Geometry optimization code that includes the TRIC coordinate system
https://geometric.readthedocs.io/

ORCA converges in 8 steps, geomeTRIC gets stuck #186

Open annulen opened 10 months ago

annulen commented 10 months ago

Here are two minimizations, the former performed with the built-in optimizer of ORCA 5.0.4 (redundant internal coordinates, RFO) and the latter by geomeTRIC (unmodified master + 0001-Okay-patch.patch.txt). This is the same system as in #180, but a high-quality Hessian (B3LYP/6-31G**) was used from start to end to aid convergence.

defgrid3 nofinalgridx was used in both cases to improve the quality of the gradients compared to the default settings.

In the case of geomeTRIC I've used the --reset=False option; otherwise the high-quality Hessian would be dropped right at the beginning because of negative eigenvalues created by the Hessian update. I've also used --subfrctor 2 to avoid whole-system rotations, with the intent of keeping the Hessian more relevant to the current geometry.

ORCA's inputs and outputs: PMDA_chair_dimer2_vdzp_orca.tar.gz

geomeTRIC's command.sh, inputs and outputs: PMDA_chair_dimer2_geometric.tar.gz

annulen commented 10 months ago

I've also run the ORCA optimization with QN; it converged in 7 steps: PMDA_chair_dimer2_vdzp_orca_qn.tar.gz

GDIIS took 8 steps like RFO.

So I suspect that the culprit may be incorrect handling of the Hessian in geomeTRIC. In particular, I see that geomeTRIC by default gets rid of negative eigenvalues by regenerating the guess Hessian, while ORCA continues despite having a negative eigenvalue in the Hessian.
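
For illustration, the reset check under discussion amounts to testing whether the lowest eigenvalue of the current Hessian has gone negative; a minimal numpy sketch with a hypothetical helper name, not geomeTRIC's actual code:

import numpy as np

def hessian_has_negative_eigenvalue(H):
    # Lowest eigenvalue of the (symmetric) Hessian; if it is negative,
    # the default behavior described above is to fall back to the guess Hessian.
    return np.linalg.eigvalsh(H)[0] < 0.0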

leeping commented 10 months ago

Thanks for sharing this. I agree geomeTRIC is not behaving properly in this calculation when you have provided a high-quality Hessian in the beginning. I would expect it to take several good steps if you have an exact initial Hessian, but it seems that Step 2 is already low quality. I am curious what happens if you restrict the step using --tmax 0.03 or something like that?

I am seeing one other difference. In the beginning of the optimization, it appears several Hessian eigenvalues are negative:

Step    0 : Gradient = 3.257e-03/7.213e-03 (rms/max) Energy = -122.0432564665
Hessian Eigenvalues: -5.67264e-03 -4.86780e-03 -3.65937e-03 ... 4.11571e-01 4.20647e-01 4.20864e-01
Step    1 : Displace = 3.618e-02/5.932e-02 (rms/max) Trust = 1.000e-01 (=) Grad = 1.728e-04/4.523e-04 (rms/max) E (change) = -122.0441496198 (-8.932e-04) Quality = 0.855
Hessian Eigenvalues: -7.65921e-03 -4.89374e-03 -3.70047e-03 ... 4.11227e-01 4.19520e-01 4.20647e-01
Step    2 : Displace = 1.427e-01/2.310e-01 (rms/max) Trust = 1.414e-01 (+) Grad = 3.257e-03/7.006e-03 (rms/max) E (change) = -122.0431869362 (+9.627e-04) Quality = -0.219
Hessian Eigenvalues: -4.80642e-03 -3.69764e-03 -2.78662e-03 ... 4.17171e-01 4.20647e-01 5.33110e-01

However, this is not the behavior seen in ORCA as there is only one negative eigenvalue. There may be an issue with conversion of the Hessian into internal coordinates. If that is true, it's a more serious problem that needs fixing.

I think this will require some time for me to diagnose, thanks a lot for sharing the data.

annulen commented 10 months ago

I am curious what happens if you restrict the step using --tmax 0.03 or something like that?

Will check it now. BTW, at least in ORCA there is a clear separation between the maximum step and the trust radius. The former is a cap on changes of individual internal coordinates, while the trust radius limits the length of the whole step vector. Both limits can be set separately, though I didn't have much success with tight limits on MaxStep (I guess because it distorts the step direction too much).
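
For illustration, the distinction can be sketched like this (hypothetical helper, not ORCA's or geomeTRIC's actual step control):

import numpy as np

def limit_step(dq, max_step=0.3, trust_radius=0.1):
    # MaxStep-style cap: limit the change of each individual internal coordinate.
    dq = np.clip(dq, -max_step, max_step)
    # Trust-radius-style cap: limit the norm of the whole step vector.
    norm = np.linalg.norm(dq)
    if norm > trust_radius:
        dq = dq * (trust_radius / norm)
    return dq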

annulen commented 10 months ago

With --tmax 0.03 the minimization converges, but only at step 86. The initial steps look much better, but later it begins to struggle. PMDA_chair_dimer2_tmax.tar.gz

leeping commented 10 months ago

An issue I have observed in the past that could be related to what you're seeing is that sometimes the BFGS Hessian update results in the appearance of negative eigenvalues in the Hessian matrix. I think BFGS is supposed to preserve positive definiteness so there may be something going on there.
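
For reference, a textbook BFGS update written as a minimal numpy sketch (hypothetical function, not geomeTRIC's get_hessian_update_bfgs); the update is only guaranteed to preserve positive definiteness when the curvature condition Dg·Dy > 0 holds:

import numpy as np

def bfgs_update(H, Dy, Dg):
    # Dy: step taken, Dg: change in gradient (same coordinate system as H).
    Hs = H @ Dy
    sHs = float(Dy.T @ Hs)   # s^T H s
    ys = float(Dg.T @ Dy)    # y^T s, the curvature condition quantity
    if ys <= 0:
        # Curvature condition violated: a plain BFGS update can now introduce
        # negative eigenvalues; one possible fallback is to skip the update.
        return np.zeros_like(H)
    # Return the increment Hup such that H_new = H + Hup.
    return np.outer(Dg, Dg) / ys - np.outer(Hs, Hs) / sHs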

If you could also let me know whether geomeTRIC performs differently using a different coordinate system (try --coordsys cart or --coordsys dlc), that would be helpful. Meanwhile I will work on getting my local environment set up to use ASE / ORCA as the engine.

annulen commented 10 months ago

An issue I have observed in the past that could be related to what you're seeing is that sometimes the BFGS Hessian update results in the appearance of negative eigenvalues in the Hessian matrix.

FWIW I see this happening in ORCA too, both with model and pre-computed Hessians.

annulen commented 10 months ago

Interestingly, with --coordsys cart all eigenvalues at Step 0 are positive, while with --coordsys dlc there are three negative ones, similar to TRIC.

--coordsys cart:

Step    0 : Gradient = 3.257e-03/7.213e-03 (rms/max) Energy = -122.0432564665
Hessian Eigenvalues: 1.09285e-08 1.41624e-08 3.06648e-08 ... 9.28710e-01 1.04992e+00 1.05461e+00

--coordsys dlc:

Step    0 : Gradient = 3.257e-03/7.213e-03 (rms/max) Energy = -122.0432564665
Hessian Eigenvalues: -2.86597e-02 -8.31328e-03 -5.53354e-03 ... 4.20077e-01 4.20665e-01 4.30463e-01
annulen commented 10 months ago

Results: the Cartesian run converged after step 100, the DLC run after step 20. PMDA_chair_dimer2_cart.tar.gz PMDA_chair_dimer2_dlc.tar.gz

annulen commented 10 months ago

I've re-run DLC without --tmax 0.03; it managed to converge in 18 steps, however the convergence was very shaky, with lots of missteps and step quality reaching -25.

PMDA_chair_dimer2_dlc_2.tar.gz

annulen commented 10 months ago

An issue I have observed in the past that could be related to what you're seeing is that sometimes the BFGS Hessian update results in the appearance of negative eigenvalues in the Hessian matrix. I think BFGS is supposed to preserve positive definiteness so there may be something going on there.

I've found this in the manual (https://geometric.readthedocs.io/en/latest/how-it-works.html#bfgs-hessian-update):

A possible way to improve performance could be to modify the optimizer to perform a line search in cases where BFGS does not return a positive definite Hessian matrix, until the Wolfe conditions are satisfied.

So, it looks like you were already aware of this kind of issue.
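
For reference, the Wolfe conditions mentioned in that manual passage, as a minimal sketch (hypothetical helper; f0/f1 are the energies before and after a trial step of length alpha along a descent direction p, and g0/g1 are the corresponding directional derivatives grad·p):

def wolfe_conditions(f0, g0, f1, g1, alpha, c1=1e-4, c2=0.9):
    # Sufficient decrease (Armijo) condition.
    sufficient_decrease = f1 <= f0 + c1 * alpha * g0
    # Curvature condition; for a descent direction (g0 < 0) it guarantees
    # y^T s > 0, which keeps the BFGS update positive definite.
    curvature = g1 >= c2 * g0
    return sufficient_decrease and curvature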

leeping commented 10 months ago

Thanks a lot for this data. I think it's a clear example of a need for improving the optimization performance. I previously did not run into such an issue because I don't usually have an exact Hessian at the start of an energy minimization. It might take me a few weeks to resolve the problem, since it would require setting up the environment on my side, (possibly) making a test case that runs in a short time, and implementing the improvements themselves; the end of the term is getting pretty busy over here with graduate admissions and final exams, but this is important.

annulen commented 10 months ago

Okay, no problem. Thanks!

leeping commented 10 months ago

I have a few ideas for improving the performance or diagnosing problems for your job (or more generally, energy minimizations that have an exact Hessian at the start). You might be able to try some of these ideas before I can get to them.

1) Try running a job with --fdcheck yes. I'm suspecting that there could be a problem with the internal coordinate Hessian for this molecule that is not there for other molecules. It runs a total of four checks, and if there are any problems, the check_internal_hess function should print out significant differences between the analytic and finite-difference Hessian (say greater than 1e-3).
2) Try using the get_hessian_update_tsbfgs or the get_hessian_update_msp functions to get the Hessian updates. They are intended for TS optimization, and are designed to be used with Hessians that are not positive definite.
3) Try using get_delta_prime_rfo in place of get_delta_prime_trm for the optimization step. This is not very likely to succeed because I haven't extensively tested get_delta_prime_rfo and it might not be correct. This is closer to what the transition state optimization uses, and I have not had serious problems with providing an exact initial Hessian for TS optimizations.
4) Try to reproduce the problem using a very inexpensive level of theory, perhaps even using ASE/XTB; that would help with diagnosing and resolving the problem a lot faster.

Let me know if you're interested in trying these; otherwise, I should have more time by the end of this week to get to them.

annulen commented 10 months ago

Trying --fdcheck yes with --coordsys dlc and no tmax now (as that gave the quickest convergence, though still obviously flawed).

or more generally, energy minimizations that have an exact Hessian at the start

To be fair, my Hessian is not "exact" in the strict sense. This molecule was optimized at the B3LYP/6-31G** level, and the starting geometry and Hessian are taken from there. In my understanding the Hessian would be "exact" if it were computed at the same level of theory at the starting geometry; in that case it would also likely have large negative eigenvalues, with the corresponding modes pointing towards the minimum. This is not the case with my Hessian, which is positive definite initially.

annulen commented 10 months ago

I've tried xtb (GFN2) using ORCA (with the same optimizer settings as before) and geomeTRIC (with DLC coordinates) as optimizers, with the same starting geometry and Hessian. As a result, geomeTRIC converged faster: 31 steps vs 43 steps. geomeTRIC_xtb.tar.gz orca_xtb.tar.gz

annulen commented 10 months ago

Does fdcheck actually use computed gradients at the "auxiliary" points? If it could calculate only energies at those points, it would be 1.5–2 times faster.

leeping commented 10 months ago

Correct, at the moment geomeTRIC can only request gradients from external codes. In the next version I plan on adding the ability to request energies and analytic Hessians (if available).

annulen commented 10 months ago

Here are results of fdcheck/dlc run: fdcheck.tar.gz

annulen commented 10 months ago

Trying tsbfgs with this patch now:

diff --git a/geometric/step.py b/geometric/step.py
index d8c755d..0548648 100644
--- a/geometric/step.py
+++ b/geometric/step.py
@@ -357,7 +357,7 @@ def update_hessian(IC, H0, xyz_seq, gradx_seq, params, trust_limit=False, max_up
             if ts_bfgs: Hup = get_hessian_update_tsbfgs(Dy, Dg, H)
             else: Hup = get_hessian_update_msp(Dy, Dg, H, verbose=params.verbose)
         else:
-            Hup = get_hessian_update_bfgs(Dy, Dg, H)
+            Hup = get_hessian_update_tsbfgs(Dy, Dg, H)
         # Compute some diagnostics.
         if params.verbose:
             Eig = sorted_eigh(H, asc=True)[0]

Update: I needed to add the following patch to avoid errors when the code attempts to access self.H:

diff --git a/geometric/step.py b/geometric/step.py
index d8c755d..8cf3424 100644
--- a/geometric/step.py
+++ b/geometric/step.py
@@ -278,8 +278,8 @@ def get_hessian_update_tsbfgs(Dy, Dg, H): # pragma: no cover
     # yk = col(self.G - self.Gprev)
     yk = Dg
     dk = Dy
-    jk = yk - np.dot(self.H, dk)
-    B = force_positive_definite(self.H)
+    jk = yk - np.dot(H, dk)
+    B = force_positive_definite(H)
     # Scalar 1: dk^T |Bk| dk
     s1 = multi_dot([dk.T, B, dk])
     # Scalar 2: (yk^T dk)^2 + (dk^T |Bk| dk)^2
annulen commented 10 months ago

Results for the run with the above patches: tsbfgs.tar.gz

It didn't converge after 81 steps (unlike the get_hessian_update_bfgs version that converged after 18 steps, see https://github.com/leeping/geomeTRIC/issues/186#issuecomment-1817410336). Note that get_hessian_update_bfgs has 3 missteps early on. tsbfgs seems to be monotonic in the beginning, however it has missteps at steps 7 and 8, and from step 9 onward the tsbfgs steps have worse energy.

annulen commented 10 months ago

I've activated MSP updates with the following patch:

diff --git a/geometric/step.py b/geometric/step.py
index d8c755d..b8dad9d 100644
--- a/geometric/step.py
+++ b/geometric/step.py
@@ -357,7 +357,7 @@ def update_hessian(IC, H0, xyz_seq, gradx_seq, params, trust_limit=False, max_up
             if ts_bfgs: Hup = get_hessian_update_tsbfgs(Dy, Dg, H)
             else: Hup = get_hessian_update_msp(Dy, Dg, H, verbose=params.verbose)
         else:
-            Hup = get_hessian_update_bfgs(Dy, Dg, H)
+            Hup = get_hessian_update_msp(Dy, Dg, H, verbose=params.verbose)
         # Compute some diagnostics.
         if params.verbose:
             Eig = sorted_eigh(H, asc=True)[0]

Result: converged in 10 steps, which is much better than anything above! It's still worse than ORCA (7-8 steps, using BFGS), and it features 3 missteps and ends with two steps with positive energy changes. msp.tar.gz

annulen commented 10 months ago

Interestingly, MSP at step 8 was really close in energy to ORCA's final result (-122.0441832), however the last two steps raised the energy (of course this says nothing about geometrical proximity).

leeping commented 10 months ago

Thanks a lot. I think the cause of the bad performance could be one of the following:

1) Our BFGS implementation is unable to update a Hessian properly when there are negative eigenvalues. MSP works better, but it's mainly intended to be used for transition state optimizations. I need to do some research about what is supposed to happen during a BFGS update when the underlying PES is not positive definite.

2) The transformed Hessian has negative eigenvalues, which could indicate a problem with one of the ICs. The --fdcheck job prints out information in four blocks, and the second block might have the information I need. Could you share the output file you have so far?

annulen commented 10 months ago

Our BFGS implementation is unable to update a Hessian properly when there are negative eigenvalues. MSP works better, but it's mainly intended to be used for transition state optimizations. I need to do some research about what is supposed to happen during a BFGS update when the underlying PES is not positive definite.

If you are interested I can record Hessian updates made by ORCA after each step so you could compare their implementation of BFGS with yours.

Switching to MSP is indeed a bad idea for the generic case. I've actually seen it slowing down convergence in a different case, when I forgot that my geomeTRIC installation was patched.

The transformed Hessian has negative eigenvalues, which could indicate a problem with one of the ICs. The --fdcheck job prints out information in four blocks, and the second block might have the information I need. Could you share the output file you have so far?

I think the fdcheck.tar.gz I've attached above contained the complete log (as fdcheck/PMDA_chair_dimer2.log). Is anything missing from it?

leeping commented 10 months ago

Sorry, I must have missed that you posted the log above. Thanks!

At the moment I don't think we need to check the Hessian update vs. Orca. The step size and gradient in Orca will probably be different from those in geomeTRIC, due to differences in how the IC system is constructed. If you want to do this, you would need to pass Orca's Hessian, step, and gradient change into geomeTRIC's Hessian update function, and check that it gives the same update. However, I don't think that's the most likely cause of the problem.

From reading the log, I can see that the second derivatives of the internal coordinates w/r.t. the Cartesian coordinates have good agreement with finite difference. However, the second derivatives of the energy in internal coordinates have poor agreement with finite difference (see selected lines):

            IC1 Name             IC2 Name :       Analytic      Numerical       Abs-Diff
       Angle 8-19-28        Distance 3-11 :       0.007707      -0.000280       0.007987
        Distance 1-2        Distance 7-20 :       0.002636       0.000647       0.001989
       Angle 8-19-28        Angle 8-19-28 :       0.072896       0.111734      -0.038838
   Dihedral 6-1-8-19        Angle 8-19-28 :       0.001434       0.015125      -0.013691

Errors of this size are big enough to cause negative eigenvalues in the IC Hessian to appear when there shouldn't be any. There might be a bug in the construction of the IC Hessian but I think we need to try two more things to be sure:

  1. The finite difference test is currently doing the test on the primitive coordinate system. Could you try rerunning the test using the delocalized internal coordinate system? To do that, remove .Prims from these lines of code around line 965 in optimize.py:
    if fdcheck:
        IC.Prims.checkFiniteDifferenceGrad(coords)
        IC.Prims.checkFiniteDifferenceHess(coords)
        check_internal_grad(coords, M, IC.Prims, engine, dirname, verbose)
        check_internal_hess(coords, M, IC.Prims, engine, dirname, verbose)
        return
  2. It would also be good if the code printed out the eigenvalues of the finite difference internal coordinate Hessian (Hq_f). Could you try that? (It would involve adding some print statements around line 120 of ic_tools.py).

Thanks, and sorry for the trouble.

annulen commented 10 months ago

I've made another run with DLC, the regular BFGS update, and --check 1. Results are here: dlc_check.tar.gz Convergence is much smoother than without the check. Yes, it takes 19 steps instead of 18, but it converges to a lower energy, and there are far fewer missteps and no horrors like -25 quality.

annulen commented 10 months ago

My current hypothesis is that the initial ICs are suboptimal and quickly become unsuitable for stable calculations, and that this instability is causing the missteps and the re-appearance of negative Hessian eigenvalues. However, it still doesn't explain the overall slow convergence in comparison to ORCA.

annulen commented 10 months ago

Running fdcheck with the following patch now:

diff --git a/geometric/ic_tools.py b/geometric/ic_tools.py
index e1bcf8d..00fbe55 100644
--- a/geometric/ic_tools.py
+++ b/geometric/ic_tools.py
@@ -126,6 +126,10 @@ def check_internal_hess(coords, molecule, IC, engine, dirname, verbose=0):
     logger.info("Hessian Eigenvalues (Internal):\n")
     for i in range(len(Eigq)):
         logger.info("% 10.5f %s" % (Eigq[i], "\n" if i%9 == 8 else ""))
+    Eigqf = sorted(np.linalg.eigh(Hq_f)[0])
+    logger.info("Hessian Eigenvalues (Hq_f):\n")
+    for i in range(len(Eigqf)):
+        logger.info("% 10.5f %s" % (Eigqf[i], "\n" if i%9 == 8 else ""))
     return Hq, Hq_f

 def write_displacements(coords, M, IC, dirname, ic_select="all", displace_range=(-0.3, 0.3, 7), verbose=0):
diff --git a/geometric/optimize.py b/geometric/optimize.py
index 73d2733..c90dc17 100644
--- a/geometric/optimize.py
+++ b/geometric/optimize.py
@@ -959,10 +959,10 @@ def run_optimizer(**kwargs):

     fdcheck = kwargs.get('fdcheck', False) # Check internal coordinate gradients using finite difference..
     if fdcheck:
-        IC.Prims.checkFiniteDifferenceGrad(coords)
-        IC.Prims.checkFiniteDifferenceHess(coords)
-        check_internal_grad(coords, M, IC.Prims, engine, dirname, verbose)
-        check_internal_hess(coords, M, IC.Prims, engine, dirname, verbose)
+        IC.checkFiniteDifferenceGrad(coords)
+        IC.checkFiniteDifferenceHess(coords)
+        check_internal_grad(coords, M, IC, engine, dirname, verbose)
+        check_internal_hess(coords, M, IC, engine, dirname, verbose)
         return

     # Print out information about the coordinate system
annulen commented 10 months ago

Results of fdcheck with patch above: fdcheck2.tar.gz

leeping commented 10 months ago

Thanks a lot. From looking at your latest test, I can see a potential major issue with the IC Hessian.

Here are the lowest four eigenvalues of the IC Hessian generated by converting the Cartesian Hessian. In normal operation, geomeTRIC calculates the IC Hessian by converting a Cartesian Hessian that is either generated by finite difference on the Cartesian gradient or supplied by the user. (Note that fdcheck always calculates the Cartesian Hessian, so if you provided a Hessian file it is not used.)

Hessian Eigenvalues (Internal):
  -0.01015   -0.00244    0.00027    0.00127

Here are the lowest four eigenvalues of the IC Hessian generated using finite-difference on the IC gradient.

Hessian Eigenvalues (Hq_f):
   0.00046    0.00359    0.00396    0.00403

This indicates there may be something wrong with the conversion of the Cartesian Hessian into the IC Hessian, i.e. this function: https://github.com/leeping/geomeTRIC/blob/master/geometric/internal.py#L1986

That function is supposed to implement equation 6 in the attached paper for converting the Cartesian Hessian to the IC Hessian. That equation requires calculating $\mathbf{B}^\prime$, the second derivatives of the ICs w/r.t. Cartesians. The second of the four tests in fdcheck confirms that $\mathbf{B}^\prime$ is correct. However, the fourth test is failing, which indicates to me that $\mathbf{H}_q$ is not being assembled correctly. The mistake might be something like a matrix that was transposed when it shouldn't be.
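
For reference, a schematic numpy version of that conversion in its textbook form (hypothetical helper, not the actual calcHess code; B is the Wilson B-matrix dq/dx, Bp the second-derivative tensor d2q/dxdx, Hx and gx the Cartesian Hessian and gradient):

import numpy as np

def cart_hess_to_ic_hess(Hx, gx, B, Bp):
    # Generalized inverse of B; the IC gradient is g_q = (B^+)^T g_x.
    Binv = np.linalg.pinv(B)
    gq = Binv.T @ gx
    # K_ij = sum_k (g_q)_k * d^2 q_k / dx_i dx_j  (the B' term of Eq. 6).
    K = np.einsum('k,kij->ij', gq, Bp)
    # H_q = (B^+)^T (H_x - K) B^+
    return Binv.T @ (Hx - K) @ Binv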

To resolve the issue, it would be nice to try running the fdcheck with the above patch using an optimized geometry with relatively tight criteria (say gmax ~ 1e-5). If there is still significant disagreement in the Hessian, then we know the problem isn't solely due to the second term in parentheses in Eq. 6, i.e. it could be the matrix multiplications "outside" the parentheses.

I will also install ORCA to attempt to reproduce the issue on my side.

[image: equation 6 from the attached Schlegel paper]

1996-Schlegel-RedundantIC-GeoTS-JCC.pdf

annulen commented 10 months ago

I can think of a different approach: make a unit test for each part of the calculation to ensure that all operations produce correct results.

leeping commented 10 months ago

Some tests are located in https://github.com/leeping/geomeTRIC/blob/master/geometric/tests/test_hessian.py but evidently they did not catch this problem. One problem that complicates the standard unit test approach is that there are always numerical precision errors in comparing the finite difference vs. analytic gradient. For example, the fdcheck shows plenty of errors on the order of 1e-5 for the IC gradient, but I don't think that is a problem; on the other hand, I believe the appearance of negative eigenvalues in the IC Hessian is a real problem.

I think to find the bug, we need to first determine if there is a problem in the InternalCoordinates.calcHess() function (https://github.com/leeping/geomeTRIC/blob/master/geometric/internal.py#L1986). If that is indeed the source of the problem, we can fix the bug and get a better agreement between "the IC Hessian converted from Cartesians" and "the IC Hessian computed by finite difference". Next we can create a unit test to ensure it always produces the correct result.

annulen commented 10 months ago

One problem that complicates the standard unit test approach is that there are always numerical precision errors in comparing the finite difference vs. analytic gradient. For example, the fdcheck shows plenty of errors on the order of 1e-5 for the IC gradient, but I don't think that is a problem

FWIW that's what numpy.testing.assert_array_almost_equal is designed for. By default it compares arrays to 6 decimal places (roughly 1e-6 precision), but that can be configured.
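
For example, a unit test along those lines could look roughly like this (a sketch only; it assumes Hq and Hq_f are the two matrices returned by check_internal_hess, and the right tolerance is still up for discussion):

import numpy as np
import numpy.testing as npt

def check_ic_hessian_agreement(Hq, Hq_f, decimal=3):
    # Element-wise comparison of the converted IC Hessian (Hq) against the
    # finite-difference IC Hessian (Hq_f), to within about 10**-decimal.
    npt.assert_array_almost_equal(Hq, Hq_f, decimal=decimal)

# Hypothetical usage:
# Hq, Hq_f = check_internal_hess(coords, M, IC, engine, dirname, verbose)
# check_ic_hessian_agreement(Hq, Hq_f, decimal=3)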

leeping commented 10 months ago

Okay, I see what you mean. I think the test of the IC Hessian should pass if the "absolute differences" in the final column (below) are less than 1e-4, maybe 1e-3 (I'm not sure on the absolute threshold to use). The differences are certainly a lot bigger right now.

-=# Now checking Hessian of the energy in internal coordinates using finite difference on gradient #=-
            IC1 Name             IC2 Name :       Analytic      Numerical       Abs-Diff
               DLC 1                DLC 1 :       0.082989       0.163078      -0.080088
               DLC 1                DLC 2 :      -0.049717      -0.041980      -0.007737
               DLC 1                DLC 3 :       0.022504       0.007129       0.015374
               DLC 1                DLC 4 :       0.017262       0.013478       0.003784
               DLC 1                DLC 5 :      -0.024098      -0.020622      -0.003476
               DLC 1                DLC 6 :       0.043281       0.039945       0.003335
               DLC 1                DLC 7 :       0.012610       0.010526       0.002084
               DLC 1                DLC 8 :       0.015706       0.014942       0.000764
               DLC 1                DLC 9 :      -0.008275      -0.009837       0.001561
annulen commented 10 months ago

Results of fdcheck at minimum point: fdcheck_minimum.tar.gz

leeping commented 9 months ago

Thanks a lot. There is still a significant error. I will spend some time on reproducing the issue here and attempt a fix.

leeping commented 9 months ago

I've gotten your example to run on my server. I'd like to know whether it's possible to increase the accuracy of the ORCA gradient further, in order to eliminate other possible issues. The output below indicates a gradient error on the order of 1e-4, and I'd like to make it one order of magnitude lower. Could you suggest which parameters I should adjust? (I'm not an expert ORCA user.)

-=# Now checking gradient of the energy in internal coordinates vs. finite difference #=-
             IC Name :       Analytic      Numerical       Abs-Diff
               DLC 1 :       0.000020      -0.000093       0.000112
annulen commented 9 months ago

I'd like to know whether it's possible to increase the accuracy of the gradient ORCA further, in order to eliminate other possible issues.

It's possible to disable RIJCOSX by replacing defgrid3 nofinalgridx with NORI NOCOSX (letter case doesn't matter, btw). However, it will behave like a different level of theory, i.e. slightly different energy, slightly different minimum point, and so on.

Are you using ORCA 5.0.4? The manual says about version 5 that:

The COSX gradient has been strongly improved as well, resulting in an algorithm that is significantly faster and numerically more precise

BTW, it would be great if geomeTRIC could work correctly with lower-quality gradients. There are other QC methods which can introduce gradient noise, e.g. solvation models, and there are methods which don't have analytical gradients at all. I think it could make sense to add a debugging option to geomeTRIC that would add a pseudorandom noise vector of a chosen norm to the gradient, and use it to ensure that the optimizer can still converge reasonably.
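
For illustration, such an option could be as simple as the following sketch (hypothetical helper, not an existing geomeTRIC flag):

import numpy as np

def add_gradient_noise(gradx, noise_norm, rng=None):
    # Add a pseudorandom vector of fixed norm to a Cartesian gradient,
    # to test how robustly the optimizer converges with noisy gradients.
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(gradx.shape)
    noise *= noise_norm / np.linalg.norm(noise)
    return gradx + noise

# e.g. perturb the gradient by a total norm of 1e-4:
# gradx_noisy = add_gradient_noise(gradx, 1e-4, rng=np.random.default_rng(0))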

leeping commented 9 months ago

Thanks for the tip! I'm using ORCA 5.0.1 here; the version is older than 5.0.4 because I installed it a while ago.

I agree it would be great if geomeTRIC could converge reliably with low-quality gradients, but I am not aware of any software or code designed to do that. The underlying assumption of any gradient-based optimization code (that I know about) is that the gradient is "correct". There might be examples of optimization methods that use noisy gradients in the literature; if you or anyone could suggest some, I can look into it. It would be a major feature though.

annulen commented 9 months ago

I'm using ORCA 5.0.1 here, the version is older than 5.0.4 because I installed it a while ago.

I guess it should be fine. IIRC the only major thing fixed recently was a bug in D4, which is not used here.

I agree it would be great if geomeTRIC could converge reliably with low quality gradients but I am not aware of any software or code designed to do that.

However, it seems like ORCA can cope well with RIJCOSX gradients (at least for small molecules like this one). Maybe it's because of a better Hessian update, or maybe because of some fine-tuning of the RFO parameters.

leeping commented 9 months ago

Those are good points. If there is a bug in the Hessian conversion or the Hessian update, it will cause convergence issues whether the gradient is exact or not, so the current priority is to find whether there's a bug and fix it. Tuning parameters could also improve performance, but one needs to tune them in a way that improves performance for the currently-bad cases like this one without worsening performance for other cases where it's working well. We can revisit that later.

In the meantime, I got this result when comparing Hessian eigenvalues. Somehow, there isn't a significant difference in the smallest eigenvalues on my end. It could be because I was running on a different branch (interpolate). I will switch back to the master branch and try again. Some selected values are reproduced below.

Hessian Eigenvalues (Cartesian):
  -0.00009   -0.00005   -0.00000    0.00001    0.00008    0.00010    0.00017    0.00021    0.00034 
Hessian Eigenvalues (Internal):
   0.00057    0.00282    0.00378    0.00400    0.00406    0.00428    0.00482    0.00490    0.00563 
Hessian Eigenvalues (Hq_f):
  -0.00033    0.00102    0.00241    0.00289    0.00331    0.00364    0.00434    0.00492    0.00566