zadorlab / sella

A Python software package for saddle point optimization and minimization of atomic systems.
https://www.ecc-project.org/
Other
72 stars 21 forks source link

Internal coordinates issue for coordination complexes with unbound ligands #40

Closed samblau closed 5 months ago

samblau commented 5 months ago

I am currently using Sella to run tens of thousands of optimizations of metal-organic coordination complexes. in ~0.8% of cases, we see an error like this during the initialization of the internal coordinates:

Traceback (most recent call last):
  File "/global/cfs/cdirs/m4119/quacc_test/reprod_sella_error/test.py", line 34, in <module>
    dyn = Sella(test_atoms, internal=True, order=0)
  File "/global/homes/s/sblau/miniconda3/envs/quacc/lib/python3.10/site-packages/sella/optimize/optimize.py", line 84, in __init__
    self.initialize_pes(
  File "/global/homes/s/sblau/miniconda3/envs/quacc/lib/python3.10/site-packages/sella/optimize/optimize.py", line 186, in initialize_pes
    self.pes = InternalPES(
  File "/global/homes/s/sblau/miniconda3/envs/quacc/lib/python3.10/site-packages/sella/peswrapper.py", line 356, in __init__
    new_int.find_all_bonds()
  File "/global/homes/s/sblau/miniconda3/envs/quacc/lib/python3.10/site-packages/sella/internal.py", line 1396, in find_all_bonds
    c10y[i, nbonds[i]] = j
IndexError: index 20 is out of bounds for axis 1 with size 20`

This appears to occur for some structures where ligands are not obviously coordinated to the central metal atom. Since I apparently cannot attach XYZ files to this issue, I will provide one example structure here, which is a -2 singlet:

97

Cs 0.8069 -0.7816 -0.0384
C -0.8432 3.0645 1.9714
O -1.4727 2.0562 1.2207
C -2.6958 1.6709 1.7779
C -3.3611 0.6211 0.8858
O -2.8531 -0.6848 0.9676
C -2.9425 -1.3158 2.2265
C -1.5898 -1.4642 2.923
O -0.8883 -2.4962 2.2754
C 0.3468 -2.7726 2.887
H -1.463 3.9751 1.9841
H 0.1068 3.2632 1.4852
H -0.6577 2.7467 3.0023
H -3.3709 2.546 1.8456
H -2.5502 1.284 2.7977
H -3.2057 0.9137 -0.1543
H -4.4431 0.6126 1.1096
H -3.3221 -2.3284 2.0399
H -3.6476 -0.7859 2.8799
H -1.7538 -1.7222 3.983
H -1.0235 -0.5286 2.8812
H 0.1952 -3.3152 3.8333
H 0.9086 -1.8602 3.0908
H 0.9173 -3.3829 2.1937
C -0.808 6.3958 -1.7028
C -0.9962 5.1833 -2.5992
S -0.9068 3.5832 -1.6814
C 0.4483 2.7544 -2.5511
C 1.8224 3.3167 -2.2701
S 2.261 3.1636 -0.5131
C 2.5855 4.9303 -0.112
C 1.324 5.7635 0.0675
S 0.934 6.8345 -1.368
H -1.296 6.2449 -0.7411
H -1.2278 7.2837 -2.1785
H -0.2489 5.1575 -3.3887
H -1.9921 5.2303 -3.0443
H 0.2198 2.7833 -3.6206
H 0.4126 1.7235 -2.2028
H 1.9146 4.3645 -2.5572
H 2.5653 2.7132 -2.7942
H 3.2486 5.35 -0.8695
H 3.1242 4.8745 0.8326
H 1.4224 6.4321 0.9238
H 0.4623 5.1172 0.2118
O 4.0188 0.3985 -0.0002
P 3.5875 0.4499 -1.5443
O 4.3526 1.4449 -2.32
O 2.0955 0.3772 -1.6563
O 4.1671 -1.0762 -2.0156
P 3.3183 -2.2911 -1.413
O 4.4385 -3.4824 -1.5137
O 2.2485 -2.6594 -2.4156
O 2.9535 -2.0965 0.0095
H 3.6539 -0.4891 0.165
H 4.5573 -3.6477 -2.4475
N -2.9188 -0.5133 -7.0521
C -2.8241 -0.2577 -5.6956
C -2.3678 -1.2248 -4.7979
C -2.2506 -0.8946 -3.4577
N -2.56 0.2947 -2.9676
C -2.9663 1.2175 -3.8211
C -3.1225 1.0047 -5.1783
H -2.9518 -1.4848 -7.317
H -3.534 0.0913 -7.575
H -2.1129 -2.2161 -5.143
H -1.9094 -1.6279 -2.7403
H -3.1927 2.1828 -3.387
H -3.4667 1.7946 -5.8335
Cl 3.5146 -0.0899 4.0098
Si 1.95 0.8927 2.9923
Cl 0.4507 0.9149 4.9178
Cl 2.598 2.8941 3.3791
O 0.1543 -3.0423 -0.5744
S 0.3188 -3.5289 -1.9335
C -1.2446 -4.5378 -2.1336
C -1.413 -5.3608 -3.236
C -2.651 -5.909 -3.5209
C -3.7359 -5.6202 -2.706
C -3.5712 -4.805 -1.5968
C -2.3276 -4.268 -1.3122
C 1.2573 -5.0677 -1.7825
C 1.31 -5.7137 -0.5634
C 2.0348 -6.8858 -0.435
C 2.6923 -7.418 -1.5332
C 2.6497 -6.7565 -2.7502
C 1.9342 -5.5793 -2.8726
H -0.5615 -5.5649 -3.8705
H -2.7727 -6.5539 -4.3835
H -4.7078 -6.035 -2.9405
H -4.4179 -4.5784 -0.9609
H -2.1719 -3.6182 -0.4628
H 0.7854 -5.2768 0.2719
H 2.0925 -7.3865 0.5232
H 3.2536 -8.3409 -1.4372
H 3.1881 -7.1527 -3.6001
H 1.9196 -5.0241 -3.7969

Obviously I do not suggest using LennardJones for this type of system, but here is code that can reproduce the error quickly and easily reading in this structure:

from sella import Sella
from ase.calculators.lj import LennardJones
test_atoms = read("test.xyz")
test_atoms.calc = LennardJones()
dyn = Sella(test_atoms, internal=True, order=0)
dyn.run()

I could easily provide at least fifty other example structures if it would be helpful for debugging.

samblau commented 5 months ago

FYI @ehermes @mgt16-LANL @Andrew-S-Rosen

juditzador commented 5 months ago

This is not fixing the code, but one way around this, if indeed the problem is the unconnected part, is that you detect that the bond graph is disconnected before running Sella. You can get all internals by internals = Internals(myatoms) and then add missing connections either as bonds or by other types of coordinates - this could be automated then. Did you try something like this?

juditzador commented 5 months ago

Here adding manually a bond to connect a fragment to the central atom (1, 61) - it doesn't throw an error anymore, the optimization starts. Nevertheless, I understand that the code should in principle do this automatically, and it does by the flood algorithm. I'll see if I can find the bug.


from sella import Sella
from sella import Internals
from ase.calculators.lj import LennardJones
from ase.io import read
test_atoms = read("test.xyz")
internals = Internals(test_atoms)
internals.add_bond((1, 61))
internals.find_all_bonds()
internals.find_all_angles()
internals.find_all_dihedrals()
test_atoms.calc = LennardJones()
dyn = Sella(test_atoms, internal=internals, order=0)
dyn.run()
juditzador commented 5 months ago

I think I know what is going on.

    def find_all_bonds(
        self,
        nbond_cart_thr: int = 6,
        max_bonds: int = 20,
    ) -> None:

sets the number of bonds to be max 20. I'd say that's not a very strict limit, but...

If I run the above code with three modifications:

  1. Do not add manually that bond
  2. Allow for fragments
  3. Print the bonds i.e.:
    from sella import Sella
    from sella import Internals
    from ase.calculators.lj import LennardJones
    from ase.io import read
    test_atoms = read("test.xyz")
    internals = Internals(test_atoms, allow_fragments=True)
    internals.find_all_bonds()
    for bond in internals.internals['bonds']:
    print(bond)
    internals.find_all_angles()
    internals.find_all_dihedrals()
    test_atoms.calc = LennardJones()
    dyn = Sella(test_atoms, internal=internals, order=0)
    dyn.run()

    The following things happen: (1)+(2) results in a warning only and the optimization runs:

    /home/jzador/.local/lib/python3.10/site-packages/sella/internal.py:1559: UserWarning: 291 coords found! Expected 285.

    (285 is 3*97-6)

And (3) reveals that there are indeed 19 bonds to the central atom already. It is because I guess this structure is not simply fragmented, but with the ballooning radius of the flood algorithm at trial radius R you have still fragments, while at the next trial radius, R * 1.05, you suddenly have more than 20 bonds. This explains why it happens only occasionally. If something is oriented slightly differently, the inclusion of the fragment atoms is more gradual, but sometimes you get unlucky.

OK, what to do. A possible bandaid is to say allow_fragments=True and generate the internals the way I do it. The optimization is still running, and the LJ is not supposed to be good anyway, so you need to see if this bandaid is good for now.

I would imagine that raising the threshold for max_bonds is a bad idea, from a nicely connected easy system now you create a densely connected one, where the internal coordinates lost their chemical meaning. The real solution is to fix this bug, i.e., sella could go more gentle in adding atoms. Of course, if you have a perfect sphere with 20 atoms on its surface, each far from each other, and there is a 21st atom in the middle, it'll still fail, but then maybe you are not doing chemistry - your system is not like that. OK, too many words here, I'll see if there is a good fix and consult with @ehermes if he has ideas.

juditzador commented 5 months ago

Another working version of the script:

from sella import Sella
from sella import Internals
from ase.calculators.lj import LennardJones
from ase.io import read
test_atoms = read("test.xyz")
internals = Internals(test_atoms)
internals.find_all_bonds(max_bonds=40)
for bond in internals.internals['bonds']:
    print(bond)
internals.find_all_angles()
internals.find_all_dihedrals()
test_atoms.calc = LennardJones()
dyn = Sella(test_atoms, internal=internals, order=0)
dyn.run()

Here increasing max_bonds to 40 (you only need 22 in fact), but not allowing fragments.