mcocdawc / chemcoord

A python module for manipulating cartesian and internal coordinates.
GNU Lesser General Public License v3.0
71 stars 19 forks source link

get_zmat generates non-existent dihedral angles (version 2.0.4) #62

Closed em819 closed 4 years ago

em819 commented 4 years ago

Code Sample, a copy-pastable example if possible

#!/usr/bin/env python

import chemcoord as cc
import time
import sys

conformer=sys.argv[1]
try:
   print (cc.Cartesian.read_xyz(conformer, start_index=1).get_fragment())
   conformer_zmat=cc.Cartesian.read_xyz(conformer, start_index=1).get_zmat()
   print(conformer_zmat)
except (IndexError, cc.exceptions.UndefinedCoordinateSystem) as e:
   pass

Problem description

The application of the script above to a file containing the following xyz-content

   10

N      -1.57915021  -0.06132253  -0.34309454  
C      -0.47413435   0.40060129   0.44221961  
C       0.90760183  -0.17717667  -0.07552933  
O       1.00295805  -0.91163233  -0.98105044  
H      -1.61819529   0.18016030  -1.33279056  
H      -1.76810071  -1.06199213  -0.39151637  
H      -0.51682835   1.51565339   0.38772369  
H      -0.71687560   0.06261688   1.47754477  
O       2.01621600   0.26483846   0.64984337  
H       2.74650862  -0.21174666   0.16664981

generates the following z-matrix

atom b bond a angle d dihedral 2 C origin 0.762131 e_z 54.532484 e_x 139.805168 1 N 2 1.432184 e_z 130.544846 e_x -105.606281 3 C 2 1.584641 1 112.086795 e_x 4.245894 8 H 2 1.115820 1 104.077378 3 121.818309 7 H 2 1.117199 1 105.407283 3 -122.592496 5 H 1 1.019479 2 119.043062 3 62.028264 6 H 1 1.019503 2 119.052699 3 -60.864151 9 O 3 1.396627 2 114.019681 1 -179.053674 4 O 3 1.169823 2 123.566338 1 1.134983 10 H 9 0.996964 3 100.278990 2 -179.778371

The dihedral angle in the bold line corresponds to a dihedral angle of the form H-C-N-C which does not exist in the molecule given by the xyz-file. In this specific case it seems that during generation of the construction table the connection of the nitrogen atom the connection to the distant (2.5 angstrom) C-atom is chosen instead of the directly bonding H-atoms.

Expected Output

Output of cc.show_versions()

arifin-chemist89 commented 4 years ago

As my understanding regarding the Z-matrix, there is no problem with the H8-C2-N1-C3 dihedral angle. However, my concern in the example above is the index on the most-left column, where the bigger index atoms sometime come first. For example, the two top row in the Z-matrix: 2 C origin 0.762131 e_z 54.532484 e_x 139.805168 1 N 2 1.432184 e_z 130.544846 e_x -105.606281. I think this is critical because if there is no left index, the N atom is referring to atom number 2 which is itself, which makes this Z-matrix is invalid.

mcocdawc commented 4 years ago

@arifin-chemist89

I think this is critical because if there is no left index, the N atom is referring to atom number 2 which is itself, which makes this Z-matrix is invalid.

Indeed, if there was no explicit index the Z-matrix would be invalid. But since there is an explicit index, the Z-matrix is valid. FYI: Since there are a lot of programs, who use the row number implicitly as an index, I wrote the Zmat.change_numbering function, that switches to a row number based index if no argument is passed. (The b, a, d columns are changed accordingly.)

mcocdawc commented 4 years ago

@em819

Thank you for your bug-report and sorry for needing so long to reply. The bad news is, that is not really a bug in the implementation, but a conceptual problem in the algorithm. There had to be made a decision between doing depth-first search or breadth-first search when looking for new atoms in the Z-matrix and I went for breadth-first search. Chemically speaking if atom i is included in the Z-matrix, I first add the complete 1st coordination sphere of atom i to the Z-matrix before going to the 2nd coordination sphere. Usually this produces better results, but to a human chemist this example looks indeed wrong.. It would be desirable to dynamically detect such cases and change the searching strategy, but I have difficulties to put that into a fixed set of rules. So for the time being, I can only say, that you have have to construct the construction table for the first four atoms along the chain yourself.

Just to clarify a few things:

H-C-N-C which does not exist in the molecule given by the xyz-file.

If you forget chemistry and treat lengths, angles and dihedrals as spherical coordinates in the local coordinate system of each atom, any dihedral you want exist. The generated Z-matrix is still well defined and can be converted back and forth to cartesian coordinates. But: Of course it is chemically desirable to select the dihedral defining atom in the 3rd coordination sphere of an atom.