Closed wudihua closed 1 year ago
I am confused. How do you know that the cart=False version is wrong? That is in the Zmatrix format.
I am confused. How do you know that the cart=False version is wrong? That is in the Zmatrix format.
I use Gaussview to visualize the two files and they are different.
cart_output.gjf
output.gjf
Hi
For me the issue comes from the conversion into z-matrix and not from the Gaussian class. I cannot exactly say why but my guess is that your atoms are not well sorted and the conversion algorithm from cartesian coords to z-matrix fails. In your case successive atoms may be far away from each other. When the program tries to build the z-matrix it will consider distances, angles and dihedrals that do not correspond to "real" internal coordinate. As you have a lot of atoms, at some point, maybe you consider aligned atoms (or closed to be aligned), or another kind of wrong geometry for a z-matrix and from this point the geometry is wrong.
I sorted the atom of your nanotube according to the axis of the nanotube and at the end the final z-matrix is ok. Both gjf input files (in cartesian coordinates or z-matrix) are correctly written (see the picture).
Here is what I did to sort the atoms. It is not so elegant but to be quick I used pandas sorting facilities.
import pymatgen as mg
import numpy as np
import pandas as pd
# input.xyz is the xyz file you put in your message
mol = mg.Molecule.from_file("input.xyz")
# I rotated the nanotube around y axes to aligned the nanotube axis and the z-axis
# this is not exactly 30 degrees but it is not so far
op = mg.SymmOp.from_axis_angle_and_translation(axis=[0, 1, 0], angle=-30)
mol.apply_operation(op)
mol.to(fmt="xyz", filename="input2.xyz")
mol2 = mg.Molecule.from_file("input2.xyz")
coords = mol2.cart_coords
# geometric barycenter of the nanotube (without masses)
G = coords.sum(axis=0) / len(coords)
coords -= G
df = pd.DataFrame({"species": mol2.species, "x": coords[:, 0], "y": coords[:, 1], "z": coords[:, 2]})
# compute the radius of the nanotube
df["r"] = np.sqrt(df["x"]**2 + df["y"]**2)
# compute the angle theta with x axis.
df["theta"] = np.degrees(np.arctan2(df.y, df.x))
df["theta"] = np.where(df["theta"] < 0, 360 + df["theta"], df["theta"])
df.describe()
df.head()
# from now, you can use r, theta, z as cylindrical coordinates
# now the key part, I sort according first to z and second theta (maybe theta is not necessary
df.sort_values(by=["z", "theta"], ascending=True, inplace=True)
# set up the sorted molecule
coords = df[["x", "y", "z"]].values
m_sorted = mg.Molecule(species=df.species.values, coords=coords)
gau = mg.io.gaussian.GaussianInput(m_sorted)
gau.write_file("gau_sorted.gjf", cart_coords=False)
m_sorted.to("xyz", "sorted.xyz")
May be you do not need to compute radius and angle. Sorting according to z may be enough.
Thank you for your thorough response, it's very helpful. It also inspired me to recheck this issue.
After I looked at this strange problem closely, I think it's the atom 165 that causes the problem, the line is:
N 164 B164 163 A164 144 D164
However, 164, 163 and 144 is almost on a line, so it causes numerial precision issue when calculating dihedral angle and then the error accumulate.
The get_zmatix function (in pymatgen.io.gaussian.GaussianInput class) uses a function called _find_nn_pos_before_site(i) function, this function does not guarantee that the three closest atoms are not (approximatly) collinear.
I actually make the program work properly by use the following code (line 400 in pymatgen.io.gaussian):
#nn = self._find_nn_pos_before_site(i)
nn = [i-1,i-2,i-3]
But this would also not always work. I would suggest the program to check the nn[0] - nn[1] - nn[2] angle, if it's close to +/-180 or 0 degree then del nn[2] and recheck. If len(nn)<3 then raise error.
Another solution is to tackle the numerical precision issue in dihedral angle calculation, but I think use dihedral angle in dangerous region might not be an approprate practice.
Thanks for the debug. I think I know what is happening. But I need to think how best to solve this. There is no unique way of defining the z-matrix and in fact, for something like a nanotube, it is probably even clearer to define it relative to the center axis using virtual atoms since that allows symmetric angles to be defined precisely.
Describe the bug In some cases pymatgen.io.gaussian write_file() gives wrong output when cart_coords=False
To Reproduce Here is the code to reproduce this issue.
The input.xyz is the following:
Expected behavior output.gjf and cart_output.gjf should be the same, but only cart_output.gjf gives the right answer.
Desktop OS: Mac Version: 2020.8.13