mcocdawc / chemcoord

A python module for manipulating cartesian and internal coordinates.
GNU Lesser General Public License v3.0
72 stars 19 forks source link

change the origin #93

Closed carltta closed 10 months ago

carltta commented 11 months ago

The program per default is setting the origin of the Z matrix to the atom closest to the geometric center of the molecule. However, I would like to chose the atom considered as origin, since in my case it doesn’t correspond always to the geometric center of the molecule. Do you have any suggestions how I could resolve it?

mcocdawc commented 11 months ago

It is indeed no problem to define yourself the beginning of a Z-matrix and let chemcoord figure out the rest.

The method that you need is get_construction_table, which calculates the construction table of a Z-matrix.¹ The crucial part is that you can pass incomplete construction tables to this method and it will figure out the missing pieces. The obtained construction table can then be passed to get_zmat which actually calculates the bond distances, angles, and dihedrals to obtain the Z-matrix.

There is a section about the definition of the construction table in the tutorial here.

If you look for the MIL53_small.xyz structure in chemcoord/Tutorial you can also try the following code:

import chemcoord as cc
import pandas as pd

m = cc.Cartesian.read_xyz('./MIL53_small.xyz', start_index=1)

c_table = pd.DataFrame(
             [[11, 'origin', 'e_z', 'e_x'],
              [7, 11, 'e_z', 'e_x'],
              [6, 7, 11, 'e_x']], columns=['i', 'b', 'a', 'd']).set_index('i')

# This Z-matrix starts with oxygen
m.get_zmat()

# This Z-matrix starts with chromium
m.get_zmat(m.get_construction_table([(m, c_table)]))

¹ A construction table is just the table of indices which defines which atom uses which atoms as reference. It is basically the ['b', 'a', 'd'] columns of a Z-matrix.

carltta commented 11 months ago

Great, thanks a lot!

How did you choose the indexes in the c_table? Because 11 is the index of the atom we want as origin, but 6 and 7?

mcocdawc commented 11 months ago

This was just an example where I used the order Cr - O - Cr, where O is the bridging oxygen. It made chemically the most sense to me, but you can choose differently.

I think¹ that you are forced to fill in at least the first three rows, i.e. the rows which are special in that they need references to absolute points in space.

¹ Very long time ago since I wrote that function.

carltta commented 11 months ago

I still have a problem. I have a xyz file that contains 38 atoms. When I run m.get_zmat() I obtain a Zmatrix with the default origin but containing all the 38 atoms. When I run m.get_zmat(m.get_construction_table([(m, c_table)])) the origin is placed where I want but the resulting matrix has only 14 atoms.

I have also tried to split the .get_zmat function in the following way:

import chemcoord as cc

filexyz = cc.Cartesian.read_xyz('filename.xyz', start_index=0) fileconstrtable = filexyz._get_frag_constr_table(start_atom=1) matrix = filexyz.get_zmat(construction_table=fileconstrtable)

but there are two issues:

  1. the function _get_frag_constr_table generates a table with only some atoms present in the xyz, actually exactly 14, defining this stable as the starting point for the .get_zmat function generates a matrix with only 14 atoms.
  2. we should use the function get_construction_table to "connect" _get_frag_constr_table with get_zmat, but I don't know how this could be possible.
mcocdawc commented 11 months ago

Hmm, that sounds like a bug or misuse if I understand it correctly from your description.¹ Could you attach the xyz file and your python code so that I can reproduce it. Note that if your xyz file consists of more than one fragment, then _get_frag_constr_table is supposed to only create the Z-matrix for one fragment. The Z-matrices for each fragment are then concatenated by code outside _get_frag_constr_table.

¹ Fully noting, that if my code can be easily misused, then it is equally buggy. :-)

carltta commented 11 months ago

I cannot upload the xyz file, so I just changed the format of the .xyz to a .txt. Core.txt

I tried with the code that you suggested and I tried with this code:


import chemcoord as cc

filexyz = cc.Cartesian.read_xyz('Core.xyz', start_index=0)

fileconstrtable = filexyz._get_frag_constr_table(start_atom=1)

matrix = filexyz.get_zmat(construction_table=fileconstrtable)
print(matrix)
mcocdawc commented 11 months ago

If you look at the output of filexyz.fragmentate() you will see that chemcoord does not detect a lot of the bonds in the system, i.e. it fragmentates into several single atom "molecules". This is because a bond is assumed to be present if the distance between two atoms is smaller/equal than the sum of their radii. (Whatever radius you use as a definition.) This was the conceptional problem.

The technical problem is that I really would advise against using _get_frag_constr_table directly. It operates only on single fragments¹ and does not yet return you valid construction table for your whole system. So this code is actually bound to fail.

The clean way way how to cure these problems is to define a larger radius for oxygen or zirkonium. If I do:

cc.constants.elements.loc['O', 'atomic_radius_cc']
cc.constants.elements.loc[:, 'old_atomic_radius_cc'] = cc.constants.elements.loc[:, 'atomic_radius_cc']
cc.constants.elements.loc['O', 'atomic_radius_cc'] = 0.8

I can do

molecule = cc.Cartesian.read_xyz('molecule.xyz', start_index=0)

guess_c_table = pd.DataFrame(
             [[1, 'origin', 'e_z', 'e_x'],
              [8, 1, 'e_z', 'e_x'],
              [2, 8, 1, 'e_x']], columns=['i', 'b', 'a', 'd']).set_index('i')

c_table = molecule.get_construction_table([(molecule, guess_c_table)])
molecule.get_zmat(c_table)

and get a meaningful Z-matrix with 38 atoms.

I think a warning in the get_zmat method about unreasonably small fragments would help a lot. Or what would have helped you there?

carltta commented 11 months ago

It makes sense. Thanks a lot! Here some additional comments:

guess_c_table = pd.DataFrame(
             [[1, 'origin', 'e_z', 'e_x'],
             columns=['i', 'b', 'a', 'd']).set_index('i')
mcocdawc commented 11 months ago
carltta commented 10 months ago

thanks a lot for the clarification and all the help!

No I don't think there is a bug, when running the function get_construction_table I get a complete matrix. It is incomplete only when get_zmat is initiated with a construction table obtained with _get_frag_constr_table.

mcocdawc commented 10 months ago

Great, happy to hear that. Good luck with your project.