Closed Eldrad-Ulthran closed 7 years ago
Your code can be written a lot more cleanly:
- fragmentate gives a one-element list; it is the unfragmented molecule itself. So just don't fragment in this case and use the molecule directly.

import chemcoord as cc
from itertools import product

import pandas as pd
import sympy

# Load the all-trans conformer as the starting geometry.
ester = cc.Cartesian.read_xyz('ttttt.xyz')

# Define the backbone references (bond, angle, dihedral) by hand so that
# the dihedrals of interest end up at known indices.
ester_c_table = pd.DataFrame([['origin', 'e_z', 'e_x'],
                              [0, 'e_z', 'e_x'],
                              [1, 0, 'e_x'],
                              [2, 1, 0],
                              [3, 2, 1],
                              [4, 3, 2],
                              [6, 4, 3],
                              [7, 6, 4]],
                             columns=['b', 'a', 'd'],
                             index=[0, 1, 2, 3, 4, 6, 7, 8])
ester_const_table = ester.get_construction_table(
    fragment_list=[(ester, ester_c_table)])
zester = ester.get_zmat(ester_const_table)

# Replace the five backbone dihedrals by symbols.
symbols = sympy.symbols('a, b, c, d, e')
symb_zester = zester.copy()
symb_zester.safe_loc[[3, 4, 6, 7, 8], 'dihedral'] = symbols

angles = {'t': 180, 'g+': 60, 'g-': 300}
convert = {'g+': 'g-', 'u': 'g-', 'g-': 'g+'}

# All letter codes; the central dihedral is always trans.
name_list = [[i, j, 't', k, l]
             for i, j, k, l in product(*[angles.keys() for _ in range(4)])]

# Mirror each code (g+ <-> g-) and drop it if its mirror image
# still appears later in the list.
for i, names in enumerate(name_list[:-1]):
    name_list[i] = [convert.get(x, x) for x in names]
    if name_list[i] in name_list[(i + 1):]:
        name_list[i] = []
name_list = [x for x in name_list if x]

# Substitute the angles and write one xyz file per conformer.
for names in name_list:
    substitutions = list(zip(symbols, [angles[v] for v in names]))
    conformer = symb_zester.subs(substitutions).get_cartesian()
    conformer.to_xyz(''.join(names) + '.xyz')

For your other question: No, there is no finished function in chemcoord that automatically builds the construction table around one specific carbon atom that becomes important for a specific chemical problem later on. That is not a problem of implementation: even a human can't construct a zmatrix for a specific problem if the problem is unknown at definition time.
What you can do:
- Subclass Cartesian and overwrite Cartesian.get_construction_table with your own function. This gives you a class streamlined for your problem.
- Write general functions such as Cartesian.get_functional_groups('ester'). I will happily accept them in a pull request.

Thank you very much for cleaning up my code! At first I tried to construct the zmatrix only up to the point of interest. However, as I've written above, this led to unexpected behaviour: some of the dihedral angles of the hydrogens were defined in such a way that they didn't follow the movement/rotation of the carbon atom they're attached to but stayed where they were. The result was a "broken" molecule. This behaviour can be circumvented by making sure that the hydrogens are referenced to the neighbouring C atoms. A small example to demonstrate what I mean:
Let's take butane C1-C2-C3-C4 as a substitute for my specific ester problem. In the automatically generated zmatrix the reference of the two H on C3 was "bond to C3, angle to C2, dihedral to C1". If now one rotates around the C2-C3 bond (dihedral C1C2C3C4), these H won't move. However, if the references of the two H on C3 are changed to "bond to C3, angle to C2, dihedral to C4" instead, they will move properly.
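The effect described for butane can be reproduced without chemcoord. The following is a minimal sketch in plain Python, assuming idealised bond lengths and angles; `place` and `butane_fragment` are hypothetical helpers written for this illustration, and the dihedral sign convention has not been checked against chemcoord. One hydrogen on C3 takes its dihedral reference from C1, the other from C4.

```python
import math


def _sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _unit(u):
    norm = math.sqrt(sum(x * x for x in u))
    return tuple(x / norm for x in u)


def distance(u, v):
    return math.sqrt(sum(x * x for x in _sub(u, v)))


def place(b, a, d, bond, angle, dihedral):
    """Place an atom bonded to b, with the angle to a and the dihedral to d.

    Angles are given in degrees. Hypothetical helper, not chemcoord API.
    """
    theta, phi = math.radians(angle), math.radians(dihedral)
    axis = _unit(_sub(b, a))                  # direction a -> b
    normal = _unit(_cross(_sub(a, d), axis))  # normal of the d-a-b plane
    in_plane = _cross(normal, axis)
    coeffs = (-bond * math.cos(theta),
              bond * math.sin(theta) * math.cos(phi),
              bond * math.sin(theta) * math.sin(phi))
    return tuple(b[i] + coeffs[0] * axis[i] + coeffs[1] * in_plane[i]
                 + coeffs[2] * normal[i] for i in range(3))


def butane_fragment(backbone_dihedral):
    """Build C1..C4 plus two H atoms on C3 with different dihedral references."""
    c1 = (0.0, 0.0, 0.0)
    c2 = (1.54, 0.0, 0.0)
    helper = (0.0, 1.0, 0.0)  # arbitrary point that fixes the C1-C2-C3 plane
    c3 = place(c2, c1, helper, 1.54, 111.0, 180.0)
    c4 = place(c3, c2, c1, 1.54, 111.0, backbone_dihedral)
    # "bond to C3, angle to C2, dihedral to C1"
    h_ref_c1 = place(c3, c2, c1, 1.09, 109.5, 60.0)
    # "bond to C3, angle to C2, dihedral to C4"
    h_ref_c4 = place(c3, c2, c4, 1.09, 109.5, 120.0)
    return {'C4': c4, 'H_ref_C1': h_ref_c1, 'H_ref_C4': h_ref_c4}


trans = butane_fragment(180.0)
gauche = butane_fragment(60.0)
for label in ('H_ref_C1', 'H_ref_C4'):
    shift = distance(trans[label], gauche[label])
    print(label, 'moved by', round(shift, 3), 'angstrom')
```

The hydrogen referenced to C1 does not move when the C1-C2-C3-C4 dihedral changes, while the one referenced to C4 rotates together with C4 and keeps its distance to it.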
About your suggestions: Do you have an idea for a starting point for my own function?
Ah sorry, now I got the point.
The problem is that "bond to C3, angle to C2, dihedral to C1" is, in my opinion, the a priori better (more chemical) selection of references. At least I would build up zmatrices this way. So this is definitely not a bug in the existing implementation, and the generic get_construction_table won't change.
BUT: If you find a selection of references that you feel is generically wrong, please let me know.
Tips for writing your function:
- Use get_bonds and don't use the cartesian space itself.
- Start from _get_frag_constr_table. This function assumes that the argument molecule cannot be fragmented. get_construction_table later on calls _get_frag_constr_table on each fragment and puts the pieces together. So if you have just one molecule, get_construction_table(...) is basically a call to _get_frag_constr_table.
- Look at Cartesian._give_val_sorted_bond_dict. This returns a connectivity table as a dict of sorted sets, sorted by valency, which came in quite handy during writing.
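As a rough illustration of working on connectivity only, the sketch below post-processes an existing construction table so that each hydrogen takes its dihedral reference from a heavy atom bonded to the same carbon. It uses plain dicts rather than chemcoord objects: the function name `rereference_hydrogens`, the `{atom: (b, a, d)}` layout and the `{atom: set_of_neighbours}` bond dict are assumptions made for this example, and converting to and from chemcoord's own tables is left out.

```python
def rereference_hydrogens(table, bonds, hydrogens):
    """Return a copy of table with the dihedral reference of hydrogens changed.

    table:     {atom: (b, a, d)} in construction order (hypothetical layout)
    bonds:     {atom: set of bonded atoms}
    hydrogens: set of atom indices whose dihedral reference may be replaced
    """
    new_table = {}
    for atom, (b, a, d) in table.items():
        if atom in hydrogens:
            # Heavy neighbours of b that are already constructed and are
            # not already used as the angle reference.
            candidates = sorted(
                j for j in bonds.get(b, ())
                if j not in hydrogens and j != a and j in new_table)
            if candidates:
                d = candidates[0]
        new_table[atom] = (b, a, d)
    return new_table


# Butane backbone C1..C4 = atoms 0..3, two hydrogens on C3 = atoms 4 and 5.
bonds = {0: {1}, 1: {0, 2}, 2: {1, 3, 4, 5}, 3: {2}, 4: {2}, 5: {2}}
table = {0: ('origin', 'e_z', 'e_x'),
         1: (0, 'e_z', 'e_x'),
         2: (1, 0, 'e_x'),
         3: (2, 1, 0),
         4: (2, 1, 0),   # "bond to C3, angle to C2, dihedral to C1"
         5: (2, 1, 0)}
fixed = rereference_hydrogens(table, bonds, hydrogens={4, 5})
print(fixed[4], fixed[5])
```

Hydrogens whose carbon has no other constructed heavy neighbour keep their original reference, so the construction order of the table is never violated.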
I'd like to ask for suggestions on how to improve my workflow, because at the moment I have to do too much manual work, which I would like to automate.
Task
I want to create a set of conformers of chain-like organic molecules (esters) and save them as input Cartesians. Starting with propyl butanoate I want to systematically enlarge the alkyl chains.
Strategy so far
ttttt.txt
Using an all-trans conformer as a starting point (all dihedrals = 180°), I want to change the values of the dihedrals of interest (5 in total) to 60°, 180° or 300°. For that I load the input Cartesian and set up my own zmatrix in order to know which dihedrals I have to change:
Now the "backbone" dihedrals are replaced by symbols:
Next a list of names for each conformer is created, which is used to generate the torsional angles afterwards. Enantiomeric pairs are deleted.
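One possible way to build such a name list is sketched below in plain Python. It assumes, as in the thread, that the central dihedral stays trans and that the mirror image of a conformer is obtained by swapping g+ and g-; the function name `unique_conformer_names` is made up for this example. Unlike the code in the thread, it keeps the first member of each enantiomeric pair that it encounters.

```python
from itertools import product

MIRROR = {'t': 't', 'g+': 'g-', 'g-': 'g+'}


def unique_conformer_names(labels=('t', 'g+', 'g-'), n_free=4):
    """Enumerate letter codes with a fixed central 't', one per mirror pair."""
    half = n_free // 2
    seen = set()
    unique = []
    for combo in product(labels, repeat=n_free):
        names = combo[:half] + ('t',) + combo[half:]
        mirrored = tuple(MIRROR[x] for x in names)
        if mirrored in seen:
            continue  # the enantiomer is already in the list
        seen.add(names)
        unique.append(names)
    return unique


name_list = unique_conformer_names()
print(len(name_list), 'conformers, for example', ''.join(name_list[1]))
```

Using a set of already accepted codes avoids the repeated list searches and the in-place editing of the list while looping over it.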
Depending on the letter code the torsional angle is substituted and saved as a Cartesian input file.
Problem description
If I used a completely automatically generated zmatrix, I would see no way to predict which indices the torsional angles of interest will have. Therefore I need to set up the first few atoms of the zmatrix manually. In order to do so I need to visualize the molecule, work out which C atom is connected to which, and define the connectivity properly. However, when I first let the hydrogens be defined automatically, I realized that not all of them were moved correctly together with their corresponding C atoms, because their dihedrals had unfortunate references. As a result, some of them crashed into the carbons. Therefore I had to explicitly define the connectivity of each and every H atom. This manual work is very tedious and scales very badly to larger molecules. It would be great if it were possible to make my approach more generally applicable. I would be very glad for any suggestions for improvement!