Closed faribas closed 10 years ago
Starting with the 5-member ring species C1(C(=CC=[C]1)O)OO the reverse template generates the straight species C(=C=COO)C=[C]O when trying to find the degeneracy of that reaction, it tries to generate the reaction in the forward direction, by applying the forward template to the straight C(=C=COO)C=[C]O However, this only generates the 4-member ring species C1(=COO)C(=C[CH]1)O Thus, the reaction that we made in the reverse direction cannot be found in the forward direction.
Nb. Starting with the 4-membered ring runs fine in Py but kills the Java http://rmg/database/kinetics/reaction/reactant1=1%20C%200%20%7B2,S%7D%20%7B8,D%7D;2%20O%200%20%7B1,S%7D%20%7B4,S%7D;3%20O%200%20%7B5,S%7D;4%20O%200%20%7B2,S%7D;5%20C%201%20%7B3,S%7D%20%7B6,D%7D;6%20C%200%20%7B5,D%7D%20%7B7,S%7D;7%20C%200%20%7B6,S%7D%20%7B8,D%7D;8%20C%200%20%7B1,D%7D%20%7B7,D%7D;__product1=1%20C%200%20%7B2,S%7D%20%7B3,S%7D%20%7B5,D%7D;2%20C%200%20%7B1,S%7D%20%7B4,D%7D%20%7B7,S%7D;3%20C%201%20%7B1,S%7D%20%7B4,S%7D;4%20C%200%20%7B2,D%7D%20%7B3,S%7D;5%20C%200%20%7B1,D%7D%20%7B6,S%7D;6%20O%200%20%7B5,S%7D%20%7B8,S%7D;7%20O%200%20%7B2,S%7D;8%20O%200%20%7B6,S%7D;
Putting the fivering C1(C(=CC=[C]1)O)OO through the Java populatereactions gives three intra_R_add... reactions:
fivering(1) --> C5H5O3J(35) 1.617e+10 0.12 27.18 !Intra_R_Add_Exocyclic estimate: (Average:) [ R4_S_D , doublebond_intra_HDe_secNd , radadd_intra_cddouble ] deltaHrxn(T=298K) = -20.73 kcal/mol
fivering(1) --> C5H5O3J(36) 1.617e+10 0.12 27.18 !Intra_R_Add_Exocyclic estimate: (Average:) [ R4_D_D , doublebond_intra_NdNd_pri , radadd_intra_cdsingleNd ] deltaHrxn(T=298K) = -28.46 kcal/mol
C5H5O3J(37) --> fivering(1) 4.067e+08 0.90 30.82 !Intra_R_Add_Endocyclic estimate: (Average:) [ R5_DS_T , triplebond_intra_H , radadd_intra_cdsingleH ] deltaHrxn(T=298K) = -34.74 kcal/mol
35 is a crazy four-and-three-membered bicyclic species
36 is another one
37, which is found via Intra_R_Add_Endocyclic in the reverse direction (this matches the problem in the rmg-py) is some straight chain with a triple bond:
The straight chain species generated from the 5-member-ring by RMG-Py has atom *2
with two double bonds, which is atom type Cdd
. This does not match the top level node *2
multiplebond_intra
1 *2 {Cd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Cdd,Ct,Od} 0 {1,{D,T}}
which is why the kinetics cannot be found in the forward direction, and why RMG-Java does not generate the reaction.
Questions:
1) why does RMG-Py generate the reverse reaction making the Cdd
2) why does RMG-Py NOT generate the reverse reaction making the other species (37) found by RMG-Java?
3) (should in fact Cdd
atom types be allowed to react in this way??!)
Interestingly, the multiplebond_intra
group in the RMG-Java version of Intra_R_Add_Endocyclic (or Intra_R_Add_Exocyclic) does not have the Cdd in *3:
multiplebond_intra
1 *2 {Cd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Ct,Od} 0 {1,{D,T}}
As an aside, I found this in the forbidden group for exo (but not endo):
cdd2
1 *2 Cdd 0
One of the product templates RMG-Py is generating for Intra_R_Add_Endocyclic is
1 *1 Cd 0 {2,D} {5,S}
2 *4 Cd 0 {1,D} {3,S}
3 *5 R!H 0 {2,S} {4,D}
4 *2 {Cs,Cd} 1 {3,D} {5,{S,D}}
5 *3 {Cd,Os,CO,Cs} 0 {1,S} {4,{S,D}}
Note the inconsistency in atom 4: the atom types are only {Cs,Cd}, but the bonds allow for the possibility of Cdd. This structure is generated in part from the R5_DS_allenic
group:
1 *1 Cd 1 {2,D}
2 *4 Cd 0 {1,D}, {3,S}
3 *5 {R!H} 0 {2,S}, {4,D}
4 *2 {Cd,Ct,CO} 0 {3,D}, {5,{D,T}}
5 *3 {Cd,Ct,Od} 0 {4,{D,T}}
which shows even more inconsistency in atom 4, because it allows for the possibility of 5 bonds to a carbon. I'm thinking we should change this to
1 *1 Cd 1 {2,D}
2 *4 Cd 0 {1,D}, {3,S}
3 *5 {R!H} 0 {2,S}, {4,D}
4 *2 Cdd 0 {3,D}, {5,D}
5 *3 {Cd,Od} 0 {4,D}
Based on the presence of this group in the tree, I suspect the answer to question (3) above is yes, these reactions should be allowed.
There are three top level requirements, Rn
, multiplebond_intra
, and radadd_intra
.
The products generated by both RMG-Py and -Java both match Rn (the former is R5_DS_allenic, the latter is R5_DS), but the RMG-Py-generated straight species violates the multiplebond_intra
template (as in comment above).
The reaction template in RMG-Py is defined as Reaction(reactants=[<Entry index=1 label="Rn">], products=['RnCyclic'])
, so only the Rn
requirement is enforced when matching the template (and only this requirement is translated into the revers direction by rmgpy.data.KineticsFamily.generateProductTemplate
), not the other two.
Perhaps this explains question1 - why the reaction was proposed that violates the template?
Still not sure why it missed the valid reaction...
For Intra_R_Add_Endocyclic, in the forward direction (forming a ring), atom 3 is about to form a bond, so it makes no sense for it to start as Cdd, because then it would have 5 bonds. Sounds like the Java database has this right for multiplebond_intra
? Not sure about 2 - perhaps a Cdd here is unlikely to react for steric reasons, but at least it doesn't violate valency rules (as far as I can tell).
Starting RMG-Py with the 37 species made by the Java (see above) , it runs a while then dies with:
Exception: Unable to calculate degeneracy for reaction <Molecule C(=C[O])(C=C=CO)O"> <=> <Molecule "C1(C(=CO1)O)[C]=CO"> in reaction family Intra_R_Add_Exocyclic.
NB. this is now failing to find the degeneracy of the reaction making the 4-ring species in Intra_R_Add_Exocyclic
(original bug was failing to find the 5-ring in Endocyclic)
NB. See https://github.com/GreenGroup/RMG-database/commit/eefa6083a85c01a51080602809c7017644071712 for when the Cdd was added to the *3 position in RMG-Py database (and the comment two above here for why it probably shouldn't be?)
edit: reverting that database commit GreenGroup/RMG-database@eefa6083a85c01a51080602809c7017644071712 does not make the current issue go away
In the forward (ring-forming) direction, atom 3 both forms a single bond and has a double bond become a single bond; I think this is possible starting from Cdd (you end up with a Cd). Similarly, atom 2 gains a radical and has a double bond become a single bond, which I think is again possible from Cdd (to again get Cd). Note that, if you delete one of the bonds in the ring, RMG does correctly find the beta-scission:
Either way, I think R5_DS_allenic is at least part of the problem.
For whether Cdd should be allowed to react, we note that R5_DS_allenic_D is one of the very few (8?) pieces of data in this family (with a rate calculated by @gmagoon - we assue he put it in the right place): http://rmg.mit.edu/database/kinetics/families/Intra_R_Add_Endocyclic/rules/816/
However, the definition requires *2 to include Cdd, as @jwallen noted above. Using his definition of R5_DS_allenic_D, the original issue disappears, and is replaced with:
/Users/fariba/Code/RMG-Py/rmgpy/rmg/main.pyc in initialize(self, args)
299 for spec in self.initialSpecies:
300 spec.generateThermoData(self.database)
--> 301 self.reactionModel.enlarge([spec for spec in self.initialSpecies if spec.reactive])
302
303 # Save a restart file if desired
/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.pyc in enlarge(self, newObject)
538 # Find reactions involving the new species as unimolecular reactant
539 # or product (e.g. A <---> products)
--> 540 newReactions.extend(self.react(database, newSpecies))
541 # Find reactions involving the new species as bimolecular reactants
542 # or products with other core species (e.g. A + B <---> products)
/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.pyc in react(self, database, speciesA, speciesB)
486 if speciesB is None:
487 for moleculeA in speciesA.molecule:
--> 488 reactionList.extend(database.kinetics.generateReactions([moleculeA]))
489 moleculeA.clearLabeledAtoms()
490 moleculeA.makeHydrogensImplicit()
/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactions(self, reactants, products)
3118 reactionList = []
3119 reactionList.extend(self.generateReactionsFromLibraries(reactants, products))
-> 3120 reactionList.extend(self.generateReactionsFromFamilies(reactants, products))
3121 return reactionList
3122
/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactionsFromFamilies(self, reactants, products, only_families)
3167 for label, family in self.families.iteritems():
3168 if only_families is None or label in only_families:
-> 3169 reactionList.extend(family.generateReactions(reactants))
3170 if products:
3171 reactionList = filterReactions(reactants, products, reactionList)
/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactions(self, reactants)
2427 for reaction in reactionList:
2428 reaction.pairs = self.getReactionPairs(reaction)
-> 2429 reaction.template = self.getReactionTemplate(reaction)
2430 if hasattr(reaction,'reverse'):
2431 reaction.reverse.pairs = self.getReactionPairs(reaction.reverse)
/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in getReactionTemplate(self, reaction)
2701 describe the reaction.
2702 ''""
-> 2703 return self.groups.getReactionTemplate(reaction)
2704
2705 def getKineticsForTemplate(self, template, degeneracy=1, method='rate rules'):
/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in getReactionTemplate(self, reaction)
1355 #for product in reaction.products:
1356 # print product.toAdjacencyList() + '\n'
-> 1357 raise UndeterminableKineticsError(reaction)
1358
1359 return template
UndeterminableKineticsError: (Reaction(reactants=[Molecule(SMILES="[C](=CC=C=COO)O")], products=[Molecule(SMILES="C1(C(=CC=[C]1)O)OO")], pairs=[[Molecule(SMILES="[C](=CC=C=COO)O"), Molecule(SMILES="C1(C(=CC=[C]1)O)OO")]]), 'Kinetics could not be determined. ')
Reactant: Product:
UndeterminableKineticsError is raised here because we failed to match the second of the three templates, multiplebond_intra. As noted in above, this is because 2 could not be a Cdd with the old definition. Adding this (and removing it from 3) we have a new multiplebonod_intra:
1 *2 {Cd,Cdd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Ct,Od} 0 {1,{D,T}}
This seems to run without crashing, for the above reaction :-)
However, it then goes on to crash with
UndeterminableKineticsError: (Reaction(reactants=[Molecule(SMILES="C1C=C=CC1(O)O[O]")], products=[Molecule(SMILES="C12(CC=C([CH]1)OO2)O")], degeneracy=2, pairs=[[Molecule(SMILES="C1C=C=CC1(O)O[O]"), Molecule(SMILES="C12(CC=C([CH]1)OO2)O")]]), 'Kinetics could not be determined. ')
Again, unable to match multiplebonod_intra. This time the reactant is a cyclic and the product is a bicyclic thing
Ignore comments above about *3 not being able to start as a Cdd - that was @rwest being stupid (he is writing this)
I think this is closed by faribas/RMG-Database@80086c4bd890d4c29ca0c4d509212ab968d26a64
e85 job crashed again with this error message:
Exception: Unable to calculate degeneracy for reaction <Molecule "C(=C=C)[CH2]"> <=> <Molecule "C1C(=C)[CH]1"> in reaction family Intra_R_Add_Endocyclic
Apparently the bug isn't fixed yet!
Another exception error in degeneracy calculating:
Exception: Unable to calculate degeneracy for reaction <Molecule "C(C[CH2])C=C=C"> <=> <Molecule "C1CC(=C)[CH]C1"> in reaction family Intra_R_Add_Endocyclic.
Reactant:
Product:
Adding 'Cdd' to atom#3 :
index = 2,
label = "multiplebond_intra",
group =
"""
1 *2 {Cd,Cdd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Cdd,Ct,Od,Sd} 0 {1,{D,T}}
""",
Also adding an atom#6 to the 'R5/ R5SS/ R5SS-D' by defining new node and label may helps:
entry(
index = 28,
label = "R5_SS_D",
group =
"""
1 *1 R!H 1 {2,S}
2 *4 R!H 0 {1,S} {3,S}
3 *5 R!H 0 {2,S} {4,S}
4 *2 Cd 0 {3,S} {5,D}
5 *3 Cdd 0 {4,D} {6,D}
6 *6 Cd 0 {5,D}
"""
Seems working without crashing so far!
Job crashed again, but seems working by adding 'Cdd' to atom#3 in entry#27:
entry(
index = 27,
label = "R5_SS",
group =
"""
1 *1 R!H 1 {2,S}
2 *4 R!H 0 {1,S} {3,S}
3 *5 R!H 0 {2,S} {4,S}
4 *2 {Cd,Ct,CO} 0 {3,S} {5,{D,T}}
5 *3 {Cd,Ct,Od,Sd,Cdd} 0 {4,{D,T}}
"""
Also this one looks better for entry#28:
entry(
index = 28,
label = "R5_SS_D",
group =
"""
1 *1 R!H 1 {2,S}
2 *4 R!H 0 {1,S} {3,S}
3 *5 R!H 0 {2,S} {4,S}
4 *2 Cd 0 {3,S} {5,D}
5 *3 {Cd,Cdd} 0 {4,D}
"""
The RMG-Py job crashed again with the same error for this molecule:
Exception: Unable to calculate degeneracy for reaction <Molecule "C(=C[CH2])C=C=[CH]"> <=> <Molecule "C1C(=[CH])[CH]C=C1"> in reaction family Intra_R_Add_Endocyclic.
Reactant:
Product:
Not sure what caused this:
Occurs on the website too: http://rmg.mit.edu/database/kinetics/results/reactant1=1%20C%200%20%7B2,S%7D%20%7B5,S%7D%20%7B6,S%7D%20;2%20C%200%20%7B1,S%7D%20%7B3,D%7D%20%7B7,S%7D%20;3%20C%200%20%7B2,D%7D%20%7B4,S%7D%20;4%20C%200%20%7B3,S%7D%20%7B5,D%7D%20;5%20C%201%20%7B1,S%7D%20%7B4,D%7D%20;6%20O%200%20%7B1,S%7D%20%7B8,S%7D%20;7%20O%200%20%7B2,S%7D%20;8%20O%200%20%7B6,S%7D%20;