ReactionMechanismGenerator / RMG-Py

Python version of the amazing Reaction Mechanism Generator (RMG).
http://reactionmechanismgenerator.github.io/RMG-Py/
Other
396 stars 228 forks source link

Exception: Unable to calculate degeneracy in reaction family Intra_R_Add_Endocyclic #69

Closed faribas closed 10 years ago

faribas commented 12 years ago

Not sure what caused this:

Exploring isomer C1(C(=CC=[C]1)O)OO(13355) in pressure-dependent network #14737
Traceback (most recent call last):
  File "rmg.py", line 142, in <module>
    rmg.execute(args)
  File "/Users/fariba/Code/RMG-Py/rmgpy/rmg/main.py", line 386, in execute
    self.reactionModel.enlarge(objectsToEnlarge)
  File "/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.py", line 559, in enlarge
    newReactions.extend(pdepNetwork.exploreIsomer(newSpecies, self, database))
  File "/Users/fariba/Code/RMG-Py/rmgpy/rmg/pdep.py", line 205, in exploreIsomer
    newReactionList = reactionModel.react(database, isomer)
  File "/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.py", line 488, in react
    reactionList.extend(database.kinetics.generateReactions([moleculeA]))
  File "/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.py", line 3118, in generateReactions
    reactionList.extend(self.generateReactionsFromFamilies(reactants, products))
  File "/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.py", line 3167, in generateReactionsFromFamilies
    reactionList.extend(family.generateReactions(reactants))
  File "/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.py", line 2420, in generateReactions
    reaction.degeneracy = self.calculateDegeneracy(reaction)
  File "/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.py", line 2469, in calculateDegeneracy
    raise Exception('Unable to calculate degeneracy for reaction {0} in reaction family {1}.'.format(reaction, self.label))
Exception: Unable to calculate degeneracy for reaction <Molecule "C(=C=COO)C=[C]O"> <=> <Molecule "C1(C(=CC=[C]1)O)OO"> in reaction family Intra_R_Add_Endocyclic.

Occurs on the website too: http://rmg.mit.edu/database/kinetics/results/reactant1=1%20C%200%20%7B2,S%7D%20%7B5,S%7D%20%7B6,S%7D%20;2%20C%200%20%7B1,S%7D%20%7B3,D%7D%20%7B7,S%7D%20;3%20C%200%20%7B2,D%7D%20%7B4,S%7D%20;4%20C%200%20%7B3,S%7D%20%7B5,D%7D%20;5%20C%201%20%7B1,S%7D%20%7B4,D%7D%20;6%20O%200%20%7B1,S%7D%20%7B8,S%7D%20;7%20O%200%20%7B2,S%7D%20;8%20O%200%20%7B6,S%7D%20;

faribas commented 12 years ago

Starting with the 5-member ring species C1(C(=CC=[C]1)O)OO the reverse template generates the straight species C(=C=COO)C=[C]O when trying to find the degeneracy of that reaction, it tries to generate the reaction in the forward direction, by applying the forward template to the straight C(=C=COO)C=[C]O However, this only generates the 4-member ring species C1(=COO)C(=C[CH]1)O Thus, the reaction that we made in the reverse direction cannot be found in the forward direction.

rwest commented 12 years ago

Nb. Starting with the 4-membered ring runs fine in Py but kills the Java http://rmg/database/kinetics/reaction/reactant1=1%20C%200%20%7B2,S%7D%20%7B8,D%7D;2%20O%200%20%7B1,S%7D%20%7B4,S%7D;3%20O%200%20%7B5,S%7D;4%20O%200%20%7B2,S%7D;5%20C%201%20%7B3,S%7D%20%7B6,D%7D;6%20C%200%20%7B5,D%7D%20%7B7,S%7D;7%20C%200%20%7B6,S%7D%20%7B8,D%7D;8%20C%200%20%7B1,D%7D%20%7B7,D%7D;__product1=1%20C%200%20%7B2,S%7D%20%7B3,S%7D%20%7B5,D%7D;2%20C%200%20%7B1,S%7D%20%7B4,D%7D%20%7B7,S%7D;3%20C%201%20%7B1,S%7D%20%7B4,S%7D;4%20C%200%20%7B2,D%7D%20%7B3,S%7D;5%20C%200%20%7B1,D%7D%20%7B6,S%7D;6%20O%200%20%7B5,S%7D%20%7B8,S%7D;7%20O%200%20%7B2,S%7D;8%20O%200%20%7B6,S%7D;

faribas commented 12 years ago

Putting the fivering C1(C(=CC=[C]1)O)OO through the Java populatereactions gives three intra_R_add... reactions:

fivering(1) --> C5H5O3J(35) 1.617e+10     0.12    27.18     !Intra_R_Add_Exocyclic estimate: (Average:)  [ R4_S_D , doublebond_intra_HDe_secNd , radadd_intra_cddouble ]    deltaHrxn(T=298K) = -20.73 kcal/mol
fivering(1) --> C5H5O3J(36) 1.617e+10     0.12    27.18     !Intra_R_Add_Exocyclic estimate: (Average:)  [ R4_D_D , doublebond_intra_NdNd_pri , radadd_intra_cdsingleNd ]   deltaHrxn(T=298K) = -28.46 kcal/mol
C5H5O3J(37) --> fivering(1) 4.067e+08     0.90    30.82     !Intra_R_Add_Endocyclic estimate: (Average:)  [ R5_DS_T , triplebond_intra_H , radadd_intra_cdsingleH ] deltaHrxn(T=298K) = -34.74 kcal/mol

35 is a crazy four-and-three-membered bicyclic species 35

36 is another one 36

37, which is found via Intra_R_Add_Endocyclic in the reverse direction (this matches the problem in the rmg-py) is some straight chain with a triple bond: 37

faribas commented 12 years ago

The straight chain species generated from the 5-member-ring by RMG-Py has atom *2 with two double bonds, which is atom type Cdd. This does not match the top level node *2

multiplebond_intra
1  *2 {Cd,Ct,CO} 0 {2,{D,T}}
2  *3 {Cd,Cdd,Ct,Od} 0 {1,{D,T}}

which is why the kinetics cannot be found in the forward direction, and why RMG-Java does not generate the reaction.

Questions:

1) why does RMG-Py generate the reverse reaction making the Cdd 2) why does RMG-Py NOT generate the reverse reaction making the other species (37) found by RMG-Java? 3) (should in fact Cdd atom types be allowed to react in this way??!)

jwallen commented 12 years ago

Interestingly, the multiplebond_intra group in the RMG-Java version of Intra_R_Add_Endocyclic (or Intra_R_Add_Exocyclic) does not have the Cdd in *3:

multiplebond_intra
1 *2 {Cd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Ct,Od} 0 {1,{D,T}}

As an aside, I found this in the forbidden group for exo (but not endo):

cdd2
1 *2 Cdd 0 
jwallen commented 12 years ago

One of the product templates RMG-Py is generating for Intra_R_Add_Endocyclic is

1  *1 Cd 0 {2,D} {5,S}
2  *4 Cd 0 {1,D} {3,S}
3  *5 R!H 0 {2,S} {4,D}
4  *2 {Cs,Cd} 1 {3,D} {5,{S,D}}
5  *3 {Cd,Os,CO,Cs} 0 {1,S} {4,{S,D}}

Note the inconsistency in atom 4: the atom types are only {Cs,Cd}, but the bonds allow for the possibility of Cdd. This structure is generated in part from the R5_DS_allenic group:

1 *1 Cd 1 {2,D}
2 *4 Cd 0 {1,D}, {3,S}
3 *5 {R!H} 0 {2,S}, {4,D}
4 *2 {Cd,Ct,CO} 0 {3,D}, {5,{D,T}}
5 *3 {Cd,Ct,Od} 0 {4,{D,T}}

which shows even more inconsistency in atom 4, because it allows for the possibility of 5 bonds to a carbon. I'm thinking we should change this to

1 *1 Cd 1 {2,D}
2 *4 Cd 0 {1,D}, {3,S}
3 *5 {R!H} 0 {2,S}, {4,D}
4 *2 Cdd 0 {3,D}, {5,D}
5 *3 {Cd,Od} 0 {4,D}

Based on the presence of this group in the tree, I suspect the answer to question (3) above is yes, these reactions should be allowed.

faribas commented 12 years ago

There are three top level requirements, Rn, multiplebond_intra, and radadd_intra. The products generated by both RMG-Py and -Java both match Rn (the former is R5_DS_allenic, the latter is R5_DS), but the RMG-Py-generated straight species violates the multiplebond_intra template (as in comment above). The reaction template in RMG-Py is defined as Reaction(reactants=[<Entry index=1 label="Rn">], products=['RnCyclic']), so only the Rn requirement is enforced when matching the template (and only this requirement is translated into the revers direction by rmgpy.data.KineticsFamily.generateProductTemplate), not the other two. Perhaps this explains question1 - why the reaction was proposed that violates the template? Still not sure why it missed the valid reaction...

faribas commented 12 years ago

For Intra_R_Add_Endocyclic, in the forward direction (forming a ring), atom 3 is about to form a bond, so it makes no sense for it to start as Cdd, because then it would have 5 bonds. Sounds like the Java database has this right for multiplebond_intra? Not sure about 2 - perhaps a Cdd here is unlikely to react for steric reasons, but at least it doesn't violate valency rules (as far as I can tell).

faribas commented 12 years ago

Starting RMG-Py with the 37 species made by the Java (see above) , it runs a while then dies with: Exception: Unable to calculate degeneracy for reaction <Molecule C(=C[O])(C=C=CO)O"> <=> <Molecule "C1(C(=CO1)O)[C]=CO"> in reaction family Intra_R_Add_Exocyclic. NB. this is now failing to find the degeneracy of the reaction making the 4-ring species in Intra_R_Add_Exocyclic (original bug was failing to find the 5-ring in Endocyclic)

faribas commented 12 years ago

NB. See https://github.com/GreenGroup/RMG-database/commit/eefa6083a85c01a51080602809c7017644071712 for when the Cdd was added to the *3 position in RMG-Py database (and the comment two above here for why it probably shouldn't be?)

edit: reverting that database commit GreenGroup/RMG-database@eefa6083a85c01a51080602809c7017644071712 does not make the current issue go away

jwallen commented 12 years ago

In the forward (ring-forming) direction, atom 3 both forms a single bond and has a double bond become a single bond; I think this is possible starting from Cdd (you end up with a Cd). Similarly, atom 2 gains a radical and has a double bond become a single bond, which I think is again possible from Cdd (to again get Cd). Note that, if you delete one of the bonds in the ring, RMG does correctly find the beta-scission:

http://rmg.mit.edu/database/kinetics/reaction/reactant1=1%20C%200%20%7B3,D%7D%20%7B4,S%7D;2%20C%200%20%7B3,D%7D;3%20C%200%20%7B1,D%7D%20%7B2,D%7D;4%20O%200%20%7B1,S%7D%20%7B5,S%7D;5%20O%200%20%7B4,S%7D;__reactant2=1%20C%200%20%7B2,D%7D;2%20C%201%20%7B1,D%7D%20%7B3,S%7D;3%20O%200%20%7B2,S%7D;__product1=1%20C%200%20%7B2,S%7D%20%7B5,S%7D%20%7B6,S%7D;2%20C%200%20%7B1,S%7D%20%7B3,D%7D%20%7B7,S%7D;3%20C%200%20%7B2,D%7D;4%20C%200%20%7B5,D%7D;5%20C%201%20%7B1,S%7D%20%7B4,D%7D;6%20O%200%20%7B1,S%7D%20%7B8,S%7D;7%20O%200%20%7B2,S%7D;8%20O%200%20%7B6,S%7D;

Either way, I think R5_DS_allenic is at least part of the problem.

faribas commented 12 years ago

For whether Cdd should be allowed to react, we note that R5_DS_allenic_D is one of the very few (8?) pieces of data in this family (with a rate calculated by @gmagoon - we assue he put it in the right place): http://rmg.mit.edu/database/kinetics/families/Intra_R_Add_Endocyclic/rules/816/

However, the definition requires *2 to include Cdd, as @jwallen noted above. Using his definition of R5_DS_allenic_D, the original issue disappears, and is replaced with:

/Users/fariba/Code/RMG-Py/rmgpy/rmg/main.pyc in initialize(self, args)
    299             for spec in self.initialSpecies:
    300                 spec.generateThermoData(self.database)
--> 301             self.reactionModel.enlarge([spec for spec in self.initialSpecies if spec.reactive])
    302 
    303             # Save a restart file if desired

/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.pyc in enlarge(self, newObject)
    538                     # Find reactions involving the new species as unimolecular reactant

    539                     # or product (e.g. A <---> products)

--> 540                     newReactions.extend(self.react(database, newSpecies))
    541                     # Find reactions involving the new species as bimolecular reactants

    542                     # or products with other core species (e.g. A + B <---> products)

/Users/fariba/Code/RMG-Py/rmgpy/rmg/model.pyc in react(self, database, speciesA, speciesB)
    486         if speciesB is None:
    487             for moleculeA in speciesA.molecule:
--> 488                 reactionList.extend(database.kinetics.generateReactions([moleculeA]))
    489                 moleculeA.clearLabeledAtoms()
    490                 moleculeA.makeHydrogensImplicit()

/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactions(self, reactants, products)
   3118         reactionList = []
   3119         reactionList.extend(self.generateReactionsFromLibraries(reactants, products))
-> 3120         reactionList.extend(self.generateReactionsFromFamilies(reactants, products))
   3121         return reactionList
   3122 

/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactionsFromFamilies(self, reactants, products, only_families)
   3167         for label, family in self.families.iteritems():
   3168             if only_families is None or label in only_families:
-> 3169                 reactionList.extend(family.generateReactions(reactants))
   3170         if products:
   3171             reactionList = filterReactions(reactants, products, reactionList)

/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in generateReactions(self, reactants)
   2427         for reaction in reactionList:
   2428             reaction.pairs = self.getReactionPairs(reaction)
-> 2429             reaction.template = self.getReactionTemplate(reaction)
   2430             if hasattr(reaction,'reverse'):
   2431                 reaction.reverse.pairs = self.getReactionPairs(reaction.reverse)

/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in getReactionTemplate(self, reaction)
   2701         describe the reaction.
   2702         ''""
-> 2703         return self.groups.getReactionTemplate(reaction)
   2704 
   2705     def getKineticsForTemplate(self, template, degeneracy=1, method='rate rules'):

/Users/fariba/Code/RMG-Py/rmgpy/data/kinetics.pyc in getReactionTemplate(self, reaction)
   1355             #for product in reaction.products:

   1356             #    print product.toAdjacencyList() + '\n'

-> 1357             raise UndeterminableKineticsError(reaction)
   1358 
   1359         return template

UndeterminableKineticsError: (Reaction(reactants=[Molecule(SMILES="[C](=CC=C=COO)O")], products=[Molecule(SMILES="C1(C(=CC=[C]1)O)OO")], pairs=[[Molecule(SMILES="[C](=CC=C=COO)O"), Molecule(SMILES="C1(C(=CC=[C]1)O)OO")]]), 'Kinetics could not be determined. ')

Reactant: r Product: p

faribas commented 12 years ago

UndeterminableKineticsError is raised here because we failed to match the second of the three templates, multiplebond_intra. As noted in above, this is because 2 could not be a Cdd with the old definition. Adding this (and removing it from 3) we have a new multiplebonod_intra:

1  *2 {Cd,Cdd,Ct,CO} 0 {2,{D,T}}
2  *3 {Cd,Ct,Od} 0 {1,{D,T}}

This seems to run without crashing, for the above reaction :-)

However, it then goes on to crash with UndeterminableKineticsError: (Reaction(reactants=[Molecule(SMILES="C1C=C=CC1(O)O[O]")], products=[Molecule(SMILES="C12(CC=C([CH]1)OO2)O")], degeneracy=2, pairs=[[Molecule(SMILES="C1C=C=CC1(O)O[O]"), Molecule(SMILES="C12(CC=C([CH]1)OO2)O")]]), 'Kinetics could not be determined. ')

Again, unable to match multiplebonod_intra. This time the reactant is a cyclic and the product is a bicyclic thing

faribas commented 12 years ago

Ignore comments above about *3 not being able to start as a Cdd - that was @rwest being stupid (he is writing this)

faribas commented 12 years ago

I think this is closed by faribas/RMG-Database@80086c4bd890d4c29ca0c4d509212ab968d26a64

faribas commented 11 years ago

e85 job crashed again with this error message:

Exception: Unable to calculate degeneracy for reaction  <Molecule "C(=C=C)[CH2]"> <=> <Molecule "C1C(=C)[CH]1"> in reaction family Intra_R_Add_Endocyclic

Apparently the bug isn't fixed yet!

faribas commented 11 years ago

Another exception error in degeneracy calculating:

Exception: Unable to calculate degeneracy for reaction <Molecule "C(C[CH2])C=C=C"> <=> <Molecule "C1CC(=C)[CH]C1"> in reaction family Intra_R_Add_Endocyclic.

Reactant: r

Product: p

Adding 'Cdd' to atom#3 :

    index = 2,
    label = "multiplebond_intra",
    group = 
"""
1 *2 {Cd,Cdd,Ct,CO} 0 {2,{D,T}}
2 *3 {Cd,Cdd,Ct,Od,Sd} 0 {1,{D,T}}
""",

Also adding an atom#6 to the 'R5/ R5SS/ R5SS-D' by defining new node and label may helps:

entry(
    index = 28,
    label = "R5_SS_D",
    group = 
"""
1 *1 R!H       1 {2,S}
2 *4 R!H       0 {1,S} {3,S}
3 *5 R!H       0 {2,S} {4,S}
4 *2 Cd        0 {3,S} {5,D}
5 *3 Cdd       0 {4,D} {6,D}
6 *6 Cd        0 {5,D}
"""

Seems working without crashing so far!

faribas commented 11 years ago

Job crashed again, but seems working by adding 'Cdd' to atom#3 in entry#27:

entry(
    index = 27,
    label = "R5_SS",
    group = 
"""
1 *1 R!H           1 {2,S}
2 *4 R!H           0 {1,S} {3,S}
3 *5 R!H           0 {2,S} {4,S}
4 *2 {Cd,Ct,CO}    0 {3,S} {5,{D,T}}
5 *3 {Cd,Ct,Od,Sd,Cdd} 0 {4,{D,T}}
"""

Also this one looks better for entry#28:

entry(
    index = 28,
    label = "R5_SS_D",
    group = 
"""
1 *1 R!H          1 {2,S}
2 *4 R!H          0 {1,S} {3,S}
3 *5 R!H          0 {2,S} {4,S}
4 *2 Cd            0 {3,S} {5,D}
5 *3 {Cd,Cdd}   0 {4,D}
"""
faribas commented 11 years ago

The RMG-Py job crashed again with the same error for this molecule:

Exception: Unable to calculate degeneracy for reaction <Molecule "C(=C[CH2])C=C=[CH]"> <=> <Molecule "C1C(=[CH])[CH]C=C1"> in reaction family Intra_R_Add_Endocyclic.

Reactant: r

Product: p