ReactionMechanismGenerator / RMG-Py

Python version of the amazing Reaction Mechanism Generator (RMG).
http://reactionmechanismgenerator.github.io/RMG-Py/

Inchi mismatch error in QM #137

Closed connie closed 10 years ago

connie commented 11 years ago

Running into the following InChI mismatch error between the log file and the geometry for the second time now. Note that the only difference between the InChIs is the final 3 in the string. It seems they are actually the same molecule; the geometry InChI is the accurate one, however.

Also, we do not yet have methods for handling InChI key collisions.

1 radicals on C12(C(C)(C)OC1CC(C2)(C)OO)CO[O] exceeds limit of 0. Using HBI method.
Output file /home/connieg/QMfiles/LVZYJRSIGFYMQJ-UHFFFAOYSA.out does not (yet) exist.
Trying MopacMolPM3 attempt 1 of 10 on molecule C12(C(C)(C)OC1CC(C2)(C)OO)COO.
InChI in log file (InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8/h7,11-12H,4-6H2,1-3H) didn't match that in geometry (InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8/h7,11-12H,4-6H2,1-3H3).
InChI in log file (InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8/h7,11-12H,4-6H2,1-3H) didn't match that in geometry (InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8/h7,11-12H,4-6H2,1-3H3).
Traceback (most recent call last):
  File "/home/connieg/Code/RMG-Py/rmg.py", line 136, in <module>
    cProfile.runctx(command, global_vars, local_vars, stats_file)
  File "/usr/lib/python2.6/cProfile.py", line 49, in runctx
    prof = prof.runctx(statement, globals, locals)
  File "/usr/lib/python2.6/cProfile.py", line 140, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "/home/connieg/Code/RMG-Py/rmgpy/rmg/main.py", line 400, in execute
    self.reactionModel.enlarge(objectsToEnlarge)
  File "/home/connieg/Code/RMG-Py/rmgpy/rmg/model.py", line 698, in enlarge
    spec.generateThermoData(database, quantumMechanics=self.quantumMechanics)
  File "/home/connieg/Code/RMG-Py/rmgpy/rmg/model.py", line 124, in generateThermoData
    tdata = database.thermo.estimateRadicalThermoViaHBI(molecule, quantumMechanics.getThermoData)
  File "/home/connieg/Code/RMG-Py/rmgpy/data/thermo.py", line 788, in estimateRadicalThermoViaHBI
    thermoData = stableThermoEstimator(saturatedStruct)
  File "/home/connieg/Code/RMG-Py/rmgpy/qm/main.py", line 147, in getThermoData
    thermo0 = qm_molecule_calculator.generateThermoData()
  File "/home/connieg/Code/RMG-Py/rmgpy/qm/molecule.py", line 252, in generateThermoData
    self.qmData = self.generateQMData()
  File "/home/connieg/Code/RMG-Py/rmgpy/qm/mopac.py", line 198, in generateQMData
    success = self.run()
  File "/home/connieg/Code/RMG-Py/rmgpy/qm/mopac.py", line 68, in run
    return self.verifyOutputFile()
  File "/home/connieg/Code/RMG-Py/rmgpy/qm/mopac.py", line 133, in verifyOutputFile
    return self.checkForInChiKeyCollision(logFileInChI) # Not yet implemented!
AttributeError: MopacMolPM3 instance has no attribute 'checkForInChiKeyCollision'

rwest commented 11 years ago

Any chance the log file was made before the switch to RDKit, and is being checked after?

connie commented 11 years ago

Yes, this was still on the main branch. Will try with the rdkit branch then.

rwest commented 11 years ago

I think your problem is fixed by 87a7ce9c720a9b28784c895332e43564535de784, but I will leave the issue open for now as a reminder to implement more rigorous checking (or explicitly decide not to).

connie commented 11 years ago

Still running into the same problem. The log file omits an H in the InChI. But I think it is the correct molecule. https://github.com/GreenGroup/RMG-Py/commit/87a7ce9c720a9b28784c895332e43564535de784 will still fail on this.

Warning: InChI in log file (InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9/h4,7,9,11-12H,2-3,5-6,82,1H3) didn't match that in geometry (InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9/h4,7,9,11-12H,2-3,5-6,8H2,1H3).
Traceback (most recent call last):
  File "/files/RMG-Py/rmg.py", line 144, in <module>
    rmg.execute(args)
  File "/files/RMG-Py/rmgpy/rmg/main.py", line 329, in execute
    self.initialize(args)
  File "/files/RMG-Py/rmgpy/rmg/main.py", line 317, in initialize
    self.reactionModel.enlarge([spec for spec in self.initialSpecies if spec.reactive])
  File "/files/RMG-Py/rmgpy/rmg/model.py", line 698, in enlarge
    spec.generateThermoData(database, quantumMechanics=self.quantumMechanics)
  File "/files/RMG-Py/rmgpy/rmg/model.py", line 124, in generateThermoData
    tdata = database.thermo.estimateRadicalThermoViaHBI(molecule, quantumMechanics.getThermoData)
  File "/files/RMG-Py/rmgpy/data/thermo.py", line 788, in estimateRadicalThermoViaHBI
    thermoData = stableThermoEstimator(saturatedStruct)
  File "/files/RMG-Py/rmgpy/qm/main.py", line 147, in getThermoData
    thermo0 = qm_molecule_calculator.generateThermoData()
  File "/files/RMG-Py/rmgpy/qm/molecule.py", line 221, in generateThermoData
    self.qmData = self.generateQMData()
  File "/files/RMG-Py/rmgpy/qm/mopac.py", line 208, in generateQMData
    success = self.run()
  File "/files/RMG-Py/rmgpy/qm/mopac.py", line 67, in run
    return self.verifyOutputFile()
  File "/files/RMG-Py/rmgpy/qm/mopac.py", line 134, in verifyOutputFile
    return self.checkForInChiKeyCollision(logFileInChI) # Not yet implemented!
AttributeError: MopacMolPM3 instance has no attribute 'checkForInChiKeyCollision'

rwest commented 11 years ago

Is the log file really just missing that one character? Why on earth would it? It isn't interpreted by MOPAC/GAUSSIAN, just read in and spat out!

Either MOPAC/Gaussian is deleting a character (string too long? Split across lines? Utter bizarreness)...

Or the log file dates from a previous job when we generated InChIs without a missing H (via OB) and are comparing it with a newly created one (via RDKit).

Could it be the latter? Can you figure out a test?

...or it's something else :-D

connie commented 11 years ago

I believe the MOPAC output file is giving the incorrect InChI; it might be an internal MOPAC problem. The one from the geometry must be from RDKit(?), and it's the correct one: I checked using the website.

It's not an Openbabel problem because I just ran it fresh with the latest RMG.

In fact, the .mop file contains the correct inchi, but then the .out file contains the one missing the H!

rwest commented 11 years ago

Ok. What happens if you replace it with a string like 12345678901234567890... ? Does MOPAC delete the Nth character, or is it something special about that ,8H2, ?

connie commented 11 years ago

There is nothing special about ,8H2, itself. MOPAC appears to be removing exactly the 81st character in the output log file.

Tried a .mop file with this input: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890thisisatestformopacwillitwork

Got this in the output (missing the 'c' in mopac): abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890thisisatestformopawillitwork
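
The comparison above can be automated. The following is a minimal sketch, not part of RMG-Py: `find_dropped_index` is a hypothetical helper that reports the 1-based position of a single deleted character. Applied to the test string and to both InChI pairs quoted in this thread, it points at position 81 each time.

```python
def find_dropped_index(sent, received):
    """Return the 1-based position of the single character missing from
    `received` relative to `sent`, or None if the two strings do not
    differ by exactly one deleted character.

    Hypothetical helper for illustration; not part of RMG-Py.
    """
    if len(sent) != len(received) + 1:
        return None
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            # Deleting sent[i] must reproduce `received` exactly.
            if sent[:i] + sent[i + 1:] == received:
                return i + 1
            return None
    # Every compared character matched, so the last one was dropped.
    return len(sent)


# Strings copied from this thread.
sent = ("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "1234567890thisisatestformopacwillitwork")
received = ("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "1234567890thisisatestformopawillitwork")

geom_a = ("InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8"
          "/h7,11-12H,4-6H2,1-3H3")
log_a = ("InChI=1S/C10H18O5/c1-8(2)10(6-13-11)5-9(3,15-12)4-7(10)14-8"
         "/h7,11-12H,4-6H2,1-3H")

geom_b = ("InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9"
          "/h4,7,9,11-12H,2-3,5-6,8H2,1H3")
log_b = ("InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9"
         "/h4,7,9,11-12H,2-3,5-6,82,1H3")

for s, r in [(sent, received), (geom_a, log_a), (geom_b, log_b)]:
    print(find_dropped_index(s, r))
```

If the deleted character sits inside a run of identical characters, the reported position is the end of that run, so the result is only exact when the neighbours differ, as they do in these three cases.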

connie commented 11 years ago

Perhaps we can match on only the first 80 characters when the InChI is longer than that?
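
One way this proposal could look is sketched below. It assumes the behaviour observed in this thread (characters 1 to 80 survive and the 81st is dropped) and has not been checked against other MOPAC versions. The names `inchis_match`, `matches_with_81st_dropped`, and `MOPAC_SAFE_LENGTH` are hypothetical and do not exist in RMG-Py.

```python
# Assumption based on the test in this thread: MOPAC echoes the first
# 80 characters intact and drops the 81st.
MOPAC_SAFE_LENGTH = 80


def inchis_match(log_inchi, geometry_inchi, safe_length=MOPAC_SAFE_LENGTH):
    """Lenient comparison: exact match, or agreement on the first
    `safe_length` characters when the geometry InChI is longer than that."""
    if log_inchi == geometry_inchi:
        return True
    if len(geometry_inchi) <= safe_length:
        # Short strings should come back untouched, so a mismatch is real.
        return False
    return log_inchi[:safe_length] == geometry_inchi[:safe_length]


def matches_with_81st_dropped(log_inchi, geometry_inchi,
                              safe_length=MOPAC_SAFE_LENGTH):
    """Stricter variant: the log InChI must equal the geometry InChI with
    exactly the character after `safe_length` removed."""
    expected = geometry_inchi[:safe_length] + geometry_inchi[safe_length + 1:]
    return len(geometry_inchi) > safe_length and log_inchi == expected


# InChI pair copied from the warning earlier in this thread.
geometry = ("InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9"
            "/h4,7,9,11-12H,2-3,5-6,8H2,1H3")
log = ("InChI=1S/C10H18O5/c1-10(11,7-4-8-13-12)15-14-9-5-2-3-6-9"
       "/h4,7,9,11-12H,2-3,5-6,82,1H3")

print(inchis_match(log, geometry))
print(matches_with_81st_dropped(log, geometry))
```

The lenient form ignores everything after character 80, so two long InChIs that differ only in their tails would be treated as the same molecule. The stricter form still compares the tail and accepts only the specific corruption seen here.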

connie commented 11 years ago

In any case, the checkForInChiKeyCollision function is missing in both the Gaussian and Mopac classes. It appears in QMVerifier, but this class doesn't seem to be used anywhere. Was it intended to be a parent class?