ReactionMechanismGenerator / RMG-Py

Python version of the amazing Reaction Mechanism Generator (RMG).
http://reactionmechanismgenerator.github.io/RMG-Py/

QM thermo entries are using non-canonical SMILES as dictionary keys #709

Closed nickvandewiele closed 1 year ago

nickvandewiele commented 8 years ago

When running make eg2 the following error was produced:

Traceback (most recent call last):
  File "rmg.py", line 164, in <module>
    rmg.execute(**kwargs)
  File "/vagrant_data/Code/RMG-Py/rmgpy/rmg/main.py", line 519, in execute
    self.initialize(**kwargs)
  File "/vagrant_data/Code/RMG-Py/rmgpy/rmg/main.py", line 368, in initialize
    self.loadDatabase()
  File "/vagrant_data/Code/RMG-Py/rmgpy/rmg/main.py", line 317, in loadDatabase
    family.addKineticsRulesFromTrainingSet(thermoDatabase=self.database.thermo)
  File "/vagrant_data/Code/RMG-Py/rmgpy/data/kinetics/family.py", line 992, in addKineticsRulesFromTrainingSet
    reactant.thermo = thermoDatabase.getThermoData(reactant, trainingSet=True)
  File "/vagrant_data/Code/RMG-Py/rmgpy/data/thermo.py", line 841, in getThermoData
    shortDesc = thermo0.comment
  File "/vagrant_data/Code/RMG-Py/rmgpy/data/thermo.py", line 328, in loadEntry
    raise DatabaseError('Found a duplicate molecule with label {0} in the thermo library {1}.  Please correct your library.'.format(label, self.name))
DatabaseError: Found a duplicate molecule with label CC1C=C[CH]C=C1_(D) in the thermo library QM Thermo Library.  Please correct your library.
make: *** [eg2] Error 1

The problem is the way QM entries are stored in a ThermoDatabase object:

# Write the QM molecule thermo to a library so that it can be used in future RMG jobs.
# (Do this only if it came from a QM calculation.)
quantumMechanics.database.loadEntry(
    index=len(quantumMechanics.database.entries) + 1,
    # The label doubles as the dictionary key: SMILES plus a multiplicity suffix.
    label=original_molecule.toSMILES() + '_({0})'.format(_multiplicity_labels[original_molecule.multiplicity]),
    molecule=original_molecule.toAdjacencyList(),
    thermo=thermo0,
    shortDesc=thermo0.comment,
)

Basically, the label, which also serves as the dictionary key for the entry, is built from the molecule's SMILES string plus a multiplicity suffix.
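To illustrate the failure mode, here is a minimal sketch of a library that keys its entries on such a label and rejects duplicates. The names `ToyLibrary`, `DuplicateLabelError`, `make_label` and the contents of `MULTIPLICITY_LABELS` are hypothetical stand-ins, not the actual RMG-Py classes or data; only the label format is taken from the snippet above.

```python
class DuplicateLabelError(Exception):
    """Raised when a label is already present in the library (hypothetical)."""


# Assumed mapping from spin multiplicity to a one-letter suffix.
MULTIPLICITY_LABELS = {1: 'S', 2: 'D', 3: 'T', 4: 'Q', 5: 'V'}


def make_label(smiles, multiplicity):
    """Build a label the same way as the snippet above: SMILES + '_(X)'."""
    return smiles + '_({0})'.format(MULTIPLICITY_LABELS[multiplicity])


class ToyLibrary:
    """Toy stand-in for a thermo library whose entries are keyed by label."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def load_entry(self, label, thermo):
        # The label is the dictionary key, so a repeated label is an error.
        if label in self.entries:
            raise DuplicateLabelError(
                'Found a duplicate molecule with label {0} in the thermo '
                'library {1}.'.format(label, self.name)
            )
        self.entries[label] = thermo


library = ToyLibrary('QM Thermo Library')
library.load_entry(make_label('CC1C=C[CH]C=C1', 2), 'thermo from first job')

try:
    # A second molecule that produces the same SMILES string collides,
    # even if the caller considered it a new species.
    library.load_entry(make_label('CC1C=C[CH]C=C1', 2), 'thermo from second job')
except DuplicateLabelError as error:
    print(error)
```

In this sketch the second `load_entry` call fails purely because the two label strings are identical; whether the underlying structures are the same molecule never enters the check.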

These were the keys of the entries already in the library when the error was raised:

Error: entries keys: ['[CH2]C12C=CCC1C2_(D)', 'C=C1C=CC[CH]C1_(D)', '[CH]1C=CC(=C1)C1CC=CC1_(D)', '[CH2]C=CCC=C1C=CC=C1_(D)', '[CH]1C=CC2(C=C1)C=CC=C2_(D)', '[CH]=CC=CC1C=CC=CC=1_(D)', '[CH2]CC1C=CC=C1_(D)', '[CH]1C=CC=C1_(D)', 'CCC1[CH]C=CC=1_(D)', 'C=C1C=CC=C1_(S)', 'C=CC1[CH]CC=C1_(D)', 'C1C=CCC=1_(S)', 'CC1C=C[CH]C=C1_(D)', 'C1=CC=CC=C1_(S)']

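This cannot work reliably: the SMILES strings used here are not canonical, so they are not safe as unique dictionary keys.

We need a canonical identifier to use as the key instead.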
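As a rough illustration of what "canonical" means here, the sketch below computes a key that depends only on the connectivity of a small molecule, not on the order in which its atoms happen to be listed. It uses a brute-force search over atom orderings, which is only practical for toy examples; `canonical_key` and `make_key` are hypothetical helpers, not RMG-Py functions, and a real fix would rely on whichever canonical identifier the project chooses.

```python
from itertools import permutations


def canonical_key(atoms, bonds):
    """Return an ordering-independent key for a small labelled graph.

    atoms: list of element symbols, e.g. ['C', 'C', 'O']
    bonds: iterable of (i, j, order) tuples indexing into atoms

    Every relabelling of the atoms is tried and the lexicographically
    smallest (atoms, bonds) description is kept, so two listings of the
    same structure always give the same key.
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] is the new index assigned to that atom.
        relabelled_atoms = [None] * n
        for old, new in enumerate(perm):
            relabelled_atoms[new] = atoms[old]
        relabelled_bonds = sorted(
            (min(perm[i], perm[j]), max(perm[i], perm[j]), order)
            for i, j, order in bonds
        )
        candidate = (tuple(relabelled_atoms), tuple(relabelled_bonds))
        if best is None or candidate < best:
            best = candidate
    return best


def make_key(atoms, bonds, multiplicity):
    """Combine the canonical structure key with the spin multiplicity."""
    return (canonical_key(atoms, bonds), multiplicity)


# The same heavy-atom skeleton (C-C-O) listed in two different atom orders.
skeleton_a = make_key(['C', 'C', 'O'], [(0, 1, 1), (1, 2, 1)], 1)
skeleton_b = make_key(['O', 'C', 'C'], [(0, 1, 1), (1, 2, 1)], 1)

# A different skeleton (C-O-C) built from the same atoms.
skeleton_c = make_key(['C', 'O', 'C'], [(0, 1, 1), (1, 2, 1)], 1)

print(skeleton_a == skeleton_b)
print(skeleton_a == skeleton_c)
```

With a key of this kind, a library lookup can tell whether a structure is already stored before attempting to add it, instead of depending on how a particular string happened to be written.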

github-actions[bot] commented 1 year ago

This issue is being automatically marked as stale because it has not received any interaction in the last 90 days. Please leave a comment if this is still a relevant issue, otherwise it will automatically be closed in 30 days.