ReactionMechanismGenerator / RMG-Py

Python version of the amazing Reaction Mechanism Generator (RMG).
http://reactionmechanismgenerator.github.io/RMG-Py/
Other
376 stars 226 forks source link

Disagreement between TST calculations from Cantherm versus MESMER with hindered rotors #981

Closed zjburas closed 7 years ago

zjburas commented 7 years ago

MESMER, like Cantherm, is an open-source package for calculating T,P-dependent kinetics given user-provided quantum calculations. I have recently compared simple high-P TST calculations between the two packages for 4 reactions on the C8H7O PES, and for the majority of cases the two agree. However, in one of the 4 cases the two packages disagree by an order of magnitude when hindered rotors are included.

To get more specific, 2 of the reactions are phenyl radical addition to both the terminal and central Carbon of ketene:

image

image

In both cases, the transition state (TS) has a hindered rotor that neither reactant has: rotation around then new C-C bond being formed. The plots below compare the TST rates predicted by Cantherm (red) and MESMER (blue) for these 2 reactions, with and without rotors. Also shown is the total predicted TST rate (sum of terminal and central addition), for comparison against experimental measurements.

image

image

image

So the two packages give similar results over the entire T range, and both are able to match experiment (further adjusting of the barrier heights within 1 kcal can explain the remaining discrepancy between prediction and experiment).

The other 2 reactions are OH radical addition to both the terminal and central Carbon of phenylacetylene. For terminal addition there are two hindered rotors (around the forming C-O bond and around the only single C-C bond in phenylacetylene) and for central addition there is one (around the single C-C bond).

image

image

The same comparison between packages is made below.

image

image

image

So in the OH + phenylacetylene case, for central addition the two packages are in excellent agreement, but for terminal addition once the hindered rotors are added they disagree significantly. Furthermore, MESMER is able to reproduce the experimental measurements (within the barrier height uncertainty), whereas Cantherm is not, leading me to conclude that the problem is with Cantherm and not MESMER. Additionally, even if we didn't have experiments to tell us which package is "right", I don't think Cantherm's prediction of no effect from the hindered rotors makes sense, especially considering the large effect hindered rotors have in the other 3 cases.

Both Cantherm and MESMER should treat hindered rotors in the same way: using the method outlined in Sharma and Green's paper, so the under underlying theory should be the same. I should also mention that both packages had this ZPE issue, but the calculations above were done after this was fixed.

@mjohnson541 has offered to look at this issue, first by comparing MESMER and Cantherm's implementation of Sharma's method. It's unclear how widespread this problem is, we might have just stumbled on a particularly tricky system since there is hydrogen-bonding and the transition state is submerged. I would be interested though if anyone else has noticed similar discrepancies (either between Cantherm and other packages, between prediction and experiment, or just between no rotor vs. with rotor).

mjohnson541 commented 7 years ago

The projected HO frequencies between MESMER and cantherm appear to be relatively close, but the reduced moments of inertia for each rotor in MESMER are about a factor of two larger than those of Cantherm for the TS they disagree on while they are relatively close for the TS they agree on.

mjohnson541 commented 7 years ago

The reduced moments of inertia are calculated differently in MESMER and Cantherm. In reduced moment of inertia calculations we divide the structure into two sections based on the rotating bond, we then calculate a moments of inertia about each section and reduce them together: (I1*I2)/(I1+I2).

MESMER calculates these moments of inertia about the axis of the rotor bond, while Cantherm calculates them about an axis passing through both sections' centers of mass.

Differences: (in amu*A^2) MESMER Cantherm Phenyl-ketene central K: 48.941 31.385
Phenyl-ketene terminal K: 31.4533 25.4899 OH-Phenyl-acet terminal K1: 0.930619 0.532918 OH-Phenyl-acet terminal K2: 41.8384 25.6106 OH-Phenyl-acet central K: 38.3819 32.066

In general the differences seem to correlate well with larger/smaller differences between MESMER and Cantherm.

mjohnson541 commented 7 years ago

Unfortunately, the reduce moment of inertia changes only improve agreement between MESMER and Cantherm by about a factor of 2. Changing how the potential fit is done to match with MESMER doesn't appear to strongly affect the rates in Cantherm.

mjohnson541 commented 7 years ago

The MESMER fit is almost exactly the same, we have three different ways to calculate the partition function from the fit and all of them give the exact same results in this case. Since the potential fit and the reduced moment of inertia are the only things besides the symmetry number used in calculating the partition function of hindered rotors I find it difficult to accommodate the differences solely within the Hindered Rotors calculation.

mjohnson541 commented 7 years ago

We are now reasonably sure that this issue is a result of a problem with Cantherm's rotor projection algorithm.

zjburas commented 7 years ago

As @mjohnson541 said above, we have isolated the source of the Cantherm/MESMER discrepancy with hindered rotors to Cantherm's frequency projection algorithm. To demonstrate this, we have used MESMER's projected frequencies as inputs to our Cantherm calculations. As shown below, if MESMER's projected frequencies are used in Cantherm, the two packages agree almost exactly over the entire T range with or without hindered rotors, for all 4 reactions. Both sets of predictions also agree with experiment (taking into account the uncertainty of the barrier heights, which can easily account for any remaining discrepancy with experiment).

Also, note that @mjohnson541 has added an option to treat internal rotations as free rotors in Cantherm ( PR #1010), which sets an upper bound on the largest impact that including hindered rotors can have on the predicted k. The free rotors feature is definitely a useful addition to Cantherm, both as a sanity check and as an alternative to always using hindered rotors (i.e., if the barrier to rotation is very low), and should be pulled into master at some point.

First, here are the predictions for OH+phenylacetylene terminal, central and total k:

image

image

image

Second, here are the predictions for phenyl+ketene terminal, central and total k:

image

image

image

So, we know that Cantherm's projectRotors function is the problem, and we will investigate further. Fortunately, we have MESMER to compare against, which uses the same Sharma and Green method to project out rotors.

Finally, I think this issue has manifested itself in other systems as well. For example, @laitcl was predicting thermo properties of benzyl radical that were in large disagreement with experiment if the methylene group was modeled as a hindered rotor. Excluding this hindered rotor allowed him to match experiment, but in theory including hindered rotors should only improve agreement with experiment, if it has any effect at all. Interestingly, benzyl radical is resonance stabilized, as is the OH+phenylacetylene terminal addition TS that gave us so much trouble above. So maybe resonantly stabilized radicals are somehow not being treated properly in projectRotors?

zjburas commented 7 years ago

I think we've finally resolved this issue. See PR #1010 for solution.

The fundamental discrepancy between Cantherm and MESMER was that Cantherm used "Initial Orientation" cartesian coordinates from the Gaussian Log file for species geometries, whereas MESMER uses "Standard Orientation" from the same file. Initial orientation gives the optimized geometry of the species in whatever random Cartesian coordinate system you used in the Gaussian input file, whereas the Standard orientation shifts the origin to the species center of mass and rotates the axes to align with the principle axes (thereby "standardizing" the coordinate system). The choice of coordinate system matters, because the Hessian, which is needed for projecting out hindered rotors, is defined in the Standard orientation. Therefore, if the species geometry in Initial Orientation is used, there is a mismatch between that geometry and the Hessian coordinate system, resulting in incorrect projection of both external and internal modes, resulting in turn in incorrect projected frequencies, incorrect vibrational partition functions, and incorrect kinetics/thermo, as we've seen.

The solution is simple (as implemented in PR #1010 ): have Cantherm load the "Standard Orientation" instead of "Input Orientation". This is what MESMER does. However, there will not be a "Standard Orientation" in the Gaussian log if the "nosymm" keyword is specified in the Gaussian input. Most users do not specify this keyword though, so for the majority of cases it should not be a problem. For those few cases where "nosymm" is specified in the Gaussian optimization input, Cantherm will not read any coordinates and will give an error telling the user to re-run the optimization without "nosymm". Of course there are other solutions, but this was just a quick fix.

Finally, as evidence that this change has a significant impact on Cantherm's output, below are comparisons of Cantherm and MESMER'S kinetic predictions for OH+phenylacetylene and phenyl+ketene entrance rates (both terminal and central additions), as well as comparisons to experimental measurements. In particular, OH+phenylacetylene terminal addition was not predicted consistently between the two softwares previously (see above), but now they both agree with each other almost exactly, and most importantly they both agree with experiment (within barrier height uncertainty). Also, note that prediction using @mjohnson541 's new FreeRotor feature of Cantherm are also shown, which puts on upper bound on how fast the rate can be (this feature is also part of PR #1010).

OH+phenylacetylene terminal addition

image:

image

OH+phenylacetylene central addition

image

image

OH+phenylacetylene total k (terminal + central)

image

Phenyl + ketene terminal addition

image

image

Phenyl + ketene central addition

image

image

Phenyl + ketene total k (terminal + central)

image

As further evidence that after making this change Cantherm's predictions are more reasonable, consider @laitcl 's predictions of Benzyl radical entropy below:

image

Previously, if the methylene group in benzyl was modeled as a hindered rotor, Cantherm would significantly overpredict the entropy of that radical (INPUT ORIENTATION case), as compared to the Narayanaswamy library (which in turn ultimately comes from Ruscic's widespread calculations). This was confusing because that rotor has a high barrier to rotation (~10 kcal), and therefore there should be very little entropy increase from modeling that internal motion as a highly constrained hindered rotor as opposed to a Harmonic oscillator (RRHO case, also shown above). Now, if Cantherm is used with STANDARD ORIENTATION as the geometry input, the calculated entropy is only slightly higher than the RRHO case, and in excellent agreement with Narayanaswamy (and is therefore in agreement with Ruscic, and every major combustion mechanism).

If anyone else has obtained unusual predictions from Cantherm (either for kinetics or thermo), please test your job on this new branch (CanthermRotors), and add your comparison to the discussion here. If one of the quantum chemists in the group can also chime in that would be helpful. In particular, I just want to confirm that the Hessian is indeed defined in Standard Orientation coordinates. I have also contacted the developers of MESMER for their input, because they have obviously spent some time thinking about this issue.

If no one objects, I hope that the new PR can be merged into master soon.

rwest commented 7 years ago

Good work, @zjburas! Slightly worrying how much we've been using cantherm...

zjburas commented 7 years ago

Ok, the issue is really solved now. The explanation above was close, but not quite right.

The fundamental issue is still that sometimes the molecule geometry and Hessian are input in different Cartesian coordinates. The Hessian that Cantherm reads (and that is output by the iop(7/33=1) Gaussian option) has the header "Force constants in Cartesian coordinates: " in the Gaussian log file. However, this description does not say which Cartesian coordinate system. After talking with Gaussian tech support, they confirmed that with the iop(7/33=1) keyword, that Hessian is always defined in the Input Orientation, i.e., in whatever coordinate system the molecule was in the input, not the Standard orientation. This is true whether the user specifies "nosymm" or not. In fact, if you don't specify nosymm, at a certain point in the Gaussian log it will say: "Axes restored to original set". Above this line, everything is defined in Standard orientation, but below this line everything (including the Hessian) gets rotated back into the Input orientation. So clearly, the molecule geometry that Cantherm reads in should be consistent with the Input Orientation coordinate system of the freq job. However, currently Cantherm does not check if this is the case and so it is entirely up to the user to input a geometry and Hessian in consistent coordinates.

In my case, for the OH+phenylacetylene predictions, I messed up the input in the following way: molecule geometries were first optimized, and then Gaussview was used to create the subsequent frequency jobs with the iop(7/33=1) keyword to output the Hessian. However, by default the input file prepared by Gaussview uses a Z-matrix, instead of cartesian coordinates, so the Input Orientation used in the opt job was lost, and at the start of the freq job, Gaussian arbitrarily defines a new Input Orientation with the first atom at the origin. Cantherm would then read the geometry in the Input Orientation of the opt job, and the Hessian in the Input Orientation of the freq job, which were not consistent. This of course resulted in incorrect projecting of external and internal modes, incorrect frequencies, partition functions, etc. MESMER also does not check that the user is inputting a geometry/Hessian pair that are in consistent coordinates, so my input there was also wrong, although the mismatched pair was different at first.

In @laitcl 's case, he also spawned a freq job based on his final opt geometry, but using Cartesian coordinates (no Z-matrix). His problem, was that whatever software he used to generate the freq input file, used the final Standard orientation of the opt job. So his mismatch was that Cantherm read a geometry in Input Orientation, and a Hessian in Standard Orientation.

As these two examples show, it is very easy for a user to accidentally input inconsistent coordinate systems. The fool-proof solution seems to be forcing the user to input their freq job log file for both the geometry and frequencies/Hessian. Cantherm will then use the Input Orientation molecule geometry for that job, which will always be in the same coordinate system as the Hessian in that same job. Using this approach means that the user will not have to re-run any QM jobs, they just have to change which files are referenced in their Cantherm input. This solution is implemented in PR #1010 . The current PR only addresses cases where the user supplies Gaussian log files. If QChem is used instead, a warning is output to alert the user to check that their coordinate systems are consistent. Perhaps someone with QChem experience can add a more thorough fail-safe solution.

Finally, if consistent geometry/Hessian pairs are supplied to Cantherm and MESMER they will both give similar results with hindered rotors (see below), although we are now back to disagreeing with experiment. I don't think the experimental discrepancy reflects a bug in Cantherm/MESMER, it is probably a limitation of the level of theory we are using. Both Cantherm and MESMER predict that including hindered rotors increases the rate only by a factor of 3 at most, which is more reasonable than the orders of magnitude increase we were seeing before.

The benzyl thermo calculations compared in the previous post are still valid. This is a clear case where correcting the geometry/Hessian mismatch demonstrably made Cantherm's predictions more accurate. I've also gone back to some of my other Cantherm calculations with hindered rotors (propargyl recombination surface) and am able to get more reasonable predictions now.

I also noticed that the projected frequencies output by Cantherm are more sensible, and it is possible to verify which removed frequency corresponds to which hindered rotor. At one point @laitcl was trying to create an automatic test to check that the projected frequencies make sense, and I think that such a test could (and should) now be made on top of PR #1010 .

Bottom line: if users were careful in the past to ensure consistent geometry/Hessian pairs were used in their Cantherm input (with hindered rotors on), then their predictions are fine. However, for more naive users like me and @laitcl , there are many pitfalls, and Cantherm should be made smart enough to prevent users from following into the common ones.

OH+phenylacetylene terminal addition

image:

image

OH+phenylacetylene central addition

image

image

OH+phenylacetylene total k (terminal + central)

image

Phenyl + ketene terminal addition

image

image

Phenyl + ketene central addition

image

image

Phenyl + ketene total k (terminal + central)

image

KEHANG commented 7 years ago

I think once @zjburas and @alongd update the cantherm documentation, this issue can be closed.

zjburas commented 7 years ago

Working on documentation now, hope to have it done today.

mjohnson541 commented 7 years ago

Since the documentation has been merged I'm closing this issue.