corykinney opened this issue 2 years ago
@cory-kinney ... thanks for reporting this issue. For context, the offending code section where this fails is https://github.com/Cantera/cantera/blob/577455cf143178407729f1fdcb9085570dc84b3d/src/kinetics/PlogRate.cpp#L146-L153
@ischoegl would it be appropriate to change this to `!(k >= 0)` (or just `k < 0`) to allow for rates that might truncate to zero but still disallow negative rates?
@cory-kinney … Interesting question. However, wouldn't the log of a number that gets truncated to zero still be problematic? (I'm actually not sure why `k = 0` is allowed in this validation check.)
@ischoegl `k = 0` is not allowed in the code section you linked. I'm not entirely sure whether the rate is calculated as exactly zero or whether a negative rate is being calculated.
Regardless, is there a way to override these checks that are saying the mechanism would fail at 200 K when that is irrelevant to the simulation that will be run? I know this has been discussed in https://github.com/Cantera/enhancements/issues/54 but it doesn't look like there was a resolution.
Widely used mechanisms such as AramcoMech 3.0 and others from NUI Galway are no longer usable with releases after 2.6.0a4 due to these checks, even though the failing temperature range often doesn't matter for the intended simulations. It would be great if there were a flag or something to make these appear as warnings instead of exceptions, so that the user is aware of the issue but isn't unnecessarily restricted by Cantera.
Also tagging @bryanwweber since he was involved in the original discussion on the Google group.
I looked at the commit history, and the check goes back at least 8 years (50a08db7c09f9a4cdb0c3598b42818c4717b265c), with a more recent change to improve output in 2cd06194e80aa0f43cba36575ecf1cdc6c073391 ... the current version of the code is the result of refactoring of reaction rate evaluators, which involved copying things around. I am actually not sure why checks pass prior to 2.6 - it's certainly not because of changes in this section.
For the original code conversion, you can pass the `--no-validate` option to `ck2yaml`, but I'm not sure that this resolves your issues.
Unfortunately, passing `--no-validate` to `ck2yaml` might prevent the issue when converting the mechanisms, but any time the mechanism is loaded with `ct.Solution(...)` the checks occur again and prevent it from being used.
In the original commit the check is `if (!(k >= 0))`, versus the current `if (!(k > 0))`. I just checked the blame for the current version of the file, and it looks like this change was made in 9b4f4de, which ended up in 2.6.0b1 (which makes sense as to why 2.6.0a4 is the last `pip`-installable release that I can successfully use these mechanisms with). It looks like @speth made the change. Is there a reason why `k = 0` was considered valid until recently and is now disallowed?
Thanks for pointing out the slightly different conditions (which essentially amount to `k <= 0` in the current form). Blocking `k == 0` makes sense mathematically, as a logarithm would create issues; in this case, it may be that a positive value is rounded to zero due to limited machine precision. Fwiw, here's the rate evaluation step:
```cpp
//! Evaluate reaction rate
// (m_A: pre-exponential factor, m_b: temperature exponent, m_Ea_R: Ea/R)
double evalRate(double logT, double recipT) const {
    return m_A * std::exp(m_b * logT - m_Ea_R * recipT);
}
```
It should be easy to verify whether the exponential becomes exceedingly small.
```yaml
- equation: C4H6 <=> CH3 + C3H3  # Reaction 2176
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 0.0394737 atm, A: 1.5849e+148, b: -37.24, Ea: 1.885e+05}
  - {P: 0.0789474 atm, A: 8.9125e+159, b: -40.32, Ea: 2.013e+05}
  - {P: 0.157895 atm, A: 5.2481e+196, b: -50.0, Ea: 2.436e+05}
  - {P: 0.315789 atm, A: 4.0738e+196, b: -50.0, Ea: 2.414e+05}
```
It doesn't look like the pre-exponential factors are at fault - they are clearly positive. Whether those numbers make actual physical sense is beyond me ...
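For illustration, here is a minimal NumPy sketch (not Cantera code) using the third rate entry above, assuming the `Ea` values are in cal/mol as in the converted CK input:

```python
import numpy as np

# Third rate entry above; assumed units: Ea in cal/mol, R in cal/(mol*K).
A, b, Ea = 5.2481e196, -50.0, 2.436e5
R = 1.98720
T = 260.0

# Same form as evalRate(): the exponent is about -750, and exp(-750)
# underflows to zero, so the product evaluates to exactly 0.0.
print(A * np.exp(b * np.log(T) - Ea / (R * T)))  # -> 0.0

# Evaluated entirely in log space, the rate is tiny but representable:
print(np.exp(np.log(A) + b * np.log(T) - Ea / (R * T)))  # -> ~1.6e-129
```

So the rate constant itself is positive and representable; it is the single `exp()` call that underflows.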
Allowing $k=0$ causes the interpolation formula for these rates to return NaN, since $\infty - \infty$ is not defined. I think the validation check here should have always failed when $k$ evaluates to exactly zero.
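(As a quick illustration of the $\infty - \infty$ problem, here is a minimal sketch, not Cantera's implementation:)

```python
import numpy as np

# If a rate underflows to exactly zero, its logarithm is -inf, and the
# P-log interpolation formula mixes infinities into a NaN:
with np.errstate(divide="ignore", invalid="ignore"):
    ln_k1 = np.log(0.0)                  # -inf
    ln_k2 = np.log(0.0)                  # -inf
    w = 0.5                              # some interpolation weight in (0, 1)
    print(ln_k1 + w * (ln_k2 - ln_k1))   # (-inf) - (-inf) -> nan
```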
For debugging purposes, you can get a better picture of what's happening here by creating a few elementary reactions with these rate constants (where zeros aren't a problem), and including just these few reactions in the input file:
```yaml
reactions:
- equation: C4H6 <=> CH3 + C3H3  # Reaction 2176
  rate-constant: {A: 1.5849e+148, b: -37.24, Ea: 1.885e+05}
  duplicate: true
- equation: C4H6 <=> CH3 + C3H3  # Reaction 2176
  rate-constant: {A: 8.9125e+159, b: -40.32, Ea: 2.013e+05}
  duplicate: true
- equation: C4H6 <=> CH3 + C3H3  # Reaction 2176
  rate-constant: {A: 5.2481e+196, b: -50.0, Ea: 2.436e+05}
  duplicate: true
- equation: C4H6 <=> CH3 + C3H3  # Reaction 2176
  rate-constant: {A: 4.0738e+196, b: -50.0, Ea: 2.414e+05}
  duplicate: true
```
This way, we can actually import the mechanism and play around with it a little.
```python
>>> import cantera as ct
>>> import numpy as np
>>> gas = ct.Solution("pdep-parts.yaml")
>>> gas.TP = 300, ct.one_atm
>>> gas.forward_rate_constants
array([4.29145770e-082, 2.68202286e-087, 2.54109265e-105, 7.90105084e-104])
>>> gas.TP = 260, ct.one_atm
>>> gas.forward_rate_constants
array([6.62111473e-101, 2.36401075e-107, 0.00000000e+000, 0.00000000e+000])
```
As you can see, below this temperature the rate evaluates to exactly zero as a result of floating point underflow. While at first glance it might seem that we should still be pretty far from the limit of the smallest values representable in a double (around 1e-300), the problem comes from the fact that the intermediate result, equivalent to evaluating $T^b \exp(-E_a/RT)$, falls below this limit:
```python
>>> rate = gas.reaction(2).rate
>>> T = 260
>>> print(T**rate.temperature_exponent)
1.7837443128686528e-121
>>> print(np.exp(-rate.activation_energy / (ct.gas_constant*T)))
1.7366368018543418e-205
>>> print(T**rate.temperature_exponent * np.exp(-rate.activation_energy / (ct.gas_constant*T)))
0.0
```
I think the best way to handle this would be to implement controllable temperature bounds for `Kinetics` calculations, where the bounds could be determined in part by automatically identifying the range over which the rate constants are mathematically/numerically valid.
A simpler band-aid for the 2.6 branch might be to just increase the lowest validation temperature to 300 K, which seems to be fine for AramcoMech 3.0.
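As a rough sketch of what automatically identifying that range could look like (a hypothetical helper, not an existing Cantera API; the scan resolution and units are assumptions):

```python
import numpy as np

def lowest_valid_T(A, b, Ea_R, T_lo=200.0, T_hi=3000.0, n=2801):
    """Scan a temperature grid and return the lowest temperature at which
    the Arrhenius expression still evaluates to a positive double
    (i.e. does not underflow to zero). Hypothetical helper."""
    T = np.linspace(T_lo, T_hi, n)
    k = A * np.exp(b * np.log(T) - Ea_R / T)
    valid = T[k > 0.0]
    return valid[0] if valid.size else None

# Third rate entry of the P-log reaction above (Ea/R in K, from cal/mol):
print(lowest_valid_T(A=5.2481e196, b=-50.0, Ea_R=2.436e5 / 1.98720))
# -> somewhere in the 260s K for this entry
```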
@speth ... thanks for confirming my suspicion. I agree that it should have failed before the change, as allowing for `k = 0` makes little sense mathematically.
> I think the best way to handle this would be to implement controllable temperature bounds for Kinetics calculations, where the bounds could be determined in part by automatically identifying the range over which the rate constants are mathematically/numerically valid.
>
> A simpler band-aid for the 2.6 branch might be to just increase the lowest validation temperature to 300 K, which seems to be fine for AramcoMech 3.0.
I personally don't believe that we should change the way the validation checks are run: it is clearly on the mechanism developers to avoid coefficient values that are prone to numerical artifacts (as is the case here; the numerical values are imho problematic). Rather, I'd advocate for a `permissive` setting when importing the mechanism in `ct.Solution`, which would convert errors to warnings and default to `false` (as in: strict validation). We can use the same flag to bypass other failures that may be encountered when importing mechanisms that have issues.
I don't think we should (or need to) require that rate parameterizations have no problems at any temperature (where "any" is pretty hard to define, especially if you think Cantera is for more than just combustion). The current set of temperatures used for validation is just a somewhat arbitrary set that I wrote down when initially implementing P-log rates.
This particular mechanism works fine for 300 K and up, which is probably all most users of it will care about. If you use it only in this temperature range, there's nothing wrong and the user shouldn't need to specify a "permissive" flag. But in a temperature range where a Plog rate does evaluate to zero or a negative value, just carrying on with a "permissive" flag isn't helpful -- the calculations will still result in NaNs. This is quite different than permitting things like undeclared duplicates that don't affect Cantera's ability to do the calculations.
I think this would also be an opportunity to start looking at temperature limits for Chebyshev reactions, which is something we don't currently check, and perhaps to allow explicitly set temperature limits for other rate types as well.
> Allowing $k=0$ causes the interpolation formula for these rates to return NaN, since $\infty - \infty$ is not defined. I think the validation check here should have always failed when $k$ evaluates to exactly zero.

> As you can see, below this temperature the rate evaluates to exactly zero as a result of floating point underflow. While at first glance it might seem that we should still be pretty far from the limit of the smallest values representable in a double (around 1e-300), the problem comes from the fact that the intermediate result, equivalent to evaluating $T^b \exp(-E_a/RT)$, falls below this limit.
@speth This makes sense, thanks for explaining these! I thought it was due to underflow of some sort, but was confused about the range as I didn't think about the intermediate result.
> I personally don't believe that we should change the way the validation checks are run: it is clearly on the mechanism developers to avoid coefficient values that are prone to numerical artifacts (as is the case here; the numerical values are imho problematic).
@ischoegl My two cents is that the validation Cantera performs to encourage responsible mechanism development should continue to be used and expanded, but that the user shouldn't ever be prevented from doing something that may lead to an error. In other words, I think that users should be warned of potential issues and trusted either to operate within the ranges where the mechanism is valid, or to continue knowing that the simulation may stop if an invalid rate is encountered. Otherwise the end user gets punished for the mistakes of the mechanism developer (often for mechanisms that are years old and weren't designed with Cantera to begin with). I do hope that more mechanism developers will choose to develop with Cantera in the future though, especially given the `yaml2ck` functionality.
> I don't think we should (or need to) require that rate parameterizations have no problems at any temperature (where "any" is pretty hard to define, especially if you think Cantera is for more than just combustion). The current set of temperatures used for validation is just a somewhat arbitrary set that I wrote down when initially implementing P-log rates.

> This particular mechanism works fine for 300 K and up, which is probably all most users of it will care about. If you use it only in this temperature range, there's nothing wrong and the user shouldn't need to specify a "permissive" flag. But in a temperature range where a Plog rate does evaluate to zero or a negative value, just carrying on with a "permissive" flag isn't helpful -- the calculations will still result in NaNs. This is quite different than permitting things like undeclared duplicates that don't affect Cantera's ability to do the calculations.
I agree with this. Perhaps initial validation checks should be issued as warnings by default instead of as errors. And then maybe if a user tries to insert a `Solution` object into a `Reactor` object at a state outside the previously checked valid range, an actual error could be thrown, since it is known that calling `advance` or `step` on the reactor network will fail. Of course, if an invalid rate were calculated, Cantera should never just carry on; I don't think that was the suggestion.
On a related note, is it possible to disable validation when initializing a `Solution` object? If not, I think it would be a valuable addition for optimization when the user already knows that the mechanism is valid, especially for large mechanisms, or when a large batch job using multiprocessing is being performed and `Solution` objects are reinitialized for each job.
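In the meantime, a user-side sanity check is easy to script once a mechanism loads; a sketch, where `mech.yaml` and the temperature range are placeholders:

```python
import numpy as np
import cantera as ct

gas = ct.Solution("mech.yaml")  # placeholder mechanism file

# Verify that all forward rate constants stay finite and positive over the
# temperature range the simulation will actually visit.
for T in np.linspace(300.0, 2500.0, 45):
    gas.TP = T, ct.one_atm
    kf = gas.forward_rate_constants
    if not (np.all(np.isfinite(kf)) and np.all(kf > 0)):
        print(f"problematic rate constant(s) at {T:.0f} K")
```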
Thanks, @cory-kinney ... from my perspective, I agree with what you said. When it comes to using reaction mechanisms, the rule is caveat emptor. There should be a way to disable errors, proceed with warnings, and take results with a grain of salt. Whether Cantera throws errors at 200 K or 300 K is certainly up for discussion. In the case at hand, the coefficient values do raise eyebrows. I don't see a reason to support something that numerically evaluates as zero ... and while the values chosen for the validation tests are arbitrary, this slope is quite slippery.
I think we need to distinguish between two categories of validation.
Some checks, like those for undeclared duplicate reactions or duplicate thermo data, are meant to detect things like copy/paste errors in input files. In this case, I agree that the user should have the ability to skip these validation tests. I think that whether an individual check is a warning or an error by default should probably be a case-by-case decision.
Others, like this, will lead to downstream errors every single time (if you assume the automatic determination of temperature bounds that I described above). You can't even evaluate the forward rate constants without getting a NaN. I'd rather not have to deal with the eventual support requests that say "I tore off the guardrails and then my simulation crashed".
I'll try to write up an Enhancement proposal with a more complete description of how I'd like to implement temperature bounds for kinetics calculations. I think there's also an extension of this to thermo that might be useful as well.
@speth sounds like a good plan! Would it be alright if I went ahead and submitted a pull request changing the lowest temperature in the current validation check from 200 K to 300 K for the time being?
Can we use a different but equivalent formulation to avoid log(0) errors?
In the P-log formulation, the final rate is

$$\ln k = \ln k_1 + (\ln k_2 - \ln k_1)\,\frac{\ln P - \ln P_1}{\ln P_2 - \ln P_1}$$

Because $k_1$ or $k_2$ is calculated as zero, the log function returns NaN or reports an error.
Usually we do not need the intermediate rate constants $k_1$ and $k_2$; we only need the final rate constant $k$, so why not skip $k_1$ and $k_2$?
Step 1: substitute in the log form of the modified Arrhenius expression. With the interpolation weight $w = (\ln P - \ln P_1)/(\ln P_2 - \ln P_1)$,

$$\ln k = \ln A_1 + w(\ln A_2 - \ln A_1) + \ln T\,\bigl[b_1 + w(b_2 - b_1)\bigr] - \frac{1}{RT}\bigl[E_1 + w(E_2 - E_1)\bigr]$$

(It is a widely used practice to precompute $1/T$, $\ln T$, and $\ln A_i$ to save time-consuming `exp()` calls.)
Step 2: exponentiate once: $k = \exp(\ln k)$.

This formulation only requires that $A_1$ and $A_2$ be nonzero; a sketch follows below.
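A minimal sketch of this idea (a hypothetical helper, assuming exactly two P-log entries bracketing the target pressure and $E_a$ in cal/mol):

```python
import numpy as np

def plog_log_interp(T, P, entry1, entry2, R=1.98720):
    """Interpolate ln(A), b, and Ea directly, so the individual rate
    constants k1 and k2 are never formed. entry = (P_i, A_i, b_i, Ea_i)."""
    (P1, A1, b1, E1), (P2, A2, b2, E2) = entry1, entry2
    w = (np.log(P) - np.log(P1)) / (np.log(P2) - np.log(P1))
    ln_k = (np.log(A1) + w * (np.log(A2) - np.log(A1))
            + np.log(T) * (b1 + w * (b2 - b1))
            - (E1 + w * (E2 - E1)) / (R * T))
    return np.exp(ln_k)

# The two entries of the reaction above whose rates underflow at 260 K:
r3 = (0.157895, 5.2481e196, -50.0, 2.436e5)  # P in atm
r4 = (0.315789, 4.0738e196, -50.0, 2.414e5)
print(plog_log_interp(T=260.0, P=0.2, entry1=r3, entry2=r4))  # finite, no NaN
```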
For Chebyshev reactions, I think this method would work too.
Hi @chengdi1988, thanks for the comments. In the normal rate calculation for these rates, Cantera does work directly with $\ln(k_f(P_1))$ and avoids the exponentiation step when possible (as well as implementing the other optimizations you've mentioned). However, when multiple rates are provided at a single pressure, you must compute $\ln(k_f(P_1)) = \ln\left[\sum_i k_{f,i}(P_1)\right]$ and there is no shortcut. You can see this in the implementation.
The same special case is not implemented in the validation checks, though there is no reason that it couldn't be. I think that would resolve the specific problem raised here, but not the more general issues with validating these rate constants described in https://github.com/Cantera/enhancements/issues/54.
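To make the point concrete, here is a hedged sketch of what the multiple-rates-per-pressure case forces; the helper and its units are assumptions, not Cantera's code:

```python
import numpy as np

def ln_kf_at_pressure(T, entries, R=1.98720):
    """When several Arrhenius expressions share one pressure, the table value
    is ln(sum_i k_i), so each k_i must be exponentiated before summing.
    entries: iterable of (A, b, Ea) with Ea in cal/mol (assumed units)."""
    k_sum = sum(A * np.exp(b * np.log(T) - Ea / (R * T)) for A, b, Ea in entries)
    return np.log(k_sum)  # -inf if every term underflowed to zero

# Two made-up entries at the same pressure:
print(ln_kf_at_pressure(1000.0, [(1e13, 0.0, 3.0e4), (1e10, 0.5, 1.0e4)]))
```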
What is "multiple rates are provided at a single pressure"? Is it a reasonable Plog reaction?
Yes, these show up in various mechanisms. Here are a couple of examples: https://github.com/Cantera/cantera/blob/055a530af437c3705a5ddff1e1d05b69346939cc/test/data/pdep-test.yaml#L255-L277
You are right. I see this described in the Ansys Chemkin-Pro theory manual, section 3.6.3, "General Pressure Dependence Using Logarithmic Interpolation".
Problem description
The AramcoMech 3.0 mechanism (and NUIGMech 1.1, which contains the same problematic reactions) fails to validate starting with the Cantera 2.6.0 release. This reaction mechanism did not cause issues with Cantera 2.5; the behavior specifically started between 2.6.0a4 and 2.6.0b2.
Steps to reproduce

Convert the mechanism with `ck2yaml`; the reaction that is causing this error is:
The pre-exponential factors are absurdly high, of course, which was discussed in the Cantera Google group discussion as a potential cause of the issue. As suggested by @bryanwweber, I calculated the corresponding reaction rates, and the rates should be positive, although they are smaller than `1e-100` if I recall correctly. It's possible that I calculated them incorrectly, but otherwise I do not see why Cantera should fail to validate the mechanism. Perhaps there is a precision error where the reaction rate is truncated and becomes zero, which might fail the validation?