Unrealistic ThermoData-to-Wilhoit fits when high B values are used

Certain cases seem to produce wacky Wilhoit fits. (below is using my older version of RMG-Py...different syntax, but essentially the same results):

>>> GAthermoData=thermo.ThermoGAData(7.82664135375*4184, 4.184*95.1880666, [43.3974477932*4.184, 57.8173868743*4.184, 70.3405199487*4.184, 80.7138798286*4.184, 89.3158731778*4.184, 96.548668945*4.184, 102.703236774*4.184, 107.979467115*4.184, 112.522333623*4.184, 116.444933555*4.184, 119.840135404*4.184, 122.786136786*4.184, 125.349336785*4.184])
>>> WilhoitData = thermo.convertGAtoWilhoit(GAthermoData, atoms=26, rotors=2, linear=False , fixedB=0, Bmin=300.0, Bmax=3000.0)
>>> WilhoitData.getHeatCapacity(5000)
-778.35237464421903
>>> WilhoitData = thermo.convertGAtoWilhoit(GAthermoData, atoms=26, rotors=2, linear=False , fixedB=0, Bmin=300.0, Bmax=1500.0)
>>> WilhoitData.getHeatCapacity(5000)
624.60869279226233
>>> print WilhoitData
ThermoWilhoitData(33.26,623.6,0.666,-10.26,14.78,-7.879,-1.384e+006,-3588,'Wilhoit function fitted to GA data with Cp0=33.2579 and Cp_inf=623.585. RMS error = 0.021*R. ',B=535.6)
>>> WilhoitData2 = thermo.convertGAtoWilhoit(GAthermoData, atoms=26, rotors=2, linear=False , fixedB=0, Bmin=300.0, Bmax=3000.0)
>>> print WilhoitData2
ThermoWilhoitData(33.26,623.6,-54.5,303.4,-727,659.1,2.594e+009,-4614,'Wilhoit function fitted to GA data with Cp0=33.2579 and Cp_inf=623.585. RMS error = 0.028*R. ',B=3000)

The above shows how when B is allowed to float up to 3000, there is an unrealistic Wilhoit solution which produces a negative heat capacity at T=5000 K. This is apparently due to the fact that when B goes as high as 3000 K, the training data, scaled according to y=T/(T+B) only goes up to y=1/3; between y=1/3 and y=1, there is no training data, except the constraint that the result be CpInf at y=1. In this case, the fitted polynomial has taken on a value producing negative heat capacities during part of the range between y=1/3 and y=1.

Also of note: the above shows that an even better fit may be obtained with B=535.6 K. Further investigation shows that the objective function has two minima on the range 300-3000K (one at 535.6K and one at the upper boundary (3000 K)) and two maxima (one at the lower boundary, and one around 1600 K). The optimization algorithm used, fminbound, seems to perform its initial function evaluations around the midpoint of the range, and doesn't use an initial guess per se.

Several possible approaches for addressing this were considered and discussed with @jwallen. We certainly think an argument can be made for reducing the allowable B range. B=1500 K seems like the most reasonable cutoff, as the training data goes up to 1500 K , meaning that the scaled y would take on maximum values of at least 0.5. There may also be an argument for lowering the lower bound to T=0 K so the initial guesses are closer to the 500 K "typical" value, but this could have unknown, undesirable side effects. So, the proposed solution is to use the bounds of the training data, 300 and 1500 K, as the (default) bounds for B with the fminbound optimizer. If further problematic cases arise, there may be an argument for setting the upper bound for B even lower for for fixing B at 500 K.

Other possible approaches include: -use "brute" optimizer (may be slower) -use an optimizer like fmin with an initial guess of B=500 K (the "typical" value) (no guarantee that I know of that this couldn't produce an arbitrarily high B value

I'm going to work on implementing proposed solution and testing on a couple of other cases, and I'll see how it goes.

ReactionMechanismGenerator / RMG-Py

Unrealistic ThermoData-to-Wilhoit fits when high B values are used #63