Stefan-Endres / DWPM-Mixture-Model

Phase separation calculation using the DWPM mixture rule.

Dealing with 'nan' values and RuntimeWarnings during optimisation #40

Open Stefan-Endres opened 8 years ago

Stefan-Endres commented 8 years ago

I've started working on more polar systems which are so non-ideal that the Gibbs surface is not defined for most parameters (which incidentally also destroys a few proofs I was working on due to the lack of continuity). This ruins the Lagrange plane optimisation due to nan - nan behaving as a zero float when added to the objective function's summation (a scalar float output).

Essentially what I'm trying to do now is to catch any RuntimeWarning like:

ncomp.py:523: RuntimeWarning: invalid value encountered in log
  - s.m['a'] / (p.m['R'] * s.m['T'] * V))
G_sol_I = [ 0.32783229]
G_sol_II = [ nan]

This can be done with exception handling after a few simple modifications (see http://stackoverflow.com/questions/15933741/how-do-i-catch-a-numpy-warning-like-its-an-exception-not-just-for-testing), and penalties can then be added to the objective function. A rough sketch of what I mean follows.
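
As a minimal sketch (the function and variable names here are placeholders, not the actual ncomp.py API):

```python
import warnings
import numpy

def penalised_objective(x, gibbs_surface, penalty=1e4):
    """Evaluate the Gibbs objective, converting numpy RuntimeWarnings
    (e.g. invalid value encountered in log) into a large penalty."""
    with warnings.catch_warnings():
        # Promote RuntimeWarnings to exceptions so nan-producing
        # evaluations are caught instead of silently propagating.
        warnings.simplefilter("error", RuntimeWarning)
        try:
            obj = gibbs_surface(x)
        except RuntimeWarning:
            return penalty
    # Guard against nan values that appear without a warning.
    if numpy.isnan(obj):
        return penalty
    return obj
```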

The question now becomes how to deal with this on a high level optimisation, the options as I see them are:

  1. Catch the Warning on a high-level function evaluation and add an arbitrarily high penalty to the objective function. Since there are an extremely large number of cases where the function is not defined, it will be difficult to know what is actually causing the error and the penalty will have to be arbitrary, although we might be able to find meaningful values by evaluating every single state variable.
  2. Catch the Warning on a low level inside every subfunction of the Gibbs equation. This has the advantage that we can apply more meaningful penalties (e.g. abs([invalid log value])). However, this will require a major rewrite of most functions wherein we either
    • Have several outputs passed to higher functions instead of the single floats most functions currently output.
    • Wrap every function inside a class that has a built-in penalty tracker which resets at every high-level evaluation (a rough sketch of this follows the list).
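
A rough sketch of the penalty-tracker idea in the second option (names are illustrative, not from the existing code):

```python
import numpy

class PenaltyTracker:
    """Accumulates penalties from low-level subfunctions of the Gibbs
    equation; reset at the start of every high-level evaluation."""

    def __init__(self):
        self.penalty = 0.0

    def reset(self):
        self.penalty = 0.0

    def safe_log(self, value, eps=1e-30):
        # Record the violation and evaluate a penalised surrogate,
        # e.g. abs([invalid log value]), instead of returning nan.
        if value <= 0.0:
            self.penalty += abs(value)
            return numpy.log(abs(value) + eps)
        return numpy.log(value)
```

The high-level objective would then add the accumulated tracker penalty to its return value after each evaluation.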

Please let me know what you think.

alchemyst commented 8 years ago

Are there no thermodynamic limits that we can impose that will at least allow us to ensure a defined value for the Gibbs surface? We can track backwards from the terms that cause NaN values and impose constraints based on that. For instance, if there is a value with a square root, we can infer that the whole term under the surd is supposed to be positive. These constraints admit more direct handling. I think the "black box" approach of just evaluating the whole function is obviously useful since we have working versions of the code, but since we have the equations, we should be able to predict where the bad places are a bit better.

alchemyst commented 8 years ago

To be more direct: the bad things will only happen with logs and fractional powers. So find all those places and place limits on them.
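
Concretely, something along these lines (the term names are placeholders for whatever actually appears under the logs and surds):

```python
import numpy
from scipy.optimize import minimize

def log_arg(x):
    # Placeholder for a term that appears inside a log or fractional
    # power in the Gibbs expressions (e.g. V - b or a mole fraction).
    return x[0]

def objective(x):
    # Dummy objective containing the problematic log term.
    return -numpy.log(log_arg(x)) + x[0] ** 2

# Keep the argument strictly positive instead of catching the warning.
constraints = [{'type': 'ineq', 'fun': lambda x: log_arg(x) - 1e-12}]

result = minimize(objective, x0=[0.5], method='SLSQP',
                  constraints=constraints)
print(result.x)
```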

Stefan-Endres commented 8 years ago

As an update, I've since tried to catch any problematic parameters in the objective function based on the previous analytical work we did. I'm still getting a lot of issues, mainly from the root solver, which is invoked when the analytical cubic solutions fail or when there are fewer than 3 volume roots above the "critical point" (as a function of the Van der Waals parameters/state variables, not the true data point).

Another issue is that the composition values occasionally step slightly outside the bounds in the local solvers (I've had this with both SLSQP and L-BFGS-B, usually with overflow warnings etc.), so for example when $x_1$ steps out by even $10^{-10}$, $x_2$ is calculated as negative in the state update function, causing negative log arguments in the ideal Gibbs energy function. I've since plastered over this issue, but I should probably investigate why it happens in the first place and find a better solution than replacing the values with arbitrarily small ones.
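
One cleaner fix (purely illustrative, not what is in the code now) might be to project the independent composition back onto its bounds before the state update:

```python
import numpy

def project_composition(x, eps=1e-30):
    """Clip the independent mole fraction(s) back into [eps, 1 - eps] so
    that the dependent fraction x_2 = 1 - x_1 cannot become negative when
    a local solver (SLSQP, L-BFGS-B) steps marginally outside its bounds."""
    x = numpy.asarray(x, dtype=float)
    return numpy.clip(x, eps, 1.0 - eps)
```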

We'll need to rethink the strategy before resorting to arbitrarily high exception penalties, as currently the (global + local) optimisations only converge for narrow bounds on the interaction parameters (ironically, the r and s parameters can be allowed to vary several orders of magnitude more than $k_{ij}$) and do not provide acceptable fits on all the systems we want. To avoid table-top error functions (flat plateaus in the error surface) we could make this penalty drive the parameters towards zero, but I'm not sure how sensible that would be, considering the point is to look for local minima at the extremes of the bounds.

It might also be possible to backtrack to what's causing the errors and implement functional constraints (analytically, rather than with the numerical exception handling), if that's worth the investment.