jamesorr / CO2SYS-MATLAB

CO2SYS software for MATLAB (or octave) to compute variables of ocean CO2 system
MIT License
29 stars 11 forks source link

[Suggestion] More robust root finding for pH evaluation #3

Open briochemc opened 2 years ago

briochemc commented 2 years ago

Not sure this is useful to anyone but me, but I found out that sometimes the Newton solver for evaluating pH hangs when the other tracer values are outside of some acceptable range. I think this might be due to the current backtracking implementation where the Newton step is halved until the pH step deltapH is less than 1:

https://github.com/jamesorr/CO2SYS-MATLAB/blob/b62ad9e3147ba5e95fb03c6b241d4afc57fe95ce/src/CO2SYS.m#L1525-L1530

However, I think that even an overshoot in deltapH of less than 1 can still be too large and result in some never-ending cycle (illustrated below).

I think this is because the criteria for reducing the step size should be on the next residuals. Thus, maybe this backtracking line search could be replaced with one for which the criteria would be on the next residual instead?

Another option would be to write a function that returns the residual for a given pH and then use an existing Newton-solver that already has that criteria implemented (e.g., C. T. Kelley's nsold.m, which uses a quadratic approximation of the residual along the Newton-step direction for a rather efficient backtracking IMHO).

briochemc commented 2 years ago

I now believe the actual issue is more likely due to the inexact slope.

In the Newton solve for pH from ALK, as per the code comments (and numerically confirmed below), the slope is inexact:

https://github.com/jamesorr/CO2SYS-MATLAB/blob/b62ad9e3147ba5e95fb03c6b241d4afc57fe95ce/src/CO2SYS.m#L1521-L1524

This is OK when PO4 is realistic. (The PO4 terms are ignored in the slope because they are supposedly orders of magnitude lower than other terms.) However, this might cause problems for (unrealistically) large PO4 values. (In my work I sometimes have to numerically evaluate pH for very large PO4 concentrations). Large PO4 seems to exacerbate the inexactitude of the slope. This can cause the Newton solver to get stuck similarly to the diagram in my comment above.

Here is a plot of the residuals and where I sort-of traced the newton iterations (light purple) looking for the pH solution where the residuals are zero for some given input values:

Screen Shot 2022-06-30 at 12 21 16 pm

This shows that at least N = 1000 iterations are taken inside CO2SYS's Newton solver. (CO2SYS hangs indefinitely if I don't put a limit on the # of iterations.) Considering how well-behaved the residual shape looks (monotonous although a little flat in some parts), I thought I'd test a more accurate slope. In black I show the same trace of newton iterates using Julia's ForwardDiff (machine-precision accurate), and it shows that for the same input parameters, it finds the pH solution in 7 steps. (Note I apply the same backtracking line-search, which is why traces don't align with the derivative of the residuals in both cases.)


FWIW, here are the slopes as computed by CO2SYS and by ForwardDiff:

Screen Shot 2022-06-30 at 12 37 12 pm

and here is the relative error, of about 95% through most of the pH range:

Screen Shot 2022-06-30 at 12 40 25 pm

To be clear, I don't really need to know what the pH is for very large PO4 values, I only need a numerical solution (which requires CO2SYS to not hang) even if I most likely will discard that value later in my modelling. Anyway, I will likely try and fix it on my side for my needs and might end up submitting a PR to fix it here. It cannot hurt to have an exact slope I guess!