Address numerical instability

iangrooms commented 3 years ago

@hmkhatri (or someone else? I can't recall) evidently found that our method is unstable at large n_steps and @NoraLoose then showed that changing the floating point precision affects the instability. Specifically, increasing (decreasing) precision makes the method more (less) stable. This demonstrates that it's really a roundoff problem. Numerical instability of time stepping methods is completely different and happens even in exact arithmetic.

I think the instability probably results from "catastrophic cancellation." Catastrophic cancellation happens when you subtract two numbers that are almost the same, and results in a loss of precision. To understand this, consider filtering an eigenvector of the discrete Laplacian q, such that Lq = -k_i^2q. At some step in the process you have to evaluate

q + (1/s_i)Lq =q - (k_i^2/s_i)q

If k_i^2 is almost equal to the polynomial root s_i, then you get catastrophic cancellation.

It's probably worse at larger n_steps because there are more roots s_i, and it's more likely that one of them is close to k_i^2. I also think it will be worse for large filter scales because the polynomial needs lots of roots in the range of resolved wavenumbers. It would be worth checking that the instability sets in at smaller n_steps when the filter scale increases.

The usual way to "fix" catastrophic cancellation is to use a different formula. I don't see a good way to do that for our filter, but at least I think this is the cause of the error. I will try to develop a numerical example to illustrate the problem.

Ultimately we need to address this somehow in the paper.

iangrooms commented 3 years ago

I worked on this today and discovered that it's not as simple as I thought, i.e not catastrophic cancellation. Our filter operates in stages. Each stage basically multiplies by $1-k^2/s_i$ in Fourier space where $s_i$ is a polynomial root. Some stages strongly amplify the small scales, while others damp them out. In exact arithmetic the small scales are damped at the end of the process. The stages that amplify small scales are also the ones that damp large scales, so these amplifying stages are a particular problem when our filter tries to damp out a very large range of scales.

If you put all the amplifying stages first then the small scales can grow by many orders of magnitude during these stages. No single amplifying stage causes much growth, but if you have (eg) ~20 amplifying stages, each of which amplifies by a factor of 4, then there's a huge net growth. The remaining stages try to damp the small-scales back out, but the huge disparity in amplitude between large and small scales leads to roundoff errors that contaminate the large scales. So even if you do manage to damp out the small scales, what's left at large scales is junk.

If you put all the amplifying stages last, then the first stages damp the small scales almost to zero, but not quite because of roundoff errors. Then the final amplifying stages can amplify the noise at small scales until it blows up the solution.

To fix (or at least ameliorate) the problem, you can interleave the damping and amplifying stages. Start with one that damps small scales, then do one that amplifies small scales, and so on until complete. The point is to avoid lots of back-to-back amplifying stages that would cause the small scales to grow by several orders of magnitude. If you damp then amplify, then damp, then amplify, the small scales never achieve a huge amplitude that introduces roundoff errors.

In my limited tests this significantly improves the numerical stability. I've posted a notebook here that explores and explains the behavior in a 1D periodic example. The notebook doesn't require the gcm-filters package, or even anything from this repository to run; you should be able to run it as-is using only numpy and matplotlib.

Do we want to implement this in gcm-filters? If so, then I guess we should open an issue there. This will require updating the guts of _compute_filter_spec. Some of the code from my notebook (linked above) could probably be re-used. It will also require updating _create_filter_func so that Laplacian and biharmonic steps can be interleaved.

Assuming we do want to implement this then I'll also have to add a section to the paper. Comments welcome! (Sorry for the long post.)

rabernat commented 3 years ago

Great work digging to the bottom of this issue! Your explanation makes sense to me.

Do we want to implement this in gcm-filters?

Definitely, yes! But we don't have to do it right away. Given limited time, getting this paper done should probably be the first priority.

iangrooms commented 3 years ago

OK, I will add a section on this topic to the paper, but will not hold the paper until it's into gcm-filters.

iangrooms commented 3 years ago

I've just added a notebook that creates a figure to go with the new section of the paper (on numerical stability). The section is still in progress, pending completion of the figure. The notebook is here. I'd be grateful if someone with more experience plotting in Python could massage the figure a little to make it more publication-ready, or at least just take a look and see if there are any quick and easy improvements that could be made.

NoraLoose commented 3 years ago

I gave this a quick go - let me know what you think. In the notebook, I experimented with some other color maps too.

instability_hot

iangrooms commented 3 years ago

The section in the paper is written and the figure has been added, so I'm going to close this issue. We still need to update the code in the gcm-filters package.

ocean-eddy-cpt / gcm-filters-paper

Address numerical instability #24