scipy / scipy

SciPy library main repository
https://scipy.org
BSD 3-Clause "New" or "Revised" License

Have RBFInterpolator check for an ill-conditioned matrix? #14150

Open treverhines opened 3 years ago

treverhines commented 3 years ago

When using one of the RBFInterpolator kernels that is not scale invariant ('gaussian', 'multiquadric', 'inverse_multiquadric', 'inverse_quadratic'), the interpolant can become corrupted with numerical noise if epsilon is too small. This is a well-known limitation of RBF interpolation rather than a problem with the implementation. However, I am wondering whether a warning should be raised when the underlying linear system is ill-conditioned.

There are costs to adding this warning. Checking the condition number (using the same procedure as in scipy.linalg.solve) makes RBFInterpolator about 10-20% slower (based on the benchmark). Also, the warning may become a nuisance to users who are already aware that ill-conditioning is an issue with RBF interpolation. In particular, I can imagine this warning being a nuisance when using cross-validation to search for a good epsilon value.
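
For reference, here is a rough sketch of the kind of rcond check that scipy.linalg.solve performs; this is a simplified illustration, not the exact code in my branch:

import warnings

import numpy as np
from scipy.linalg import LinAlgWarning, lu_factor, lu_solve
from scipy.linalg.lapack import dgecon

def solve_with_cond_check(lhs, rhs):
    # Estimate the reciprocal condition number from the LU factors,
    # mirroring what scipy.linalg.solve does internally.
    anorm = np.linalg.norm(lhs, 1)  # 1-norm of the original matrix
    lu, piv = lu_factor(lhs)
    rcond, info = dgecon(lu, anorm, norm='1')
    if rcond < np.finfo(lhs.dtype).eps:
        warnings.warn('the interpolation matrix is ill-conditioned',
                      LinAlgWarning)
    return lu_solve((lu, piv), rhs)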

If this is something we want (for 1.7 or 1.8), I have a branch with the warning implemented: https://github.com/treverhines/scipy/blob/rbfinterp-ill-conditioned-warning/scipy/interpolate/_rbfinterp.py

The script below demonstrates this ill-conditioning:

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RBFInterpolator

def test_function(x):
    y = 4.26*(np.exp(-x) - 4*np.exp(-2*x) + 3*np.exp(-3*x))
    return y

# 20 scattered observation points and a dense evaluation grid on [0, 3]
rng = np.random.RandomState(0)
xobs = rng.uniform(0.0, 3.0, (20, 1))
xitp = np.linspace(0.0, 3.0, 1000)[:, None]

yobs = test_function(xobs[:, 0])
ytrue = test_function(xitp[:, 0])

# a small epsilon with the scale-dependent 'gaussian' kernel makes the
# interpolation matrix ill-conditioned
yitp = RBFInterpolator(
    xobs,
    yobs,
    epsilon=0.1,
    kernel='gaussian',
    )(xitp)

plt.figure(1)
plt.plot(xitp[:, 0], ytrue, 'k--', label='truth')
plt.plot(xitp[:, 0], yitp, 'C0-', label='interpolant')
plt.plot(xobs[:, 0], yobs, 'ko', label='observed')
plt.xlim(0, 3)
plt.grid()
plt.legend()
plt.show()

[Figure 1: truth (dashed), interpolant (solid), and observed points; the interpolant is visibly corrupted by numerical noise.]

tupui commented 3 years ago

Interesting, this seems useful. Since there is a noticeable performance penalty, I would suggest adding a parameter to enable or disable the check. It could be on by default; power users who look through the parameters could then disable it.
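
For example, something like this, reusing xobs and yobs from the script above (check_condition is a hypothetical name, not an existing RBFInterpolator parameter):

from scipy.interpolate import RBFInterpolator

# hypothetical opt-out flag; `check_condition` is not part of the
# current RBFInterpolator API
interp = RBFInterpolator(xobs, yobs, epsilon=0.1, kernel='gaussian',
                         check_condition=False)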

treverhines commented 3 years ago

Another issue with adding a warning about the condition number is that a large condition number (>1e16) does not always indicate a numerically unstable solution. For example, in the branch I mentioned above, a warning will be thrown when using 'multiquadric' for kernel and a very large epsilon, even though there may be no sign of numerical noise in the solution. Similarly, the condition number does not seem to be useful (at least by itself) for determining whether the solution is stable when kernel is 'linear', 'thin_plate_spline', 'cubic', or 'quintic'. My branch does not check the condition number when using one of those kernels, which means that the system can be ill-conditioned and a warning will not be raised.

My point is that it seems tricky to come up with a reliable method of warning the user if and only if the solution is unstable. I will have to think about this some more before I make a PR.

ilayn commented 3 years ago

I don't know the context of this problem, but is this ill-conditioned with respect to inversion? The rcond check is an estimate and does not provide a robust metric, hence the vague warning in linalg.solve. If there is a decent way to detect instability, that would be my preference. Or check the condition number after instability has been detected (assuming that is feasible)?

treverhines commented 3 years ago

> I don't know the context of this problem, but is this ill-conditioned with respect to inversion?

Yes. I think, in essence, my challenge is determining whether the following block matrix system of equations is ill-conditioned:

[ K   P ] [a]   [d]
[ P^T 0 ] [b] = [0]

where K is an (n, n) matrix, P is (n, m), and m <= n. P is normalized to have values between -1 and 1. The values in K are not normalized. I am noticing that when the values of K are very large or very small, the condition number of the matrix can be large (>1e16); however, the solution for a and b is still accurate to machine precision. I could divide K by its maximum absolute value (and then correspondingly scale the solution for a), which reduces the condition number and makes it a more accurate indicator of whether the solution for a and b is stable. However, it is quite a bit of extra work to normalize K, and it seems like there should be a smarter solution.

> If there is a decent way to detect instability, that would be my preference. Or check the condition number after instability has been detected (assuming that is feasible)?

AFAIK, the condition number is the best way to detect an unstable RBF interpolant, despite its limitations.

ilayn commented 3 years ago

> I could divide K by its maximum absolute value (and then correspondingly scale the solution for a), which reduces the condition number and makes it a more accurate indicator of whether the solution for a and b is stable. However, it is quite a bit of extra work to normalize K, and it seems like there should be a smarter solution.

If I understand you correctly, you are looking for scipy.linalg.matrix_balance. That is used by practically all eigenvalue solvers as a preconditioning step.
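
For instance, on a badly scaled (but nonsingular) matrix:

import numpy as np
from scipy.linalg import matrix_balance

A = np.array([[1.0, 1e10],
              [1e-10, 2.0]])
# matrix_balance returns a similarity transform of A (the scaling is
# diagonal with power-of-2 entries) whose row/column norms are closer
B, T = matrix_balance(A)
print(np.linalg.cond(A))  # ~1e20
print(np.linalg.cond(B))  # ~10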

treverhines commented 3 years ago

> If I understand you correctly, you are looking for scipy.linalg.matrix_balance. That is used by practically all eigenvalue solvers as a preconditioning step.

That sounds almost like what I am talking about, except I am normalizing a block rather than rows/columns. Here is an example of a system that is well-posed but has a poor condition number:

import numpy as np

K = 1e10*np.random.random((5, 5))
P = np.random.random((5, 2))
d = np.random.random((5,))
LHS = np.block([[K, P], [P.T, np.zeros((2, 2))]])
rhs = np.hstack((d, np.zeros((2,))))

np.linalg.cond(LHS) gives me about 3.4e21. When I run scipy.linalg.matrix_balance on LHS, it gives me the same matrix back. In my comment above, I was suggesting normalizing the K block in LHS as

scale = np.max(np.abs(LHS[:5, :5]))  # largest absolute value in the K block
LHS[:5, :5] /= scale

which gives me a condition number of about 155.0 for LHS. I then scale the solution as

soln = np.linalg.solve(LHS, rhs)
a, b = soln[:5], soln[5:]
a /= scale  # the scaled system returns scale*a, so undo that factor

I am not happy with this rescaling approach: it is a lot of extra work just to get a nicer condition number, and it does not make the solution any more stable (the solutions with and without rescaling agree to within about 1e-14).
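
For completeness, here is the comparison as one runnable script (seeded for reproducibility; the exact numbers depend on the seed):

import numpy as np

np.random.seed(0)
K = 1e10*np.random.random((5, 5))
P = np.random.random((5, 2))
d = np.random.random((5,))
LHS = np.block([[K, P], [P.T, np.zeros((2, 2))]])
rhs = np.hstack((d, np.zeros((2,))))

# solve the badly scaled system directly
soln_direct = np.linalg.solve(LHS, rhs)

# rescale the K block, solve, then undo the scaling on a
scale = np.max(np.abs(K))
LHS_scaled = LHS.copy()
LHS_scaled[:5, :5] /= scale
soln_scaled = np.linalg.solve(LHS_scaled, rhs)
soln_scaled[:5] /= scale  # a picked up a factor of `scale`; remove it

print(np.linalg.cond(LHS))         # enormous, ~1e21
print(np.linalg.cond(LHS_scaled))  # modest, ~1e2
print(np.max(np.abs(soln_direct - soln_scaled)))  # ~1e-14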

ilayn commented 3 years ago

That's expected, since your conditioning comes from the structure of the problem, which is essentially

[ 1e10  1 ] [a]   [d]
[ 1     0 ] [b] = [0]

which is more or less equivalent to

[ 1      1e-10 ] [a]   [e]
[ 1e-10      0 ] [b] = [0]

This is destined to be ill-conditioned if you don't do the extra work. Otherwise you are practically solving b = d, and the contribution of a can be neglected.
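
A quick numerical check of this caricature:

import numpy as np

# the second row forces a = 0, so the first row reduces to b = d;
# the solution is exact even though the matrix is terribly conditioned
A = np.array([[1e10, 1.0],
              [1.0, 0.0]])
d = 3.0
a, b = np.linalg.solve(A, np.array([d, 0.0]))
print(np.linalg.cond(A))  # ~1e20
print(a, b)               # a ~ 0.0, b ~ 3.0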

ev-br commented 2 years ago

So what would be the conclusion here, Trever, Ilhan, Pamphile?

ilayn commented 2 years ago

Unfortunately, this is a structural problem, and whatever gets implemented will handle some of the cases and leave the rest out. Hence, if there is a sweet spot between performance and convenience, I am OK with whatever the consumers of this tool choose. I laid out a generic linear algebra issue above, and I don't claim to have known anything about RBFInterpolator when I wrote it. In general, it seems to me that some tweaking is unavoidable regardless.