treverhines / RBF

Python package containing the tools necessary for radial basis function (RBF) applications

Reuse interpolator for repeated evaluation of different variables #15

Open Aquan1412 opened 4 years ago

Aquan1412 commented 4 years ago

Is there a way to reuse the rbf interpolator (or parts of it) to speed up repeated evaluation of different variables between the same two meshes/point distributions? I want to interpolate results from a numerical simulation between two different grids. The two grids are always the same, but since there are several variables, I currently have to run the whole interpolation in a loop:

import numpy as np
from rbf.interpolate import RBFInterpolant

for key in oldMesh.keys():
    # build a fresh interpolant for every variable on the old mesh
    interpolator = RBFInterpolant(
        np.array([oldMesh["X"], oldMesh["Y"]]).T,
        oldMesh[key],
        phi="phs3",
        order=1,
        sigma=0.0001,
    )
    # the evaluation points are the same on every iteration
    interpolationPositions = np.reshape(
        (xx, yy), (2, nInterpolationPoints*nInterpolationPoints)
    ).T
    fieldInterpolated = interpolator(interpolationPositions)
    newMesh[key] = fieldInterpolated.flatten()

Is there a way to speed this up?

treverhines commented 4 years ago

It is definitely possible to make this faster. When we first instantiate the RBFInterpolant, we can cache the LU decomposition and reuse it to build the interpolant for new variables. I added this functionality to a new branch called "faster_rbf_interpolation". In this new branch, RBFInterpolant has a method called fit_to_new_data (I am open to other name suggestions), where you can fit a new interpolant assuming that the observation points are the same as when you instantiated it. So your code would look something like this:

interpolator = None
for key in oldMesh.keys():
    if interpolator is None:
        # first variable: build the interpolant and cache the LU decomposition
        interpolator = RBFInterpolant(
            np.array([oldMesh["X"], oldMesh["Y"]]).T,
            oldMesh[key],
            phi="phs3",
            order=1,
            sigma=0.0001,
        )
    else:
        # subsequent variables: reuse the cached decomposition
        interpolator.fit_to_new_data(oldMesh[key])

    interpolationPositions = np.reshape(
        (xx, yy), (2, nInterpolationPoints*nInterpolationPoints)
    ).T
    fieldInterpolated = interpolator(interpolationPositions)
    newMesh[key] = fieldInterpolated.flatten()
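
To illustrate why caching the LU decomposition helps (a generic sketch with SciPy, not the library's internals): the interpolation matrix depends only on the observation points, so once it is factored, solving for the coefficients of each new variable is comparatively cheap.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

# A stands in for the dense RBF interpolation matrix built from the
# observation points; it does not depend on the data values
A = np.random.rand(100, 100) + 100 * np.eye(100)
lu, piv = lu_factor(A)              # expensive step, done once

for _ in range(8):                  # e.g. one pass per variable
    d = np.random.rand(100)         # new data values at the same observation points
    coeff = lu_solve((lu, piv), d)  # cheap solve reusing the factorization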

As for the evaluation points, it is a bit trickier. You could potentially save the RBF values at the interpolation points. I am envisioning that this could be done by memoizing the __call__ method of the RBF class. I would have to think about this some more.
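
As a rough illustration of the memoization idea (a minimal sketch, not the implementation in rbf.utils), a decorator can key cached results on the bytes of the input arrays, so repeated calls with the same evaluation points skip the expensive computation:

import hashlib
import numpy as np

def memoize_array_input(fn):
    # cache results keyed on the raw bytes of the input arrays
    cache = {}
    def wrapper(*arrays):
        key = tuple(
            hashlib.sha1(np.ascontiguousarray(a).tobytes()).hexdigest()
            for a in arrays
        )
        if key not in cache:
            cache[key] = fn(*arrays)
        return cache[key]
    return wrapper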

For now, could you test out the new branch and let me know if that helps you out?

Aquan1412 commented 4 years ago

Thank you for your quick reply and the new functionality. I just tried it for one of my cases, and it reduced runtime for the interpolation by a factor of four (for eight variables), so it seems to work well!

Concerning your question about the evaluation points, I'm not sure I understand what you mean. Do you want to reuse part of the interpolator even for different evaluation points? That would be nice to have, but for me the new branch already helped substantially.

treverhines commented 4 years ago

That is great to hear. There are a couple more things that I want to do to the new branch and then I will integrate it into master. For example, I want to add a flag in __init__ indicating whether or not to save the LU decomposition, since it can take up a lot of space.

I should have better explained my idea about the evaluation points. When you call the interpolant, a lot (or maybe most) of the time is spent evaluating the function rbf.basis.phs3 at your interpolation points. Since you are interpolating at the same points on each iteration, we can save and reuse the values of rbf.basis.phs3 at those points. I am going to try implementing this, and I will get back to you.

treverhines commented 4 years ago

Can you try the following code? This is my hacky solution to speed up evaluating the interpolants at the new mesh points:

import numpy as np
from rbf.basis import phs3
from rbf.interpolate import RBFInterpolant
from rbf.utils import MemoizeArrayInput

# cache a numerical function for phs3 in two-dimensional space
phs3._add_diff_to_cache((0, 0))
# memoize the numerical function that was just created
phs3._cache[(0, 0)] = MemoizeArrayInput(phs3._cache[(0, 0)])

interpolator = None
for key in oldMesh.keys():
    if interpolator is None:
        interpolator = RBFInterpolant(
            np.array([oldMesh["X"], oldMesh["Y"]]).T,
            oldMesh[key],
            phi="phs3",
            eps=np.ones_like(oldMesh["X"]),  # need to specify eps for memoization to work
            order=1,
            sigma=0.0001,
        )
    else:
        interpolator.fit_to_new_data(oldMesh[key])

    interpolationPositions = np.reshape(
        (xx, yy), (2, nInterpolationPoints*nInterpolationPoints)
    ).T
    fieldInterpolated = interpolator(interpolationPositions)
    newMesh[key] = fieldInterpolated.flatten()
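
To quantify the difference, one option is to wrap each version of the loop in a function and time it (a hypothetical harness; run_loop is not part of the library):

import time

start = time.perf_counter()
run_loop()  # hypothetical function wrapping one of the loops above
print(f"interpolation took {time.perf_counter() - start:.2f} s")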

Aquan1412 commented 4 years ago

First, thanks for the explanation; now I better understand what you're trying to achieve. Second, I just tested the new version, and now the speedup compared to the original loop is a factor of ~7.7 (for eight variables), pretty impressive!

One question about the cache: this probably depends on the dimension of the problem, so if I want to use it for 3D interpolation I would change it to the following, correct?

# cache a numerical function for phs3 in three-dimensional space
phs3._add_diff_to_cache((0, 0, 0))
# memoize the numerical function that was just created
phs3._cache[(0, 0, 0)] = MemoizeArrayInput(phs3._cache[(0, 0, 0)])

treverhines commented 4 years ago

That is correct.
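
As a side note inferred from the pattern above (not something spelled out in the thread): the tuple has one entry per spatial dimension, and all zeros means the undifferentiated basis function, so a 1-D problem would presumably use:

phs3._add_diff_to_cache((0,))
phs3._cache[(0,)] = MemoizeArrayInput(phs3._cache[(0,)])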