EconForge / interpolation.py


Interpolation on a 2D cloud of points #79

Closed · thomasaarholt closed this issue 3 years ago

thomasaarholt commented 3 years ago

In my current work, I have a large number (~1e6) of 2D real coordinates, each with an intensity (X, Y, I) (e.g. (0.1, 0.43, 5)). From them I am interpolating values on a regular grid with a shape of about (1000, 1000). To do this, I'm using scipy's griddata. Typical usage with a smaller number of points might look like:

import numpy as np
from scipy.interpolate import griddata

points = np.random.random((1000, 2))  # generate point cloud
values = np.random.random(1000)       # generate value associated with each point
# positions to be interpolated on: a regular 100 x 100 grid
grid_x, grid_y = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100), indexing='ij')
new_values = griddata(points, values, (grid_x, grid_y))  # perform interpolation

Is interpolation.py capable of solving this kind of problem? I suspect it can't, since there's no mention of Delaunay tessellation in the repo, but I thought I should ask!

albop commented 3 years ago

Thanks for your interest! No, there is no interpolation of scattered (non-gridded) data. In principle it would be possible to write numba code to speed up the evaluation phase and increase performance when evaluating at a very large number of points, but that wouldn't change the expensive part: performing the tessellation itself. In the example you give, it would probably not be worth it.

thomasaarholt commented 3 years ago

Indeed, you're right about the tessellation being the expensive part: it takes the majority of the time when interpolating new points for me. At least I have been able to take apart the griddata function and reuse the bits that I can, roughly as in the sketch below. Thanks for your comment!
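
For illustration, a minimal sketch of that kind of split using scipy's public API (the variable names here are illustrative, not from the thread; LinearNDInterpolator accepts a precomputed Delaunay triangulation, so the tessellation is paid once and reused across evaluations):

import numpy as np
from scipy.interpolate import LinearNDInterpolator
from scipy.spatial import Delaunay

points = np.random.random((1000, 2))
values = np.random.random(1000)

tri = Delaunay(points)                      # the expensive part: done once
interp = LinearNDInterpolator(tri, values)  # reuses the triangulation

# the cheap part: evaluate on any set of query points, as often as needed
gx, gy = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100), indexing='ij')
new_values = interp(gx, gy)                 # shape (100, 100); NaN outside the convex hull

griddata(..., method='linear') is essentially this pair under the hood, minus the reuse of the triangulation between calls.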

thomasaarholt commented 3 years ago

I worked out my issue. The problem is that my array of point positions wasn't C-contiguous. I must have tried it on my own data immediately and not noticed that the original example worked. Here is an example of it failing; a solution is to take a copy (points_B.copy()) to ensure the memory is C-contiguous:

import numpy as np
from interpolation.splines import eval_linear

points = np.random.random((10, 10, 2))  # random point data
values = points.mean(-1).copy()         # one value per grid node

# Reshape points in C order (the default): the result stays C-contiguous
points_A = points.reshape((-1, 2))
print(points_A.flags.c_contiguous, points_A.flags.f_contiguous)
# True False

# Reshape points in Fortran order (can happen implicitly - I didn't explicitly ask for this in my code)
points_B = points.reshape((-1, 2), order='F')
print(points_B.flags.c_contiguous, points_B.flags.f_contiguous)
# False True

grid = ((0.0, 1.0, 10), (0.0, 1.0, 10))
eval_linear(grid, values, points_A)         # succeeds
eval_linear(grid, values, points_B.copy())  # succeeds: .copy() defaults to C order
eval_linear(grid, values, points_B)         # fails
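
As a side note (not from the thread): an alternative to an unconditional copy is numpy's ascontiguousarray, which returns the input unchanged when it is already C-contiguous and makes a contiguous copy only when needed:

eval_linear(grid, values, np.ascontiguousarray(points_B))  # succeeds; copies only if needed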

thomasaarholt commented 3 years ago

Reopening in case you want to do something about it. This is an edge case, so I'm not fussy.

albop commented 3 years ago

@thomasaarholt: in the master branch, eval_linear should work with any data ordering.

thomasaarholt commented 3 years ago

Excellent! Looking forward to checking it out!