Parallelized batch processing

juldaani commented 3 years ago

Hey,

I'm wondering whether it could be possible to implement multi-core implementation for processing multiple batches of data simultaneously.

My current approach is:

import pynanoflann
import numpy as np

n_batches = 10
target = np.random.rand(n_batches, 10000, 3)
query = np.random.rand(n_batches, 2000, 3)

for i in range(n_batches):
    pts_target = target[i]
    pts_query = query[i]

    kd_tree = pynanoflann.KDTree(n_neighbors=1, metric='L2', leaf_size=20)
    kd_tree.fit(pts_target)
    d, nn_idx = kd_tree.kneighbors(pts_query)

Instead, I'd like to do something like that:

import pynanoflann
import numpy as np

n_batches = 10
target = np.random.rand(n_batches, 10000, 3)
query = np.random.rand(n_batches, 2000, 3)

kd_trees = pynanoflann.KDTree(n_neighbors=1, metric='L2', leaf_size=20)
kd_trees.fit(pts_target)
distances, nn_indexes = kd_trees.kneighbors(pts_query)

This would create a kd-tree for each batch in 'target'. Corresponding batches in 'query' then would be used to make nearest neighbor searches with corresponding kd-trees.

Currently, if I want to implement this kind of parallelized processing I have do it in python. For large data sets this is not a problem but for smaller data sets it will cause too much overhead. I assume that it would be much faster to implement in c++ side of the code.

Best regards Juho

u1234x1234 commented 3 years ago

Hi Juho,

I get your point, but it seems like a custom solution, which will require additional efforts as there is no such option in the original nanoflann library, so I have no plans for this now. Contributions are welcome.

u1234x1234 commented 3 years ago

Do you need kd_trees after queries? Or would such an interface work too?

import pynanoflann
import numpy as np

n_batches = 10
target = np.random.rand(n_batches, 10000, 3)
query = np.random.rand(n_batches, 2000, 3)

distances, nn_indexes = batched_search(target, query, n_neighbors=1, metric='L2', leaf_size=20)

juldaani commented 3 years ago

Hey Dmitrii,

Thanks for the very fast response.

In my current application I don't need the trees so they can be thrown away.

u1234x1234 commented 3 years ago

I added Parallelized batch processing.

pip install git+https://github.com/u1234x1234/pynanoflann.git@0.0.6

Full example: https://github.com/u1234x1234/pynanoflann/blob/master/tests/test_batched_kneighbors.py

Internally it works for the list of numpy arrays of arbitrary shapes ie:

target = [
  np.random.rand(10000, 3),
  np.random.rand(5000, 3),
  np.random.rand(40000, 3),
]

I've implemented a more general interface at the cost of small overhead.

juldaani commented 3 years ago

Hey,

I tested the implementation and it works very well. No bugs found. And have to say the library is now very fast with these multicore capabilities. As far as I know this is the fastest kd-tree implementation for python.

This helped me a lot and made my algorithm many times faster. Thank you very much.

u1234x1234 commented 3 years ago

I'm glad it helped you.

u1234x1234 / pynanoflann

Parallelized batch processing #3