Closed juldaani closed 3 years ago
Hi Juho,
I get your point, but it seems like a custom solution, which will require additional efforts as there is no such option in the original nanoflann library, so I have no plans for this now. Contributions are welcome.
Do you need kd_trees
after queries? Or would such an interface work too?
import pynanoflann
import numpy as np
n_batches = 10
target = np.random.rand(n_batches, 10000, 3)
query = np.random.rand(n_batches, 2000, 3)
distances, nn_indexes = batched_search(target, query, n_neighbors=1, metric='L2', leaf_size=20)
Hey Dmitrii,
Thanks for the very fast response.
In my current application I don't need the trees so they can be thrown away.
I added Parallelized batch processing.
pip install git+https://github.com/u1234x1234/pynanoflann.git@0.0.6
Full example: https://github.com/u1234x1234/pynanoflann/blob/master/tests/test_batched_kneighbors.py
Internally it works for the list of numpy arrays of arbitrary shapes ie:
target = [
np.random.rand(10000, 3),
np.random.rand(5000, 3),
np.random.rand(40000, 3),
]
I've implemented a more general interface at the cost of small overhead.
Hey,
I tested the implementation and it works very well. No bugs found. And have to say the library is now very fast with these multicore capabilities. As far as I know this is the fastest kd-tree implementation for python.
This helped me a lot and made my algorithm many times faster. Thank you very much.
I'm glad it helped you.
Hey,
I'm wondering whether it could be possible to implement multi-core implementation for processing multiple batches of data simultaneously.
My current approach is:
Instead, I'd like to do something like that:
This would create a kd-tree for each batch in 'target'. Corresponding batches in 'query' then would be used to make nearest neighbor searches with corresponding kd-trees.
Currently, if I want to implement this kind of parallelized processing I have do it in python. For large data sets this is not a problem but for smaller data sets it will cause too much overhead. I assume that it would be much faster to implement in c++ side of the code.
Best regards Juho