Closed juldaani closed 3 years ago
Hi Juho,
I've added an option for multicore querying kd_tree.kneighbors(pts_query, n_jobs=4)
:
pip install git+https://github.com/u1234x1234/pynanoflann.git@0.0.4
Full example: https://github.com/u1234x1234/pynanoflann/blob/master/tests/test_multithreaded.py#L19
The speed-up mostly depends on the number of queries. In the case of a small query set (2000, like in your example) it is slightly faster than single core performance, but for large query sets (I tried few millions of points) the speedup is almost linear.
Thanks for the fast response.
I will test this today after the dinner.
I think you are right that for small queries the performance doesn't differ much between multi and single core. However, I'm also using kd-trees for very large point clouds so this feature is more than welcome. Thanks a lot.
I tested the feature and it seems that there is a minor bug in the indexes.
If i run the following code the indexes do not match:
import pynanoflann
import numpy as np
a = np.array([[-1.65822375, 8.29091549, 1.69109333],
[-2.55392027, 9.14873791, 0.57508534],
[-0.48824483, 12.70754147, -0.77094924],
[ 2.60444188, 1.768641 , 1.20891106],
[ 3.2674644 , -4.37506008, -0.87698722],
[-5.2069087 , 7.67119074, -1.28311527],
[11.18597317, -5.59029245, 0.74802291],
[ 1.45142472, 5.49197388, 2.08605385],
[ 4.44315147, 12.89360046, 1.21553278],
[ 1.40066028, 6.81189775, 1.32086086],
[-0.96798885, 8.48570728, -1.57162476],
[ 7.39387703, 2.37927413, 1.0226326 ],
[ 0.9753269 , 8.57738113, 1.75330448],
[ 4.06463003, 11.8688879 , 0.10702024],
[-2.6403656 , 1.09355235, -1.06388557],
[-4.55271149, 9.36354256, 2.45670676],
[-3.24715614, -1.84484696, -0.88164014],
[ 0.67729777, -1.50539744, 0.43235415],
[-5.5375247 , 7.23835421, -0.67688894],
[-1.07082331, -3.00007129, -1.66152179],
[ 3.6330297 , -4.45702457, -0.62901032],
[ 1.88146007, 15.80526638, 1.91470706],
[ 6.26283598, 5.25627804, 0.94044268],
[ 7.60514402, -5.0185051 , 0.18425676],
[-0.50298601, 13.87367153, -1.1920346 ],
[-2.86218667, 5.47483587, -1.24373996],
[ 0.54232329, 15.88754368, 0.27608338],
[-3.91043758, 7.08590221, 2.5814743 ],
[-3.41587186, 8.19709778, 0.76717484],
[-0.99566239, 5.38674688, 1.80337858]])
b = np.array([[-0.96798885, 8.48570728, -1.57162476],
[ 7.39387703, 2.37927413, 1.0226326 ],
[ 0.9753269 , 8.57738113, 1.75330448],
[ 4.06463003, 11.8688879 , 0.10702024],
[-2.6403656 , 1.09355235, -1.06388557],
[-4.55271149, 9.36354256, 2.45670676],
[-3.24715614, -1.84484696, -0.88164014],
[ 0.67729777, -1.50539744, 0.43235415],
[-5.5375247 , 7.23835421, -0.67688894],
[-1.07082331, -3.00007129, -1.66152179],
[ 3.6330297 , -4.45702457, -0.62901032],
[ 1.88146007, 15.80526638, 1.91470706],
[ 6.26283598, 5.25627804, 0.94044268],
[ 7.60514402, -5.0185051 , 0.18425676],
[-0.50298601, 13.87367153, -1.1920346 ],
[-2.86218667, 5.47483587, -1.24373996],
[ 0.54232329, 15.88754368, 0.27608338],
[-3.91043758, 7.08590221, 2.5814743 ],
[-3.41587186, 8.19709778, 0.76717484],
[-0.99566239, 5.38674688, 1.80337858]])
kd_tree = pynanoflann.KDTree(n_neighbors=1, metric='L2', leaf_size=20)
kd_tree.fit(b)
d, nn_idx = kd_tree.kneighbors(a)
d2, nn_idx2 = kd_tree.kneighbors(a, n_jobs=4)
assert np.allclose(d, d2)
assert (nn_idx == nn_idx2).all()
Thank you for the feedback. Fixed:
pip install git+https://github.com/u1234x1234/pynanoflann.git@0.0.5
I tested the fix and it seems to work.
Thank you very much.
Feel free to reopen if you find any problems.
Hey,
Another feature I've in mind is if it is possible to implement parallellized multicore processing for queries. For instance, in Scipy cKDTree (https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.query.html#scipy.spatial.cKDTree.query) you can define number of cores with n_jobs parameter.
What I mean is that we could set 'n_cores' parameter when executing the query:
This would create a single kd-tree for the data in 'target' and then split the data in 'query' for multiple cores to do the queries.
Python implementation of this will introduce quite much overhead when working with relatively small data sets. I assume that this would be much faster when implemented in c++ side.
Best regards Juho