tataratat / napf

nanoflann wrapper for python and maybe fortran
https://tataratat.github.io/napf/
MIT License

How to convert quickly to a numpy array? #38

Open thechosenone98 opened 1 month ago

thechosenone98 commented 1 month ago

Your implementation is the fastest I've found, but I'm having trouble converting your output type to a homogeneous numpy ndarray. What I'm trying to get is the result of a radius search as a 2D array where each row has a set max length (it could be the length of the longest entry, I don't mind), but currently I have to resort to something like this:

def napf_result_to_numpy(result, max_knn):
    # Determine the number of elements and allocate the output array
    len_result = len(result)
    output_array = np.full(
        (len_result, max_knn), -1, dtype=np.int64
    )  # Pre-fill with a default padding value

    # Convert result to a flat list and calculate lengths
    flat_result = [item for sublist in result for item in sublist[:max_knn]]
    lengths = np.array([min(len(sublist), max_knn) for sublist in result])

    # Create row and column indices
    row_indices = np.repeat(np.arange(len_result), lengths)
    col_indices = np.concatenate([np.arange(length) for length in lengths])

    # Set the values in the output array
    output_array[row_indices, col_indices] = flat_result

    return output_array 
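For reference, one possible pure-numpy speedup of the helper above: flatten once with itertools.chain and compute the column indices arithmetically instead of concatenating one small np.arange per row (the per-row arange calls and the Python-level list comprehensions are the usual hot spots here). This is only a sketch under those assumptions, the function name is mine and it hasn't been profiled against the original:

```python
import numpy as np
from itertools import chain

def napf_result_to_numpy_fast(result, max_knn):
    # Truncate each row to max_knn up front
    truncated = [row[:max_knn] for row in result]
    lengths = np.fromiter((len(r) for r in truncated), dtype=np.int64,
                          count=len(truncated))
    total = int(lengths.sum())

    # One flat copy of all kept indices, no per-row numpy calls
    flat = np.fromiter(chain.from_iterable(truncated), dtype=np.int64,
                       count=total)

    # Column index of each flat element, computed arithmetically:
    # a global arange minus each row's starting offset in the flat array
    offsets = np.repeat(np.cumsum(lengths) - lengths, lengths)
    col_indices = np.arange(total) - offsets
    row_indices = np.repeat(np.arange(len(truncated)), lengths)

    out = np.full((len(truncated), max_knn), -1, dtype=np.int64)
    out[row_indices, col_indices] = flat
    return out
```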

and for large outputs this is very slow. To give an idea of how slow: I currently need to get the ball neighbours of one million 3D points, and the search itself takes ~400 ms using this:

kdt = napf.KDT(tree_data=data, metric="L2")
indices, dist = kdt.radius_search(queries, radius, return_sorted=False, nthread=-1)
return indices

But the conversion takes a whopping 6 seconds 🫠

If someone knows a faster way, I am all ears.

Thank you 😄

j042 commented 1 month ago

Hi :) I think it's best to implement/extend this on the C++ side.

1) Keeping the longest entry: since we don't know the longest entry a priori, we'd have to find the max length during the query loop and set the results aside, then loop again to fill out the output.

2) Setting a max length: this, on the other hand, doesn't require an additional loop, so it should be faster than 1).

My first thought would be: at each query, sort the results and return n_results in ascending order. However, based on the example you've shown (return_sorted=False), you are interested in any n_results within the radius, is that correct? In that case, one could write a custom ResultSet that exits the search after finding n_results. I guess this would be helpful if you have a lot of dense clusters.

I may find some time next week. Feel free to create a pull request meanwhile :)

thechosenone98 commented 1 month ago

Thx for the quick response :)

I think it should be possible to pass in a maximum number of neighbors as a parameter, like open3d's hybrid search allows, and return a 2D array with the row length set by this parameter (all remaining empty spots filled with -1). Like this:

kdt.radius_search(queries, radius, max_knn=8, return_sorted=False, nthread=-1)

[[10, 15, 16, 25, -1, -1, -1, -1],
 [48, 23, 85, -1, -1, -1, -1, -1],
 [2,  -1, -1, -1, -1, -1, -1, -1]]

Oh that's nice, I'll definitely give that a try too :)
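In the meantime, one way to emulate a hybrid search in pure numpy (borrowing open3d's semantics) is to run a plain knn search with k = max_knn and then mask out neighbours beyond the radius. A hedged sketch, the function name is mine and the distance convention is an assumption; in particular, if the L2 metric reports squared distances (as nanoflann does internally), pass radius ** 2 instead of radius:

```python
import numpy as np

def mask_knn_by_radius(knn_indices, knn_distances, radius):
    """Turn a fixed-k knn result into a hybrid-search-style result:
    neighbours farther than `radius` are replaced by -1 padding.
    `radius` must use the same convention as `knn_distances`
    (square it first if the distances are squared)."""
    out = np.asarray(knn_indices).copy()
    out[np.asarray(knn_distances) > radius] = -1
    return out
```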

j042 commented 5 days ago

nanoflann already has a perfect function for this. Can you try and see if you gain any performance with #39?