drprojects / point_geometric_features

Python wrapper around C++ utility to compute local geometric features of a point cloud
MIT License

pgeof2 #12

Closed rjanvier closed 7 months ago

rjanvier commented 7 months ago

This PR is a proposal. Some changes to pgeof were needed for use cases that go beyond the scope of the original implementation. We found them potentially useful for the SPT/pgeof user community, so we wanted to share them and propose integrating them into the upstream repo. If this seems like too much to review or accept inside the pgeof codebase, we are happy to maintain these changes as a public, friendly fork.

Infrastructure / Architecture changes

New features and Code Changes:

Tests

Benchmarks:

Here are some benchmark results, running Python 3.11 on Linux with an Intel 14700 CPU and the GCC 13.2 compiler:

drprojects commented 7 months ago

Hi @rjanvier

Thanks for the great work! I have a few comments to make before accepting this PR. These are mostly documentation-related and not game changers for the code.

About pgeof2 naming

This is not a big deal, but I would rather keep pgeof as the name rather than pgeof2 if possible, and use a new versioning major to indicate changes to the library instead.

Would this cause too many issues with the way you built the tests?

About the documentation

Overall, it would be good to have a bit more documentation for the library:

import numpy as np
import pgeof2

# Generate a random synthetic point cloud and its radius neighbors,
# capped at 50 neighbors per point
num_points = 10000
radius = 0.1
xyz = np.random.rand(num_points, 3).astype("float32")
knn, _ = pgeof2.radius_search(xyz, xyz, radius, 50)

# Convert the padded radius neighbors (-1 = no neighbor) to CSR format
nn_ptr = np.r_[0, (knn >= 0).sum(axis=1).cumsum()]
nn = knn[knn >= 0]
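For readers unfamiliar with the CSR layout used here, a tiny self-contained numpy sketch (a toy padded neighbor matrix, no pgeof dependency) may help illustrate the conversion. It assumes, as the snippet above does, that valid neighbor indices are left-packed in each row with -1 padding at the end:

```python
import numpy as np

# Toy padded k-NN matrix: each row lists a point's neighbor indices,
# -1 marks padding (fewer than k neighbors found within the radius)
knn = np.array([
    [1, 2, -1],
    [0, -1, -1],
    [0, 1, 2],
])

# CSR conversion: nn_ptr[i]:nn_ptr[i+1] slices point i's neighbors in nn
nn_ptr = np.r_[0, (knn >= 0).sum(axis=1).cumsum()]
nn = knn[knn >= 0]  # boolean indexing flattens in C (row-major) order

assert nn_ptr.tolist() == [0, 2, 3, 6]
assert nn.tolist() == [1, 2, 0, 0, 1, 2]
assert nn[nn_ptr[0]:nn_ptr[1]].tolist() == [1, 2]  # neighbors of point 0
```

The same two lines scale unchanged to the real padded output, since the row-major flattening of `knn[knn >= 0]` keeps each point's neighbors contiguous.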

About FRNN in SPT

I understand the interest of packaging K-NN and radius-NN functionalities along with this library. I am fine with making pgeof offer this functionality as an add-on for CPU-only applications, by depending on nanoflann if you benchmarked it as the best candidate.

Yet, I still hold what I mentioned in previous discussions: the GPU-based FRNN is still much faster than any CPU-based NN library I have benchmarked against, even including the CPU-GPU transfer time. Here is a little snippet to compare FRNN with sklearn. On my machine, FRNN runs more than 100x faster than sklearn.

from sklearn.neighbors import NearestNeighbors
import frnn
import numpy as np
import torch
from time import time

# Generate a random synthetic point cloud and radius neighbors
num_points = 100000
radius = 0.1
k_max = 50
xyz = np.random.rand(num_points, 3).astype('float32')

# Compute radius neighbors with scikit
start = time()
rneigh = NearestNeighbors(radius=radius).fit(xyz).radius_neighbors(xyz)
print(f"scikit: {time() - start:0.3f}")

# Compute radius neighbors with FRNN (including the time for CPU -> GPU copy)
torch.cuda.synchronize()
start = time()
xyz = torch.as_tensor(xyz).cuda()
distances, neighbors, _, _ = frnn.frnn_grid_points(xyz.view(1, -1, 3), xyz.view(1, -1, 3), K=k_max, r=radius)
torch.cuda.synchronize()
print(f"FRNN: {time() - start:0.3f}")

That being said, I agree that the installation of FRNN is a bit tedious and wouldn't mind moving to another GPU-based solution for SPT in the future (have not investigated it yet).

About tests

Maybe a test could be added for pgeof.radius_search too?
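Such a test might compare `pgeof.radius_search` against a brute-force reference. Below is a minimal sketch of only the numpy reference (the padded-with--1 return convention is an assumption based on the earlier snippet; the actual pgeof call and tolerance checks are left out):

```python
import numpy as np

def brute_force_radius_neighbors(queries, points, radius, k_max):
    """Reference: for each query, the indices of the up to k_max nearest
    points within radius, sorted by distance and padded with -1."""
    out = np.full((len(queries), k_max), -1, dtype=np.int64)
    for i, q in enumerate(queries):
        d = np.linalg.norm(points - q, axis=1)
        idx = np.where(d <= radius)[0]
        idx = idx[np.argsort(d[idx])][:k_max]
        out[i, :len(idx)] = idx
    return out

rng = np.random.default_rng(0)
xyz = rng.random((100, 3)).astype("float32")
ref = brute_force_radius_neighbors(xyz, xyz, 0.1, 10)

# Sanity check: each point is its own nearest neighbor at distance 0
assert (ref[:, 0] == np.arange(100)).all()
```

A test would then assert that the set of neighbors returned by `radius_search` matches `ref` row by row (ordering within equal distances may legitimately differ between implementations).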

rjanvier commented 7 months ago

Hi Damien, thank you very much for your kind and constructive feedback.

I will only answer on the KNN/Radius search thing for now as other points are maybe more consensual.

First, your KNN example is biased in at least two or three aspects:

That being said, I agree that FRNN is faster in 100% of cases (maybe ~10-20x), but that's not my main point. What I tried to show in this specific thread is a transfer cost (which I haven't investigated yet, I must admit) that penalizes the whole FRNN process in SPT and makes CPU KNN competitive overall. This transfer cost is not directly related to the KNN search itself, but to the following AddTo transform, which is much faster when the data is already on the CPU.

Maybe this transfer cost could be fixed, and maybe it is related to the specific configuration of the computer on which this test was run (I will try to reproduce it on another computer soon).

But don’t get me wrong: I’m mainly interested in a fast and simple CPU KNN alternative because one of the biggest strengths of SPT vs. other DL methods is its low resource consumption, which makes it a very good candidate for a CPU-only inference process (and even CPU-only training in some very specific cases).

For example, people may have a lot more RAM than VRAM. Of course SPT can work in a tiling fashion, but running the SPT data preparation on CPU would allow working on bigger clouds on some computers. Another example: torch with CUDA weighs around 2+ GB of dependencies, while torch without it is more like 100-300 MB. A CPU KNN lowers the install barrier of SPT and makes it easier to embed (for example as a CloudCompare plugin). It also helps because, as you said, FRNN can be difficult to install (for a GPU alternative, maybe look at the KNN in torch geometric), not to mention that a lot of people do not have access to an Nvidia GPU.

Combine this CPU KNN with some fast Delaunay bindings (yet to be released) and you could have a data preparation for SPT that is almost on par between CPU and GPU. As a practitioner, I think this is very interesting, and it could have a slight but perhaps non-negligible impact on the dissemination of your method.

rjanvier commented 7 months ago

Thanks again for this review; I have already added the test you requested and I will correct the CSR sample.

About pgeof2 naming

I agree to go back to the pgeof name, but then we could no longer benchmark and test against previous versions of pgeof, because we can’t have two versions of the same package on the same PYTHONPATH. It would imply launching the tests in two different environments, which wouldn’t be convenient for comparing the outcome of both versions. On the other hand, I think we have no regression vs. the original version of pgeof, so those tests are no longer mandatory. I’m open to both solutions, please tell me which you prefer: the pgeof name with no regression tests, or pgeof2 (or something else) with regression tests. The version number is already bumped (0.1.0).

About the documentation

I would like to create a RTD site, but I don’t think I have the time for it either. This PR is part of a larger puzzle on our side, and I think we could improve the API a bit in the long run (after some usage), so I don’t want to commit too much to the documentation for now. Also, I’m waiting for nanobind version 2 for better stub generation. I will definitely end up setting up the RTD, but maybe in a few months, if that’s OK.

For now, I think that detailing the API extensively in the Readme would clutter it a lot. Of course, we can list the different functions in the Readme with a small description of what they do. But for extensive documentation, help(pgeof2) already works (modulo some typos, I assume :)), and for usage examples we can redirect people to the tests/benchmarks files (this is already done in the PR). What do you think?

drprojects commented 7 months ago

Hi @rjanvier, thanks for the good work and fast reply! Below are my replies to your successive points.

About KNN benchmarking

I totally agree that a pure-CPU preprocessing would be great, even if at the cost of a small increase in preprocessing time. And even better, a pure-CPU SPT would simply be amazing ! I am happy to see that this may be on your personal roadmap, judging from your expertise with CPU code optimization.

Still, since K-NN and radius-NN are often bottlenecks in point cloud processing pipelines (along with voxelization and sampling), I am very curious to see which library is faster. I will try to benchmark nanoflann vs FRNN on my end at least. But this should not stop this PR, I think it is great to offer CPU-based neighbor search in pgeof with nanoflann.

About pgeof2 naming

I am fine with keeping pgeof and not testing for regressions against the previous version. I did not see any dramatic changes to the core algorithm, and I trust that you kept the PCA feature computation similar, so we should be good.

About the documentation

Agreed. Let's keep it to a minimum in the README. But still, it would be good to have:

Also, I take back what I said about documenting CSR conversion from scikit-learn neighbor searches. Let's remove those from the README and favor documenting your nanoflann-based utilities instead. Several reasons for that:

Does that sound reasonable to you?

rjanvier commented 7 months ago

Thanks Damien, seems all ok on my side. I will try to make those changes as soon as possible.

drprojects commented 7 months ago

Awesome, I appreciate this very efficient collaboration !

rjanvier commented 7 months ago

Hi @drprojects, I think this is ready for another round of review.

drprojects commented 7 months ago

Looks good, thanks for these last changes and congrats for this new version !

rjanvier commented 7 months ago

Thanks. We have to adapt SPT according to the API changes. Should I make a PR?

drprojects commented 7 months ago

Should I make a PR?

I would greatly appreciate it.