Closed WangZP10969 closed 1 year ago
@mvieth Thoughts?
Interesting idea! So your algorithm would give the same results as the current euclidean clustering method in PCL, but faster in certain conditions? One thing I don't understand: In step 2.3, why do you search for sampling points within 3 times the threshold distance?
I think there are basically three things to do:
extractEuclideanClusters
function. Then you would create a pull request with the new implementationBased on the results, we would decide whether the algorithm would fit into PCL, and in which way. Does this sound good to you?
@mvieth Thank you very much for your reply.
This method is inspired by the pyramid acceleration process, which can reduce the amount of calculation by clustering the sample points of the point cloud to represent all points.
However, the closest distance of the sampling point pair cannot be directly described by the distance of the sampling point to itself, because the neighborhood points around the sampling point are distributed in the sphere of the threshold distance radius, and the closest distance is the distance between the closest points of the two spheres.
Extracting the sampling point pair less than 3 times the threshold distance is to ensure that the nearest distance of the neighboring points of the two sampling points may be less than the threshold distance, because the sampling point pair greater than 3 times the threshold distance, the closest distance of all neighboring points must be greater than the threshold distance.
The sampling algorithm in the first step ensures that the distance between any two sampling points will not be less than the threshold, because the neighboring points of the sampling points will be marked first, and the subsequent sampling points cannot be generated from these neighboring points.
The blue point is the selected sampling point, a spherical neighborhood with a threshold radius, the red point in the neighborhood has been marked and cannot be used as a new sampling point candidate, so new sampling points can only be generated from green.
In the sphere neighborhood of two sampling points, the closest distance between the center of the sphere will not be less than the threshold distance.
and in the most extreme case, the center of the sphere is 3 times the radius, and the distance between neighboring points can be the threshold distance.
I am not sure about the performance of this method in PCL test cases, but I used the same method of defining functions in other algorithm platforms for comparison, and got the same results in several test data. I think this method is equivalent to Euclidean clustering in principle. The methods of other platforms are very fast, especially on large neighborhoods, I'm not sure how it is implemented, but certainly not all points have been searched for neighborhood.
I think this is a different implementation of European clustering, similar to the KdTree and OcTree included in Search. Perhaps we can add different functional interfaces to the European clustering class to call the difference method, such as sampling European clustering, or approximate sampling Euclidean clustering (do not calculate the closest distance, directly include the neighborhood of sampling points within 3 times the distance).
I used PCL's KdTree nearest neighbor search to initially complete this function. The speed of sampling points and clustering sampling points will become faster as the threshold distance increases, but the nearest distance to the neighboring points of the sampling point pairs takes a lot of time. , This still makes the cluster segmentation under a larger threshold distance very time-consuming.
If the closest distance of the sampling point pair is not considered, then this method will become an approximate Euclidean clustering segmentation, which can be used when some dense point clouds do have a large closest point distance.
I will try to use KdTree to speed up the search for the current sampling point, so that the worst time complexity can be further optimized The total number of points is N, the number of neighborhood points of the sampling point is K, and the number of sampling points in the neighborhood of 3 times the distance of the sampling point is T (usually 12-18) Linear distance calculation: the number of sampling points (N/K) the number of sampling points in the neighborhood (K) the number of sampling points in the neighborhood 3 times the distance of the sampling point (T) the number of sampling points in the neighborhood (K) = N/KK TK=NKT KDTree closest distance calculation: the number of sampling points (N/K) the number of search points in the neighborhood of the sampling point (log(K)) the number of sampling points at 3 times the distance of the sampling point (T) the number of sampling points in the neighborhood (K) =N/Klog(K)TK=Nlog(K)T
I tried to optimize the closest point search, and further reduced some candidate points through the projection distance.
I try to project all the points on the line of the two sampling points, and then find the point with the closest projection distance
Calculate the spatial distance of the pair of points, and then expand to this spatial distance on both sides of the projection distance to include all candidate points
For the test, I used two kinds of point clouds that I often deal with. One is that the plane surface has a lot of bumps, and the other is a few objects that are relatively close. The speed improvement effect is as follows. In the case of large neighborhoods, There is a certain reduction in running time
Time1 is the new method, and Time2 is the traditional method, Other platforms as a reference.
The test results show that there is a certain improvement in the large neighborhood of the traditional method, but there is still a big gap with other platform methods. I am trying to further optimize
The attachment is the test code and test data, any suggestions are very welcome ConnectionDistance3d.ply.txt ConnectionMap.ply.txt Test.cpp.txt
I am using PCL’s Euclidean clustering method, which is usually used to segment several dense point clouds with larger distances, but the PCL method takes longer to run on larger neighborhoods
I am trying a new method, first sampling the point cloud by a threshold distance, and then performing Euclidean clustering for the sampling points. The clustering condition first satisfies the case where the distance between the sampling points is less than 3 times the distance threshold, that is, the distance between the two sampling points is the farthest point in the neighborhood, and the farthest points distance is just the threshold. Then, the nearest distance is calculated for the neighboring points of these sampling points. If the nearest distance is less than the threshold, the sampling point is added to the cluster.
I tested it on a scripting language platform. The average speed on the large neighborhood is faster than the traditional method, but the coefficient on the small neighborhood is higher than the traditional method. I am trying to port to the PCL-based C++ algorithm. I hope to get some suggestions, such as Whether this method is valuable, and an efficient solution for finding the closest point of two point clouds.
The goal of Euclidean clustering is a connected domain with distance, that is, when the distance between two points is less than the threshold, they are considered to be in the same cluster
The method adopted by PCL is,
This scheme is very simple and robust, but the time complexity is, the neighborhood calculation of all points Assuming that the number of points is N and the neighboring points are K, it is Nlog(N)K For a flat point cloud, the growth of K is the neighborhood radius R^2, that is, as the threshold increases, the time is a square-level increase, so the Euclidean clustering time under a large neighborhood is very long
In this method, assume that the number of points is N, the neighborhood point is K, and the sampling point neighborhood is T In the sampling process, the time complexity is N/Klog(N)K = Nlog(N) Clustering process, the sampling points are clustered as Klog(K)T, and the worst case of judging the nearest distance of the neighboring points of the sampling point is N/KKKT = NKT, which is N/K samples Point, each sampling point has been calculated KKT times to get the closest distance, but generally speaking, the result can be obtained at the KT level, that is, the usual complexity is N/K KT=NT
PCL method complexity, Nlog(N)K Sampling point method complexity, Nlog(N)+Klog(K)*T+(NKT~NT) T is the sampling point within 3 times the distance of the sampling point, which is approximately constant, generally much smaller than K Therefore, when K is large, the sampling point method has certain advantages