Open fghoussen opened 6 days ago
Forgot to mention, at my side, on several different point clouds as inputs, even with this fix, crashes occurs around here: https://github.com/fghoussen/pcl/blob/d257a89130386c3356fba8ba0f04e8e2089704f3/registration/src/correspondence_rejection_median_distance.cpp#L59
Not sure how this is safe when distance are max'ed by bad correspondence score
CorrespondenceRejectorMedianDistance
is successfully tested here: https://github.com/PointCloudLibrary/pcl/blob/master/test/registration/test_registration.cpp#L262 source, target, etc are set within ICP (for the source this is strictly necessary since it changes from iteration to iteration).
In any cases, the attached patch seem to make things better (less often crash) but does not solve the problem (crash occurs on a regular basis).
If you say that it does not really solve the problem, I would rather investigate how the crash actually happens, and then create a fix based on that. Can you provide a full program that I can use to easily reproduce the crash? Including source and target cloud, e.g. from the pcl test directory,
With my data, following examples you mentioned, I get crash on a regular basis. This patch (with added commits) seems to make thing a lot-lot more stable (does not crash anymore after 15+ successive try): this is only checking bounds before accessing elements in lists.
Can you provide a full program that I can use to easily reproduce the crash? Including source and target cloud
Unfortunately not that easy: I can't provide data (not mine - need to check if I can share them) and they are big files (not great to commit in repo IMO), the original code can not be compiled in full debug (= difficult to analyse crash as you do not get all symbols)
With my data, following examples you mentioned, I get crash on a regular basis. This patch (with added commits) seems to make thing a lot-lot more stable (does not crash anymore after 15+ successive try): this is only checking bounds before accessing elements in lists.
I don't think it is a good idea to merge a change if the bug is not really understood. This seems to treat the symptoms, but potentially does not solve the bug itself. In the worst case, these checks could hide a more severe bug somewhere else in the code.
Can you provide a full program that I can use to easily reproduce the crash? Including source and target cloud
Unfortunately not that easy: I can't provide data (not mine - need to check if I can share them) and they are big files (not great to commit in repo IMO), the original code can not be compiled in full debug (= difficult to analyse crash as you do not get all symbols)
In case you can't share the data, you could try to use a point cloud from the pcl test directory ( https://github.com/PointCloudLibrary/pcl/tree/master/test ) or from the pcl data repo ( https://github.com/PointCloudLibrary/data ). How many points do the point clouds have that you use? I am asking because if it is a certain number (several millions or even billions), this might point to a specific kind of bug.
Since CorrespondenceRejectorMedianDistance
seems to be used successfully in test_registration.cpp, I think the main question is: what is causing the bug for you? Is it the data you are using? The parameters/settings you are using? Something else? I can only help investigating the bug if you help me reproduce it. If you could share a more complete program to reproduce the bug (starting from loading the point clouds from files to calling align, including all parameters you set), that would already be very helpful.
I'll see what I can do... Look closely at the patch (hidding whitespaces): this is only bound checking
I don't think it is a good idea to merge a change if the bug is not really understood.
I understand your point of view. But the problem is now understood: after bound checking, the code is stable.
I will likely not be able to go deeper in the analysis: gdb reported index = 26879
and (*input_).size() = 15698
on my data... And I have no idea why?!.. But a least, with this patch, the code doesn't crash anymore
In case you can't share the data
Not sure. Need to check. This will be time consuming too...
what is causing the bug for you? Is it the data you are using?
The problem seems to lie in the data (= point cloud). Reducing point cloud to minimal setting is impossible (I do not know what trigger the problem in the PCL code) and modifying point cloud will be time consuming too (possibly making the problem disappear).
you could share a more complete program to reproduce the bug
Once again, not sure I'll have time for this: I already spent more time than expected. I'll see what I can do...
Adding a unit test with data from https://github.com/PointCloudLibrary/pcl/tree/master/test may or may not reproduce the problem. Could you add one on top of this PR? If not, I'll see if I have time... But I can't guarantee
The problem is maybe silent in the current test suite: how to run the test suite?
cmake -DBUILD_benchmarks=ON -DBUILD_examples=ON -DBUILD_simulation=ON -DBUILD_global_tests=ON ..; make all test
does not trigger the test suite
The problem is maybe silent in the current test suite: how to run the test suite?
cmake -DBUILD_benchmarks=ON -DBUILD_examples=ON -DBUILD_simulation=ON -DBUILD_global_tests=ON ..; make all test
does not trigger the test suite
cmake --build . -- tests
or make tests
The problem is maybe silent in the current test suite
I've just tested: the test suite doesn't see the problem which confirm it seems related to the point clouds used as input
[BUG FIX] getCorrespondenceScore must not go out-of-bounds.
I have a quite systematic crash when using
pcl::registration::CorrespondenceRejectorMedianDistance
orpcl::registration::CorrespondenceRejectorSurfaceNormal
.I use it like so:
If any mistake (explaining crashes ?) is obvious, I'd be glad to know. In any cases, the attached patch seem to make things better (less often crash) but does not solve the problem (crash occurs on a regular basis).
The patch is basically "each time I access a member, check the index is consistent with the member size, or, return default value (here maximum value to say the score is as bad as possible AFAIU the code)"
I pushed this to let the PCL community decide if this patch is worth being merged or not.