MIT-SPARK / TEASER-plusplus

A fast and robust point cloud registration library
MIT License

[QUESTION] Why does the teaser-plusplus FPFH example yield this kind of results? #147

Closed pahoffmann closed 1 month ago

pahoffmann commented 1 year ago

Some background to my question: I am currently investigating loop closures in a graph-based SLAM system and need to do scan matching between two laser scans. I want to use teaser-plusplus to produce a good initial estimate between the scans and then apply a post-registration with ICP, as suggested.

To get the hang of it, I took a look at the teaser_cpp_fpfh example, which is what this question is about. When building and executing the example, I get the following output for the registration result (note that I added a method to make the expected and estimated transforms more human-readable):

=====================================
TEASER++ Results

Expected rotation:
0.996927 0.0668736 -0.0406664
-0.066129 0.997618 0.0194009
0.0418676 -0.0166518 0.998978
Pos: -0.12 | -0.04 | 0.11
Angles: 3.12 | -3.10 | 3.07
Estimated rotation:
0.76 -0.08 -0.64
0.28 -0.86 0.44
-0.59 -0.51 -0.63
Error (deg): 2.62
Pos: -0.10 | 0.11 | 0.14
Angles: 0.61 | -2.44 | -3.04

Expected translation:
-0.12 -0.04 0.11
Estimated translation:
-0.10 0.11 0.14
Error (m): 0.15

Number of correspondences: 1889
Number of outliers: 1700
Time taken (s): 0.00

As you can see, teaser-plusplus does not even converge on the given example. It terminates early with the message "GNC-TLS terminated because maximum residual at initialization is very small." I am not sure whether this is expected behavior, but the algorithm still proceeds to calculate the final transformation, which is also completely wrong: the cloud gets rotated by an average of 3 degrees, and the resulting rotation error is also almost 3 degrees, so the error is almost 100%. I am pretty sure that this is not the expected behavior for this test case and that it might need some looking into. If I am completely wrong, please enlighten me! :)
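The reported rotation error can be sanity-checked directly from the two printed matrices. The sketch below assumes the matrices are printed row by row and that the error is the geodesic angle between them, acos((trace(R_expected^T R_estimated) - 1) / 2); the function and variable names are illustrative and not taken from the TEASER++ example.

```python
import math


def rotation_angle_error(r_expected, r_estimated):
    """Geodesic angle in radians between two 3x3 rotation matrices (nested lists)."""
    # trace(R_expected^T @ R_estimated) equals the sum of element-wise products.
    trace = sum(
        r_expected[i][j] * r_estimated[i][j] for i in range(3) for j in range(3)
    )
    # Clamp so rounding noise cannot push the argument outside acos's domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


# Values copied from the output above (assumed row-major).
R_EXPECTED = [
    [0.996927, 0.0668736, -0.0406664],
    [-0.066129, 0.997618, 0.0194009],
    [0.0418676, -0.0166518, 0.998978],
]
R_ESTIMATED = [
    [0.76, -0.08, -0.64],
    [0.28, -0.86, 0.44],
    [-0.59, -0.51, -0.63],
]

error_rad = rotation_angle_error(R_EXPECTED, R_ESTIMATED)
error_deg = math.degrees(error_rad)
print(f"rotation error: {error_rad:.2f} rad = {error_deg:.1f} deg")
```

Under these assumptions the angle comes out at roughly 2.62 when expressed in radians, which is about 150 degrees. That suggests the figure printed next to "Error (deg)" may actually be in radians, in which case the mismatch is far larger than 3 degrees. This is an inference from the numbers in the post, not something confirmed in the thread.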

Edit: I just realized that the number of correspondences displayed in the output (1889) is not actually the number of correspondences, but the size of the input cloud :smile: The actual number of correspondences found by the algorithm is 3 :open_mouth: By the actual number of correspondences I mean the output of the matcher.calculateCorrespondences() function, which takes the source and target clouds and their respective descriptors as its input.

Cheers

jingnanshi commented 1 year ago

@pahoffmann Which branch did you run the code on?

pahoffmann commented 1 year ago

On the master branch! Should I use develop instead?

jingnanshi commented 1 year ago

Please try the develop branch. I remember updating a few things regarding FPFH on develop a while ago. I'm also looking into this right now. Also, please run ctest to run all unit tests and make sure everything passes (on develop). Thanks!

pahoffmann commented 1 year ago

On it!

pahoffmann commented 1 year ago

@jingnanshi I just completely rebuilt based on develop, and a lot of unit tests actually fail:

The following tests FAILED:
15 - RegistrationTest.LargeModel (SEGFAULT)
16 - RegistrationTest.LargeModelSingleThreaded (SEGFAULT)
17 - RegistrationTest.SolveForScale (SEGFAULT)
18 - RegistrationTest.SolveForRotation (SEGFAULT)
19 - RegistrationTest.SolveRegistrationProblemDecoupled (Child aborted)
20 - RegistrationTest.OutlierDetection (SEGFAULT)
21 - RegistrationTest.NoMaxClique (Child aborted)
22 - RegistrationTest.CliqueFinderModes (Child aborted)
46 - RegistrationBenchmark.Benchmark1 (Child aborted)
47 - RegistrationBenchmark.Benchmark2 (Child aborted)
48 - RegistrationBenchmark.Benchmark3 (Child aborted)
49 - RegistrationBenchmark.Benchmark4 (Child aborted)
50 - RegistrationBenchmark.Benchmark5 (Child aborted)
51 - RegistrationBenchmark.Benchmark6 (SEGFAULT)

Mostly due to segfaults.
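To see how the failures split by kind, the list can be tallied with a few lines of Python. This is only a sketch: the regular expression assumes each line looks like the ones quoted in this comment ("number - test name (reason)"), and the helper name is made up for illustration.

```python
import re
from collections import Counter

# Failure lines copied from the ctest summary quoted in this comment.
CTEST_FAILURES = """\
15 - RegistrationTest.LargeModel (SEGFAULT)
16 - RegistrationTest.LargeModelSingleThreaded (SEGFAULT)
17 - RegistrationTest.SolveForScale (SEGFAULT)
18 - RegistrationTest.SolveForRotation (SEGFAULT)
19 - RegistrationTest.SolveRegistrationProblemDecoupled (Child aborted)
20 - RegistrationTest.OutlierDetection (SEGFAULT)
21 - RegistrationTest.NoMaxClique (Child aborted)
22 - RegistrationTest.CliqueFinderModes (Child aborted)
46 - RegistrationBenchmark.Benchmark1 (Child aborted)
47 - RegistrationBenchmark.Benchmark2 (Child aborted)
48 - RegistrationBenchmark.Benchmark3 (Child aborted)
49 - RegistrationBenchmark.Benchmark4 (Child aborted)
50 - RegistrationBenchmark.Benchmark5 (Child aborted)
51 - RegistrationBenchmark.Benchmark6 (SEGFAULT)
"""


def tally_failures(report):
    """Count failed tests per failure reason, e.g. 'SEGFAULT' or 'Child aborted'."""
    # Assumed line shape: "<number> - <test name> (<reason>)".
    pattern = re.compile(r"^\s*\d+\s+-\s+\S+\s+\((.+)\)\s*$")
    reasons = Counter()
    for line in report.splitlines():
        match = pattern.match(line)
        if match:
            reasons[match.group(1)] += 1
    return reasons


failure_counts = tally_failures(CTEST_FAILURES)
for reason, count in failure_counts.most_common():
    print(f"{reason}: {count}")
```

On the list above this counts 8 tests ending in "Child aborted" and 6 ending in "SEGFAULT", so aborts are slightly more common than segfaults, and every failing test belongs to the RegistrationTest or RegistrationBenchmark groups.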

I also executed the FPFH example, which yields worse results than on the master branch:

Expected rotation:
0.996927 0.0668736 -0.0406664
-0.066129 0.997618 0.0194009
0.0418676 -0.0166518 0.998978
Estimated rotation:
0.207406 -0.544615 0.812636
-0.197371 -0.836913 -0.510511
0.958138 -0.0545075 -0.281072
Error (deg): 2.88443

Expected translation:
-0.115577 -0.0387705 0.114875
Estimated translation:
-0.104336 0.0740991 0.20408
Error (m): 0.144304
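The translation error in this output can be checked in the same spirit. The sketch below assumes "Error (m)" is the Euclidean distance between the expected and estimated translation vectors; the variable names are illustrative.

```python
import math

# Values copied from the develop-branch output above.
T_EXPECTED = (-0.115577, -0.0387705, 0.114875)
T_ESTIMATED = (-0.104336, 0.0740991, 0.20408)

# Euclidean distance between the two translation vectors (math.dist needs Python 3.8+).
translation_error = math.dist(T_EXPECTED, T_ESTIMATED)
print(f"translation error: {translation_error:.6f} m")
```

With these inputs the distance agrees with the reported 0.144304 to within rounding of the printed values, so the translation figure at least appears to be computed as a plain Euclidean norm.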

pahoffmann commented 1 year ago

On master, all the tests pass without any problem.

jingnanshi commented 1 year ago

> On master, all the tests pass without any problem.

Can you check which commit of the PMC library CMake downloaded? Also, try running ctest with 'OMP_NUM_THREADS=12 ctest'.

pahoffmann commented 1 year ago

The PMC library (pmc-src) is up to date with its master branch. Running the tests with the above command yields the same errors.

jingnanshi commented 1 year ago

Update on this issue: I can reproduce the problem with the FPFH example. I suspect it was introduced when I changed the FPFH class to use PCL's multi-threaded implementation. The teaser_cpp_ply example works fine, so the problem is limited to the feature matcher.

jingnanshi commented 1 year ago

I cannot seem to reproduce the segfaults in the unit tests, however.

pahoffmann commented 1 year ago

Thank you so much for looking into this!

I am gonna look into the feature matching today.

pahoffmann commented 1 year ago

Those segfaults are giving me a headache. What are the key differences between master and develop? Is it just more parallelized overall, or are there actually extensive changes in the algorithms?

jingnanshi commented 1 year ago

> Those segfaults are giving me a headache. What are the key differences between master and develop? Is it just more parallelized overall, or are there actually extensive changes in the algorithms?

There are no changes in the algorithms that affect parallelism. There was an issue in the PMC library that caused segfaults when using more than 12 threads; however, I fixed it in my fork two months ago (which is why I asked you to check the PMC repo's commit).

My suspicion is that something in PMC is causing these segfaults, as the unit tests that failed all seem to use the maximum clique finder. Valgrind would be a good way to isolate the code that causes them.

pahoffmann commented 1 year ago

I actually reinstalled the whole thing today and ran the tests on develop. Now all of the unit tests pass without any issues.

jingnanshi commented 1 year ago

@pahoffmann If you are still working on this, would you mind trying this branch's FPFH example: https://github.com/MIT-SPARK/TEASER-plusplus/tree/bugfix/examples ? Thanks!

TongxingJin commented 3 months ago

> @pahoffmann If you are still working on this, would you mind trying this branch's FPFH example: https://github.com/MIT-SPARK/TEASER-plusplus/tree/bugfix/examples ? Thanks!

I solved the segmentation fault by keeping BUILD_WITH_MARCH_NATIVE set to false.