MIT-SPARK / Kimera-RPGO

Robust Pose Graph Optimization
BSD 2-Clause "Simplified" License
475 stars 133 forks source link

RobustPGO Teardown #22

Closed marcusabate closed 5 years ago

marcusabate commented 5 years ago

While running a unit test from Spark VIO I have a strange issue. It could be a problem in gtests, or in VIO, or here. But the issue popped up with RobustPGO so I'll put it here for now.

When I run the full test executable everything is fine. But if I just run a series of tests that have the PGO stuff (or even just the loop closure detector tests) then I a segfault. Running with gdb shows this to be the cause:

Thread 1 "testSparkVio" received signal SIGSEGV, Segmentation fault.
0x0000555555a505ec in std::default_delete<RobustPGO::OutlierRemoval>::operator() (this=0x5555563697c0, 
    __ptr=0x555556000000)
    at /usr/include/c++/7/bits/unique_ptr.h:78
78      delete __ptr;

So it looks like there is some problem with the teardown of RobustPGO class members. Again this could be a problem with the way I set them up in VIO or even just the test suite. Looking into it today, but @yunzc if you have any thoughts they'd be appreciated.

marcusabate commented 5 years ago

As an update, it seems to be an issue with RobustPGO somewhere. The problem is only related to the std::unique_ptr<OutlierRemoval> outlier_removal_; class member of RobustSolver. This is initialized based on parameter selection in the RobustSolver constructor. One of the options is NONE, which calls outlier_removal_ = nullptr;. This concerns me a bit but looking through RobustPGO I see that every time outlier_removal_ is used it is first checked so that cases in which nullptr is the selection aren't used.

In VIO I have a class member of type RobustPGO:

std::unique_ptr<RobustPGO::RobustSolver> pgo_;

Which is initialized in the constructor like so:

RobustPGO::RobustSolverParams pgo_params;
  pgo_params.setPcmSimple3DParams(lcd_params_.pgo_trans_threshold_,
      lcd_params_.pgo_rot_threshold_, RobustPGO::Verbosity::VERBOSE);
pgo_ = VIO::make_unique<RobustPGO::RobustSolver>(pgo_params);

Inside of testLoopClosureDetector.cpp I simply have the class member:

std::unique_ptr<LoopClosureDetector> lcd_detector_;

Which is initialized in the fixture like so:

LoopClosureDetectorParams params;
params.parseYAML(lcd_FLAGS_test_data_path_+"/testLCDParameters.yaml");

lcd_detector_ = VIO::make_unique<LoopClosureDetector>(params, false);

Not a lot to go wrong there.

I've added a destructor to GenericSolver.h:

virtual ~GenericSolver() = default;

And a destructor to RobustSolver.h:

~RobustSolver() = default;

And I've added the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro as a public member of the pcm.h class since it has a bunch of gtsam::Matrix members which are typedefs of Eigen::Matrix.

None of these changes have solved the original problem, which still pops up when I run LoopClosureDetector tests only but not when I run the entire Spark VIO testing suite.

@yunzc I'm pretty stumped here so any thoughts you have on why unique_ptrs in RobustPGO would be misbehaving during tear down at the end of runtime would be appreciated.

yunzc commented 5 years ago

Fixed by getting rid of c++14 dependency and defining make_unique in RobustPGO

marcusabate commented 5 years ago

Reopening because the problem is not yet solved it seems. I still get the same segfault in VIO sometimes.

Thread 1 "testSparkVio" received signal SIGSEGV, Segmentation fault.
std::unique_ptr<RobustPGO::OutlierRemoval, std::default_delete<RobustPGO::OutlierRemoval> >::~unique_ptr (
    this=0x55556515d200, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
268       get_deleter()(__ptr);
(gdb) backtrace
#0  0x00007ffff792c1ad in std::unique_ptr<RobustPGO::OutlierRemoval, std::default_delete<RobustPGO::OutlierRemoval> >::~unique_ptr() (this=0x55556515d200, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#1  0x00007ffff792c1ad in RobustPGO::RobustSolver::~RobustSolver() (this=0x55556515d170, __in_chrg=<optimized out>)
    at /home/marcus/code/RobustPGO/RobustPGO/RobustSolver.h:40
#2  0x00007ffff792c1ad in std::default_delete<RobustPGO::RobustSolver>::operator()(RobustPGO::RobustSolver*) const (this=<optimized out>, __ptr=0x55556515d170) at /usr/include/c++/7/bits/unique_ptr.h:78
#3  0x00007ffff792c1ad in std::unique_ptr<RobustPGO::RobustSolver, std::default_delete<RobustPGO::RobustSolver> >::~unique_ptr() (this=0x5555559bd0f0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#4  0x00007ffff792c1ad in VIO::LoopClosureDetector::~LoopClosureDetector() (this=0x5555559bcee0, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/src/LoopClosureDetector.cpp:107
#5  0x00007ffff792c4f9 in VIO::LoopClosureDetector::~LoopClosureDetector() (this=0x5555559bcee0, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/src/LoopClosureDetector.cpp:109
#6  0x00005555556206e2 in std::default_delete<VIO::LoopClosureDetector>::operator()(VIO::LoopClosureDetector*) const (this=<optimized out>, __ptr=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#7  0x00005555556206e2 in std::unique_ptr<VIO::LoopClosureDetector, std::default_delete<VIO::LoopClosureDetector> >::~unique_ptr() (this=0x5555559bc3c0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#8  0x00005555556206e2 in LCDFixture::~LCDFixture() (this=0x5555559bc390, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/tests/testLoopClosureDetector.cpp:41

Basically this shows that it is the fault of the OutlierRemoval class. Looking at that, I noticed the Base class has no constructor. The derived class Pcm calls this nonexistent constructor. It also does not initialize all data members in its own constructor. I think I'll start with these things if @yunzc you do not see any reason why they should stay that way or where else the segfault could be coming from.