MIT-SPARK / Kimera-VIO

Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation.
BSD 2-Clause "Simplified" License

Loop closure detector related segmentation fault #13

Closed DreamWaterFound closed 4 years ago

DreamWaterFound commented 4 years ago

Description:

When I run "stereoVIOEuroc.bash", everything is OK at first. After a while, a segmentation fault occurs while the VIO system is processing frames from the sequences "Euroc/V1_01_easy" and "Euroc/V2_01_easy". (Other sequences have not been tested.)

I thought it was related to the EuRoC datasets I used, but the problem remained after I switched to your version of the EuRoC datasets. Then I checked the core file and found that the problem is related to the loop closure detector thread (see "Key messages in core file" for details).

I then changed the parameters --use_lcd and --parallel_run (i.e., USE_LCD and PARALLEL_RUN in "stereoVIOEuroc.bash"), and the results differ:

--use_lcd   --parallel_run   vio-system result
0           0                Normal termination.
0           1                Segmentation fault.
1           0                Normal Termination but errors received during running.*
1           1                Normal Termination but errors received during running.*

* Error message: "LoopClosureDetector: No output callback registered. Either register a callback or disable LCD with flag use_lcd=false."

I have no idea what causes this, and it would take me a long time to read and analyze the source code. Could you please help me solve this problem? I would appreciate it. O(≧▽≦)O

Command:

${MY_CODE_DIR}/Kimera/comments/Kimera-VIO/scripts$ ./stereoVIOEuroc.bash

Parameters in stereoVIOEuroc.bash:

DATASET_PATH="${My_Datasets_DIR}/Kimera_EuRoc/V1_01_easy"

# Specify: 1 to use Regular VIO, 0 to use Normal VIO with default parameters.
USE_REGULAR_VIO=0

# Specify: 0 to run on EuRoC data, 1 to run on Kitti
DATASET_TYPE=0

# Specify: 1 to run pipeline in parallel mode, 0 to run sequentially.
PARALLEL_RUN=1

# Specify: 1 to enable the LoopClosureDetector, 0 to not.
USE_LCD=0

# Specify: 1 to enable logging of output files, 0 to not.
LOG_OUTPUT=1

Console output:

...
------------------- Processing frame k = 1999--------------------
VIO: adding keyframe 487 at timestamp:1.40372e+09 (nsec).
Current IMU Preintegration frequency: 192308 Hz. (52 us).
VIO: adding between 
[0.0542361722, 0.102485067, 0.0950404991]';R:
[
            0.999910075    -0.013402216 -0.000472217364;
           0.0133991652     0.999892464  -0.00596017296;
         0.000552046109   0.00595330967     0.999982127
  ]
Current Stereo FrontEnd frequency: 166.667 Hz. (6 ms).
Current Visualizer frequency: 27.027 Hz. (37 ms).
Hessian stats: ===========
rows: 390
nrElementsInMatrix_: 152100
nrZeroElementsInMatrix_: 120294
Backend: Update IMU Bias.
Current Backend frequency: 15.3846 Hz. (65 ms).
createMesh2D - error, keypoint out of image frame.
createMesh2D - error, keypoint out of image frame.
createMesh2D - error, keypoint out of image frame.
Current Mesher frequency: 90.9091 Hz. (11 ms).
Current Visualizer frequency: 35.7143 Hz. (28 ms).
./stereoVIOEuroc.bash: line 123: 23505 Segmentation fault      (core dumped) ../build/stereoVIOEuroc --flagfile="../params/flags/stereoVIOEuroc.flags" --flagfile="../params/flags/Mesher.flags" --flagfile="../params/flags/VioBackEnd.flags" --flagfile="../params/flags/RegularVioBackEnd.flags" --flagfile="../params/flags/Visualizer3D.flags" --flagfile="../params/flags/EthParser.flags" --logtostderr=1 --colorlogtostderr=1 --log_prefix=0 --dataset_path="$DATASET_PATH" --vio_params_path="$VIO_PARAMS_PATH" --initial_k=50 --final_k=2000 --tracker_params_path="$TRACKER_PARAMS_PATH" --lcd_params_path="$LCD_PARAMS_PATH" --vocabulary_path="../vocabulary/ORBvoc.yml" --use_lcd="$USE_LCD" --v=0 --vmodule=VioBackEnd=0,RegularVioBackEnd=0,Mesher=0,StereoVisionFrontEnd=0 --backend_type="$BACKEND_TYPE" --parallel_run="$PARALLEL_RUN" --dataset_type="$DATASET_TYPE" --log_output="$LOG_OUTPUT" --output_path="../output_logs/"

Key messages in core file:

...
[New LWP 22455]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `../build/stereoVIOEuroc --flagfile=../params/flags/stereoVIOEuroc.flags --flagf'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f044e642b75 in std::__atomic_base<bool>::load (__m=std::memory_order_seq_cst, this=<optimized out>) at /usr/include/c++/7/bits/atomic_base.h:396
396             return __atomic_load_n(&_M_i, __m);
[Current thread is 1 (Thread 0x7f03ebfff700 (LWP 22447))]
(gdb) up
#1  std::atomic<bool>::operator bool (this=<optimized out>) at /usr/include/c++/7/atomic:86
86          { return _M_base.load(); }
(gdb) up
#2  VIO::LoopClosureDetector::isWorking (this=<optimized out>) at /home/guoqing/SLAM/Kimera/comments/Kimera-VIO/include/kimera-vio/loopclosure/LoopClosureDetector.h:153
153       inline bool isWorking() const { return is_thread_working_; }
(gdb) up
#3  VIO::Pipeline::shutdownWhenFinished (this=0x7fffc13b36e0) at /home/guoqing/SLAM/Kimera/comments/Kimera-VIO/src/pipeline/Pipeline.cpp:561
561                 !loop_closure_detector_->isWorking() &&
(gdb) 
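One possible reading of the frames above, offered as an assumption and not as a confirmed diagnosis: frame #3 calls isWorking() through loop_closure_detector_, and the core was produced with USE_LCD = 0, so the pointer may never have been set. The sketch below shows a generic null guard for an optional module. OptionalModule, isIdleOrDisabled, and canShutdown are hypothetical names used only for illustration; they are not taken from Kimera-VIO.

```cpp
#include <atomic>
#include <memory>

// Hypothetical stand-in for an optional pipeline module.
// This is NOT the Kimera-VIO LoopClosureDetector class.
class OptionalModule {
 public:
  bool isWorking() const { return is_thread_working_; }
  void setWorking(bool working) { is_thread_working_ = working; }

 private:
  std::atomic<bool> is_thread_working_{false};
};

// Returns true when the module is absent (disabled) or present and idle.
// The pointer is tested first, so no member is ever read through a null pointer.
inline bool isIdleOrDisabled(const std::unique_ptr<OptionalModule>& module) {
  return !module || !module->isWorking();
}

// Hypothetical shutdown predicate: every required stage must be idle, and the
// optional stage only counts when it was actually created.
inline bool canShutdown(bool backend_idle,
                        const std::unique_ptr<OptionalModule>& lcd) {
  return backend_idle && isIdleOrDisabled(lcd);
}
```

With a guard of this shape, a disabled module (null pointer) is treated as idle, so the shutdown check never reads the atomic flag of an object that does not exist.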

Additional messages

One test failed during make test:

...
99% tests passed, 1 tests failed out of 165

Total Test time (real) = 110.28 sec

The following tests did not run:
          4 - testCameraParams.parseKITTICalib (Disabled)
          9 - FeatureSelector.createOmegaBarImu (Disabled)
         10 - FeatureSelector.GetVersorIfInFOV (Disabled)
         32 - testFrame.CalibratePixel (Disabled)
         33 - testFrame.CalibratePixel (Disabled)
         94 - StereoFrameFixture.undistortFisheyeStereoFrame (Disabled)
        150 - UtilsOpenCVFixture.ExtractCornersChessboard (Disabled)
        163 - UtilsOpenCVFixture.ExtractCornersChessboard (Disabled)
        173 - OnlineAlignmentFixture.GyroscopeBiasEstimation (Disabled)
        174 - OnlineAlignmentFixture.GyroscopeBiasEstimationAHRS (Disabled)
        176 - OnlineAlignmentFixture.OnlineGravityAlignment (Disabled)
        177 - OnlineAlignmentFixture.GravityAlignmentRealData (Disabled)

The following tests FAILED:
        162 - UtilsOpenCVFixture.ReadAndConvertToGrayScale (Failed)
Errors while running CTest
Makefile:129: recipe for target 'test' failed
make: *** [test] Error 8

Furthermore, the triangular mesh displayed by the 3D visualizer is colored pure white, which is different from your demo GIFs:

[GIF: Peek 2019-10-23 10-38]

Is this OK? I don't know if it's related to the above "segmentation fault" problem.

Additional files: These log files were generated before the segmentation fault happened, with USE_LCD = 0 and PARALLEL_RUN = 1.

output_logs.zip

Environment

ToniRV commented 4 years ago

Hi @DreamWaterFound, thank you for raising the issue.

1) Loop closure seg fault: If you use OpenCV's 3D visualization (the one shown in your GIF) and run in parallel mode, a known bug in OpenCV fails to free memory properly and kills the VIO pipeline with a seg fault. This only happens once the pipeline has finished processing. By default, the final_k parameter is set to 2000 frames, i.e., the EuRoC dataset is run up to its 2000th frame (you can check this parameter in the script you run). Your log output shows that it reached frame 1999, so this is probably the issue you are seeing. I believe the issue doesn't appear if you run in sequential mode.

2) The message "createMesh2D - error, keypoint out of image frame." should be silent. It is not important and you can safely ignore it.

3) The mesh is colored white by default, but I switched the default to be the one in the GIF. I'll merge this asap.

4) The failing test ReadAndConvertToGrayScale is indeed a bit weird, since you are using a supported OpenCV version. I have opened a ticket on our side to make sure we fix that.

marcusabate commented 4 years ago

In addition, note that the "LoopClosureDetector: No output callback registered. Either register a callback or disable LCD with flag use_lcd=false." message should be a warning and is not indicative of any serious error in the pipeline. I'll open a PR to change the status of this message to a warning.
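To illustrate the difference between treating a missing callback as a warning and treating it as a fatal error, here is a minimal sketch. ResultPublisher, PublishStatus, and the message text are hypothetical and are not the Kimera-VIO implementation; the sketch only assumes a module that hands its output to an optional, user-registered callback.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical status codes, for illustration only.
enum class PublishStatus { kDelivered, kNoCallback };

// Hypothetical module output stage with an optional consumer callback.
class ResultPublisher {
 public:
  using Callback = std::function<void(const std::string&)>;

  void registerCallback(Callback cb) { callback_ = std::move(cb); }

  // Delivers the result if a callback exists. Otherwise it prints a warning
  // and returns, so the rest of the pipeline keeps running.
  PublishStatus publish(const std::string& result) const {
    if (!callback_) {
      std::cerr << "Warning: no output callback registered; result dropped.\n";
      return PublishStatus::kNoCallback;
    }
    callback_(result);
    return PublishStatus::kDelivered;
  }

 private:
  Callback callback_;  // Empty until registerCallback() is called.
};
```

Returning a status instead of aborting lets the caller decide whether a missing consumer matters for its use case.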

marcusabate commented 4 years ago

Hi @DreamWaterFound, we've fixed this problem in 566e43c5a9195035a60b651890a34264cb3690c0. Let us know if it still happens.