rpng / MINS

An efficient and robust multisensor-aided inertial navigation system with online calibration that is capable of fusing IMU, camera, LiDAR, GPS/GNSS, and wheel sensors. Use cases: VINS/VIO, GPS-INS, LINS/LIO, multi-sensor fusion for localization and mapping (SLAM). This repository also provides multi-sensor simulation and data.
GNU General Public License v3.0
485 stars 80 forks

How does MINS deal with the multi-thread problem? #31

Closed jiachenglihxl closed 3 weeks ago

jiachenglihxl commented 5 months ago

In the code, each sensor callback uses the function "try to update" to update the system state, and they all call StateHelper::EKFUpdate in the end. My question is: what if two sensor measurements arrive together and call EKFUpdate at the same time? Will this lead to a core dump or memory corruption?

aidadgy commented 1 month ago

Hello. When I ran roslaunch mins simulation.launch cam_enabled:=true lidar_enabled:=true, the following problem occurred:

========Final Status========
Total procesing time: 21s
Total traveling time: 48s
RMSE average: 0.257, 0.117 (deg,m)
NEES average: 7.133, 2.380 (deg,m)
simulation: /usr/include/boost/thread/pthread/recursive_mutex.hpp:108: void boost::recursive_mutex::lock(): Assertion `!posix::pthread_mutex_lock(&m)' failed.
================================================================================
REQUIRED process [mins_simulation0-2] has died!
process has died [pid 1327580, exit code -6, cmd /home/ubantu/catkin_ws/devel/lib/mins/simulation __name:=mins_simulation0 __log:=/home/ubantu/.ros/log/d0e201e6-885e-11ef-8f8b-41555d88daad/mins_simulation0-2.log].
log file: /home/ubantu/.ros/log/d0e201e6-885e-11ef-8f8b-41555d88daad/mins_simulation0-2*.log
Initiating shutdown!
[mins_simulation0-2] killing on exit
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done

I wonder if the author can give some suggestions. Thanks.

WoosikLee2510 commented 1 month ago

Hi, @jiachenglihxl. I'm sorry for my (extremely) late reply. The current implementation is not truly multi-threaded. This means that even if two different sensor measurements arrive at the system at the same time, whichever calls the callback function first will be processed first, and the other one will wait until it can be processed.
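
The behavior described above can be illustrated with a minimal C++ sketch. This is not MINS code: `SerializedFilter`, `on_measurement`, `run_two_callbacks`, and the sensor labels are hypothetical names, and a `std::mutex` is used here only as a stand-in for whatever mechanism actually serializes the callbacks. It shows that when two callbacks arrive at once, one runs its update to completion while the other waits.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a filter whose update step must not run concurrently.
class SerializedFilter {
public:
  // Called from any sensor callback; only one caller at a time gets past the lock.
  void on_measurement(const std::string &sensor) {
    std::lock_guard<std::mutex> lock(mtx_);
    // The "update" here only records the order in which measurements were handled.
    processed_.push_back(sensor);
  }

  // Returns a copy of the processing order, taken under the same lock.
  std::vector<std::string> processed() {
    std::lock_guard<std::mutex> lock(mtx_);
    return processed_;
  }

private:
  std::mutex mtx_;
  std::vector<std::string> processed_;
};

// Fires two "sensor callbacks" from separate threads and returns the handled order.
std::vector<std::string> run_two_callbacks() {
  SerializedFilter filter;
  std::thread cam([&filter] { filter.on_measurement("camera"); });
  std::thread lidar([&filter] { filter.on_measurement("lidar"); });
  cam.join();
  lidar.join();
  return filter.processed();
}
```

Calling `run_two_callbacks()` always yields exactly two entries. Which sensor comes first depends on scheduling, but the two updates never overlap.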

WoosikLee2510 commented 1 month ago

Hi, @aidadgy. I suspect the error comes from LiDAR, the only place where I use multi-threading for fast parallel computation. However, I cannot reproduce your error on my end, so it is hard to debug where the error is coming from. One thing I noticed is your Total traveling time: 48s. If you ran the default simulation, the total travel time should be 48 s, which indicates that you may be getting the error while the system is terminating at the end of the run. Could you provide more information about your problem? It would be helpful for debugging the issue.

aidadgy commented 1 month ago

Dear author, I am glad to receive your reply, and thank you for it. I have put a screenshot of the last run terminating below. I also noticed that running roslaunch mins rosbag.launch config:=kaist/kaist_LC path_gt:=urban30.txt path_bag:=urban30.bag ends with the same termination problem.

[Screenshot: 2024-10-13 18-35-40]

I think the source code should be fine, but I also noticed that when I compiled the code with catkin build, a lot of warnings appeared, as shown below. The first warning is related to the Anaconda library, and the second one seems to be related to LiDAR. I don't know whether these warnings are related to the program termination. I look forward to your reply, thank you.

[Screenshot: 2024-10-15 19-46-38]
[Screenshot: 2024-10-15 19-51-14]

aidadgy commented 1 month ago

When I run roslaunch mins simulation.launch cam_enabled:=false lidar_enabled:=true, the result shows: process has finished cleanly.

aidadgy commented 1 month ago

When I run roslaunch mins simulation.launch cam_enabled:=true lidar_enabled:=false, the same problem occurs, but when I run roslaunch mins simulation.launch cam_enabled:=false lidar_enabled:=true, the result shows: process has finished cleanly. I suspect that the IMU timestamps and the camera timestamps do not align. Do you think this is the problem? In addition, regarding the problem you mentioned, when I set the parameter to debug in the simulation.launch file, some messages are output as follows:

[Screenshot: 2024-10-25 21-59-08]

WoosikLee2510 commented 1 month ago

Okay, there seem to be many issues when you use MINS. First, I am sorry about that. Let me try to answer each of your concerns.

You have tested:

roslaunch mins simulation.launch cam_enabled:=true lidar_enabled:=true // using cam, lidar
roslaunch mins rosbag.launch config:=kaist/kaist_LC path_gt:=urban30.txt path_bag:=urban30.bag // using cam, lidar
roslaunch mins simulation.launch cam_enabled:=true lidar_enabled:=false // using cam
roslaunch mins simulation.launch cam_enabled:=false lidar_enabled:=true // using lidar

And I think you said MINS didn't crash when you disabled the camera (third try). If that's the case, I suspect OpenCV is the cause of the problem. Basically, when extracting features from the image, OpenCV may use multi-threading, which is actually disabled in the launch file: <env name="OMP_NUM_THREADS" value="1" />. It should not use multi-threading, as the number of threads is set to 1, but maybe this setting does not work well on your platform. Could you try playing with this variable?
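
When experimenting with this variable, it can help to confirm which value the process actually sees. The following is a minimal C++ sketch, not part of MINS: `requested_omp_threads` is a hypothetical helper that reads `OMP_NUM_THREADS` from the environment and falls back to a caller-supplied default when the variable is unset or invalid. Whether OpenCV honors the variable depends on how OpenCV was built, which this sketch does not check.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the thread count requested through OMP_NUM_THREADS,
// or `fallback` when the variable is unset, not a number, or not positive.
int requested_omp_threads(int fallback) {
  const char *raw = std::getenv("OMP_NUM_THREADS");
  if (raw == nullptr) {
    return fallback;  // variable not set in this process's environment
  }
  try {
    // std::stoi reads the leading integer and throws if there is none.
    const int value = std::stoi(std::string(raw));
    return value > 0 ? value : fallback;
  } catch (const std::exception &) {
    return fallback;  // non-numeric or out-of-range value
  }
}
```

Logging the result of `requested_omp_threads(...)` at startup would show whether the value from the launch file reached the node.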

Regarding your compile warnings about Anaconda, I think you may have some version conflict issues. I tested a clean install of Ubuntu/ROS and MINS and didn't get that many warnings. Do you have another device on which you could do a clean install of MINS? Also, ikd tree can give you many warnings, but you can ignore them.

aidadgy commented 1 month ago

Thank you for your answer. When I set the parameter to debug in the simulation.launch file, some messages are output as follows. What do you think about them?

[image]

In addition, my Ubuntu is 22.04 and I installed ROS 1. Could the problem be related to this? Thanks.

WoosikLee2510 commented 3 weeks ago

Re: Email discussion - It was due to an unmatched version of ROS and the operating system. After reconfiguring the environment on the computer, the problem was resolved.