SteveMacenski / slam_toolbox

Slam Toolbox for lifelong mapping and localization in potentially massive maps with ROS
GNU Lesser General Public License v2.1
1.68k stars 525 forks source link

Crash when loading existing map #668

Closed james-ward closed 9 months ago

james-ward commented 9 months ago

Required Info:

Steps to reproduce issue

  1. Specify map_file_name
  2. Configure lifecycle node
  3. Segfault

Expected behavior

Node starts after successfully loading graph

Actual behavior

[INFO] [1707806197.227857910] [slam_toolbox]: Configuring
[INFO] [1707806197.261066300] [slam_toolbox]: Using solver plugin solver_plugins::CeresSolver
[New Thread 0x7fffe8fff640 (LWP 3116266)]
[ERROR] [1707806197.295929473] [slam_toolbox]: LocalizationSlamToolbox: Incorrect number of arguments for map starting pose. Must be in format: [x, y, theta]. Starting at the origin
Load From File /home/james/zio/install/zio_maps/share/zio_maps/maps/chippendale.posegraph 
Mapper <- Module
Mapper <- m_pSequentialScanMatcher
Mapper <- m_pGraph
MapperGraph <- Graph; Graph <- m_Edges; Graph <- m_Vertices
MapperGraph <- m_pMapper; MapperGraph <- m_pLoopScanMatcher; MapperGraph <- m_pTraversal
Mapper <- m_pMapperSensorManager
MapperSensorManager <- m_ScanManagers; MapperSensorManager <- m_Scans
Mapper <- m_Listeners
**Finished serializing Mapper**
Load From File
**Serializing Dataset**
Dataset <- m_SensorNameLookup
Dataset <- m_Data
Dataset <- m_Lasers
Dataset <- m_pDatasetInfo
**Finished serializing Dataset**

Thread 1 "sync_slam_toolb" received signal SIGSEGV, Segmentation fault.
0x00007ffff7e10474 in loop_closure_assistant::LoopClosureAssistant::setMapper(karto::Mapper*) () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
(gdb) bt
#0  0x00007ffff7e10474 in loop_closure_assistant::LoopClosureAssistant::setMapper(karto::Mapper*) () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
#1  0x00007ffff7d62f7e in slam_toolbox::SlamToolbox::loadSerializedPoseGraph(std::unique_ptr<karto::Mapper, std::default_delete<karto::Mapper> >&, std::unique_ptr<karto::Dataset, std::default_delete<karto::Dataset> >&) () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
#2  0x00007ffff7d6ef95 in slam_toolbox::SlamToolbox::deserializePoseGraphCallback(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Request_<std::allocator<void> > >, std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Response_<std::allocator<void> > >) () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
#3  0x00007ffff7f7f3ef in slam_toolbox::SynchronousSlamToolbox::deserializePoseGraphCallback(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Request_<std::allocator<void> > >, std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Response_<std::allocator<void> > >) () from /home/james/zio/install/slam_toolbox/lib/libsync_slam_toolbox.so
#4  0x00007ffff7d5f8ae in slam_toolbox::SlamToolbox::loadPoseGraphByParams() () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
#5  0x00007ffff7d6844f in slam_toolbox::SlamToolbox::on_configure(rclcpp_lifecycle::State const&) () from /home/james/zio/install/slam_toolbox/lib/libtoolbox_common.so
#6  0x00007ffff7ee5c51 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::execute_callback(unsigned int, rclcpp_lifecycle::State const&) const ()
   from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#7  0x00007ffff7ee6052 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::change_state(unsigned char, rclcpp_lifecycle::node_interfaces::LifecycleNodeInterface::CallbackReturn&) ()
   from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#8  0x00007ffff7edbaa0 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::on_change_state(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >) () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#9  0x00007ffff7edfcb8 in std::_Function_handler<void (std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >), std::_Bind<void (rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::*(rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl*, std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >)> >::_M_invoke(std::_Any_data const&, std::shared_ptr<rmw_request_id_s>&&, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >&&, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >&&) () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#10 0x00007ffff7ee978b in ?? () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#11 0x00007ffff79cf9fa in ?? () from /opt/ros/rolling/lib/librclcpp.so
#12 0x00007ffff79080ad in rclcpp::Executor::execute_service(std::shared_ptr<rclcpp::ServiceBase>) () from /opt/ros/rolling/lib/librclcpp.so
#13 0x00007ffff7901ab6 in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /opt/ros/rolling/lib/librclcpp.so
#14 0x00007ffff7911bf0 in rclcpp::executors::SingleThreadedExecutor::spin() () from /opt/ros/rolling/lib/librclcpp.so
#15 0x00007ffff7909cc5 in rclcpp::spin(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>) () from /opt/ros/rolling/lib/librclcpp.so
#16 0x00005555555a0ef3 in main ()

Additional information

I suspect this is related to the refactor to make the nodes into lifecycle nodes. My guess is the problem is on https://github.com/SteveMacenski/slam_toolbox/blame/97796700adbd7d931f15ca25ef41d60068037c51/src/slam_toolbox_common.cpp#L979 because the segfault happens two lines later using the same object, but I haven't got any further than that.

SteveMacenski commented 9 months ago

Compile with debug flags (ex https://navigation.ros.org/tutorials/docs/get_backtrace.html) so we can see the specific lines of issue. I'd hazard to guess you're right about the lifecycle changes missed some subtle item for localization mode. Seeing where specifically is the problem in the trace would be my first step

I don't follow your particular guess at location, but feel free to make some changes and adjust. If you find a solution, open a PR!

[ERROR] [1707806197.295929473] [slam_toolbox]: LocalizationSlamToolbox: Incorrect number of arguments for map starting pose. Must be in format: [x, y, theta]. Starting at the origin

I don't think (probably) this is an issue, but just FYI in case you missed that line.

james-ward commented 9 months ago
#0  loop_closure_assistant::LoopClosureAssistant::setMapper (this=this@entry=0x0, mapper=0x555555a682c0) at /home/james/zio/src/slam_toolbox/src/loop_closure_assistant.cpp:85
#1  0x00007ffff7d62f7e in slam_toolbox::SlamToolbox::loadSerializedPoseGraph (this=0x5555556b8000, mapper=..., dataset=...) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_common.cpp:982
#2  0x00007ffff7d6ef95 in slam_toolbox::SlamToolbox::deserializePoseGraphCallback (this=this@entry=0x5555556b8000, request_header=std::shared_ptr<rmw_request_id_s> (empty) = {...}, 
    req=std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Request_<std::allocator<void> >> (use count 3, weak count 0) = {...}, 
    resp=std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Response_<std::allocator<void> >> (use count 3, weak count 0) = {...}) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_common.cpp:1048
#3  0x00007ffff7f7f3ef in slam_toolbox::SynchronousSlamToolbox::deserializePoseGraphCallback (this=this@entry=0x5555556b8000, request_header=std::shared_ptr<rmw_request_id_s> (empty) = {...}, 
    req=std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Request_<std::allocator<void> >> (use count 3, weak count 0) = {...}, 
    resp=std::shared_ptr<slam_toolbox::srv::DeserializePoseGraph_Response_<std::allocator<void> >> (use count 3, weak count 0) = {...}) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_sync.cpp:152
#4  0x00007ffff7d5f8ae in slam_toolbox::SlamToolbox::loadPoseGraphByParams (this=0x5555556b8000) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_common.cpp:548
#5  0x00007ffff7d6844f in slam_toolbox::SlamToolbox::on_configure (this=0x5555556b8000) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_common.cpp:134
#6  0x00007ffff7ee5c51 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::execute_callback(unsigned int, rclcpp_lifecycle::State const&) const ()
   from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#7  0x00007ffff7ee6052 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::change_state(unsigned char, rclcpp_lifecycle::node_interfaces::LifecycleNodeInterface::CallbackReturn&) ()
   from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#8  0x00007ffff7edbaa0 in rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::on_change_state(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >) () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#9  0x00007ffff7edfcb8 in std::_Function_handler<void (std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >), std::_Bind<void (rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl::*(rclcpp_lifecycle::LifecycleNode::LifecycleNodeInterfaceImpl*, std::_Placeholder<1>, std::_Placeholder<2>, std::_Placeholder<3>))(std::shared_ptr<rmw_request_id_s>, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >)> >::_M_invoke(std::_Any_data const&, std::shared_ptr<rmw_request_id_s>&&, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Request_<std::allocator<void> > >&&, std::shared_ptr<lifecycle_msgs::srv::ChangeState_Response_<std::allocator<void> > >&&) () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#10 0x00007ffff7ee978b in ?? () from /opt/ros/rolling/lib/librclcpp_lifecycle.so
#11 0x00007ffff79cf9fa in ?? () from /opt/ros/rolling/lib/librclcpp.so
#12 0x00007ffff79080ad in rclcpp::Executor::execute_service(std::shared_ptr<rclcpp::ServiceBase>) () from /opt/ros/rolling/lib/librclcpp.so
#13 0x00007ffff7901ab6 in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) () from /opt/ros/rolling/lib/librclcpp.so
#14 0x00007ffff7911bf0 in rclcpp::executors::SingleThreadedExecutor::spin() () from /opt/ros/rolling/lib/librclcpp.so
#15 0x00007ffff7909cc5 in rclcpp::spin(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>) () from /opt/ros/rolling/lib/librclcpp.so
#16 0x00005555555a0ef3 in main (argc=<optimised out>, argv=<optimised out>) at /home/james/zio/src/slam_toolbox/src/slam_toolbox_sync_node.cpp:28
james-ward commented 9 months ago

[ERROR] [1707806197.295929473] [slam_toolbox]: LocalizationSlamToolbox: Incorrect number of arguments for map starting pose. Must be in format: [x, y, theta]. Starting at the origin

I don't think (probably) this is an issue, but just FYI in case you missed that line.

I did notice that. I also think it is not related, after taking a brief look. I think the logic to generate that message might need revisiting.

james-ward commented 9 months ago
#0  loop_closure_assistant::LoopClosureAssistant::setMapper (this=this@entry=0x0, mapper=0x555555a682c0) at /home/james/zio/src/slam_toolbox/src/loop_closure_assistant.cpp:85

I believe this means the LoopClosureAssistant object is invalid somehow.

james-ward commented 9 months ago

The loading of a saved map is done in on_configure and the creation of the LoopClosureAssistant occurs in on_activate

The logic seems sound to me - try to load a map in the configure step. But the LCA can't be initialised at configure or it throws errors about getting the subscription count on publishers that aren't set up yet.

SteveMacenski commented 9 months ago

You never mentioned what node you're using.

Are you just immediately calling the deserialize callback?

I did notice

james-ward commented 9 months ago

PR incoming... just building and testing on our platform...