ros2 / rmw_fastrtps

Implementation of the ROS Middleware (rmw) Interface using eProsima's Fast RTPS.
Apache License 2.0

Asan regression between Humble and Iron release #760

Open luca-della-vedova opened 1 month ago

luca-della-vedova commented 1 month ago

Bug report

Required Info:

Steps to reproduce issue

The regression happened in our automated CI. Specifically, looking at the result of this action, it is green on Humble, and fails on Iron.

To reproduce, either fork the rmf_ci_templates repo and run it in a GitHub Action (note that we set the RMW to cyclonedds to circumvent this issue; this should be reverted) or reproduce locally by:

Expected behavior

CI succeeds on both iron and humble.

Actual behavior

CI fails on iron. I haven't tried but I suspect that building the same branches on humble would actually work.

Additional information

The ASan error is shown in the PR, but I'll paste it here in case it becomes unavailable:

2024-05-24T20:17:09.6645509Z     cannot publish data, at ./src/rmw_publish.cpp:62 during '__function__'
2024-05-24T20:17:09.6646646Z     Fail in delete datareader, at ./src/rmw_service.cpp:100 during '__function__'
2024-05-24T20:17:09.6647476Z     AddressSanitizer:DEADLYSIGNAL
2024-05-24T20:17:09.6648038Z     =================================================================
2024-05-24T20:17:09.6649235Z     ==44942==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000007f0 (pc 0x7ff6c375bef4 bp 0x60c0006b2dc0 sp 0x7ffff95c7048 T0)
2024-05-24T20:17:09.6650494Z     ==44942==The signal is caused by a READ memory access.
2024-05-24T20:17:09.6651220Z     ==44942==Hint: address points to the zero page.
2024-05-24T20:17:09.6651916Z     ==44942==WARNING: invalid path to external symbolizer!
2024-05-24T20:17:09.6652730Z     ==44942==WARNING: Failed to use and restart external symbolizer!
2024-05-24T20:17:09.6654030Z         #0 0x7ff6c375bef4  (/lib/x86_64-linux-gnu/libc.so.6+0x97ef4) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)
2024-05-24T20:17:09.6655747Z         #1 0x7ff6be2ef20d  (/opt/ros/iron/lib/libfastrtps.so.2.10+0x34820d) (BuildId: a17d15658847c53e8609260c3308529bf5ffa3c1)
2024-05-24T20:17:09.6657530Z         #2 0x7ff6be2f1735  (/opt/ros/iron/lib/libfastrtps.so.2.10+0x34a735) (BuildId: a17d15658847c53e8609260c3308529bf5ffa3c1)
2024-05-24T20:17:09.6659410Z         #3 0x7ff6c19ca8ed  (/opt/ros/iron/lib/librmw_fastrtps_shared_cpp.so+0x4e8ed) (BuildId: 26ffed4ec6855b417c93e7ef8977e33baa96787f)
2024-05-24T20:17:09.6661211Z         #4 0x7ff6c5a3a89b  (/opt/ros/iron/lib/librcl.so+0x2b89b) (BuildId: 06fa6dca7f44b5a0082924dbc102de5ef1754bc0)
2024-05-24T20:17:09.6662804Z         #5 0x7ff6c5a3ac52  (/opt/ros/iron/lib/librcl.so+0x2bc52) (BuildId: 06fa6dca7f44b5a0082924dbc102de5ef1754bc0)
2024-05-24T20:17:09.6664317Z         #6 0x7ff6c5c921f3  (/opt/ros/iron/lib/librclcpp.so+0x14c1f3) (BuildId: 30853bd50c7eab48c22306ef21235a4f7e60180e)
2024-05-24T20:17:09.6665933Z         #7 0x7ff6c5c734e7  (/opt/ros/iron/lib/librclcpp.so+0x12d4e7) (BuildId: 30853bd50c7eab48c22306ef21235a4f7e60180e)
2024-05-24T20:17:09.6667349Z         #8 0x7ff6c5c71fe6  (/opt/ros/iron/lib/librclcpp.so+0x12bfe6) (BuildId: 30853bd50c7eab48c22306ef21235a4f7e60180e)
2024-05-24T20:17:09.6668987Z         #9 0x7ff6c3709a55  (/lib/x86_64-linux-gnu/libc.so.6+0x45a55) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)
2024-05-24T20:17:09.6670588Z         #10 0x7ff6c5c344d6  (/opt/ros/iron/lib/librclcpp.so+0xee4d6) (BuildId: 30853bd50c7eab48c22306ef21235a4f7e60180e)
2024-05-24T20:17:09.6671688Z     
2024-05-24T20:17:09.6672109Z     AddressSanitizer can not provide additional info.
2024-05-24T20:17:09.6673848Z     SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0x97ef4) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348) 
2024-05-24T20:17:09.6674929Z     ==44942==ABORTING
audrow commented 3 weeks ago

FYI @MiguelCompany and @EduPonz

fujitatomoya commented 3 weeks ago

This seems like it could be related to https://github.com/ros2/rmw_fastrtps/issues/761, but that one happens with both iron and humble, so this may be a separate issue.

CC: @Barry-Xu-2018

Barry-Xu-2018 commented 3 weeks ago

@luca-della-vedova

I tried to reproduce this problem in my environment (docker image: osrf/ros:iron-desktop-full-jammy),
but I could not get the same error log as yours.

If I run colcon test --packages-up-to rmf_fleet_adapter, “1: AddressSanitizer:DEADLYSIGNAL” is printed repeatedly while executing test_rmf_utils:

    Start 1: test_rmf_utils

1: Test command: /usr/bin/python3.10 "-u" "/opt/ros/iron/share/ament_cmake_test/cmake/run_test.py" "/root/iron_rmf_ws/build/rmf_utils/test_results/rmf_utils/test_rmf_utils.catch2.xml" "--package-name" "rmf_utils" "--output-file" "/root/iron_rmf_ws/build/rmf_utils/ament_cmake_catch2/test_rmf_utils.txt" "--command" "/root/iron_rmf_ws/build/rmf_utils/test_rmf_utils" "-r junit" "-o /root/iron_rmf_ws/build/rmf_utils/test_results/rmf_utils/test_rmf_utils.catch2.xml"
1: Test timeout computed to be: 300
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_site_map_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_site_map_msgs/test_results/rmf_site_map_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_site_map_msgs/rosidl_generator_c/rmf_site_map_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_fleet_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_fleet_msgs/test_results/rmf_fleet_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_fleet_msgs/rosidl_generator_c/rmf_fleet_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_dispenser_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_dispenser_msgs/test_results/rmf_dispenser_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_dispenser_msgs/rosidl_generator_c/rmf_dispenser_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_door_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_door_msgs/test_results/rmf_door_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_door_msgs/rosidl_generator_c/rmf_door_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_traffic_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_traffic_msgs/test_results/rmf_traffic_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_traffic_msgs/rosidl_generator_c/rmf_traffic_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_building_map_msgs/rmf_building_map_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_building_map_msgs/test_results/rmf_building_map_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_building_map_msgs/rosidl_generator_c/rmf_building_map_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/src/rmf/rmf_internal_msgs/rmf_lift_msgs':
1:  - /opt/ros/iron/bin/ament_cppcheck --xunit-file /root/iron_rmf_ws/build/rmf_lift_msgs/test_results/rmf_lift_msgs/cppcheck_rosidl_generated_c.xunit.xml /root/iron_rmf_ws/build/rmf_lift_msgs/rosidl_generator_c/rmf_lift_msgs
1: -- run_test.py: invoking following command in '/root/iron_rmf_ws/build/rmf_utils':
1:  - /root/iron_rmf_ws/build/rmf_utils/test_rmf_utils -r junit -o /root/iron_rmf_ws/build/rmf_utils/test_results/rmf_utils/test_rmf_utils.catch2.xml
1: AddressSanitizer:DEADLYSIGNAL
1: AddressSanitizer:DEADLYSIGNAL
1: AddressSanitizer:DEADLYSIGNAL
1: AddressSanitizer:DEADLYSIGNAL
1: AddressSanitizer:DEADLYSIGNAL
1: AddressSanitizer:DEADLYSIGNAL
...

If I run colcon test --packages-select rmf_fleet_adapter, I get the output below:

    Start 2: test_rmf_fleet_adapter

2: Test command: /usr/bin/python3.10 "-u" "/opt/ros/iron/share/ament_cmake_test/cmake/run_test.py" "/root/iron_rmf_ws/build/rmf_fleet_adapter/test_results/rmf_fleet_adapter/test_rmf_fleet_adapter.catch2.xml" "--package-name" "rmf_fleet_adapter" "--output-file" "/root/iron_rmf_ws/build/rmf_fleet_adapter/ament_cmake_catch2/test_rmf_fleet_adapter.txt" "--command" "/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter" "-r junit" "-o /root/iron_rmf_ws/build/rmf_fleet_adapter/test_results/rmf_fleet_adapter/test_rmf_fleet_adapter.catch2.xml"
2: Test timeout computed to be: 300
2: -- run_test.py: invoking following command in '/root/iron_rmf_ws/build/rmf_fleet_adapter':
2:  - /root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter -r junit -o /root/iron_rmf_ws/build/rmf_fleet_adapter/test_results/rmf_fleet_adapter/test_rmf_fleet_adapter.catch2.xml
2: =================================================================
2: ==43572==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x606000120e00 in thread T0:
2:   object passed to delete has wrong type:
2:   size of the allocated type:   64 bytes;
2:   size of the deallocated type: 1 bytes.
2:     #0 0x706b7ff3424f in operator delete(void*, unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:172
2:     #1 0x59d0f715991a in __gnu_cxx::new_allocator<char>::deallocate(char*, unsigned long) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x138d91a)
2:     #2 0x59d0f7137260 in std::allocator_traits<std::allocator<char> >::deallocate(std::allocator<char>&, char*, unsigned long) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x136b260)
2:     #3 0x59d0f737bed3 in void rclcpp::allocator::retyped_deallocate<char, std::allocator<char> >(void*, void*) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x15afed3)
2:     #4 0x706b7459f5ce in rcutils_string_map_fini (/opt/ros/iron/lib/librcutils.so+0xe5ce)
2:     #5 0x706b744eaaf5  (/opt/ros/iron/lib/librcl.so+0x27af5)
2:     #6 0x706b744eae47 in rcl_node_resolve_name (/opt/ros/iron/lib/librcl.so+0x27e47)
2:     #7 0x706b744eafc5 in rcl_publisher_init (/opt/ros/iron/lib/librcl.so+0x27fc5)
2:     #8 0x706b7478ddb5 in rclcpp::PublisherBase::PublisherBase(rclcpp::node_interfaces::NodeBaseInterface*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rosidl_message_type_support_t const&, rcl_publisher_options_s const&, rclcpp::PublisherEventCallbacks const&, bool) (/opt/ros/iron/lib/librclcpp.so+0x18ddb5)
2:     #9 0x706b747496e2 in std::_Function_handler<std::shared_ptr<rclcpp::PublisherBase> (rclcpp::node_interfaces::NodeBaseInterface*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::QoS const&), rclcpp::create_publisher_factory<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> >, std::allocator<void>, rclcpp::Publisher<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> >, std::allocator<void> > >(rclcpp::PublisherOptionsWithAllocator<std::allocator<void> > const&)::{lambda(rclcpp::node_interfaces::NodeBaseInterface*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::QoS const&)#1}>::_M_invoke(std::_Any_data const&, rclcpp::node_interfaces::NodeBaseInterface*&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::QoS const&) (/opt/ros/iron/lib/librclcpp.so+0x1496e2)
2:     #10 0x706b7473d547 in rclcpp::node_interfaces::NodeParameters::NodeParameters(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>, std::shared_ptr<rclcpp::node_interfaces::NodeLoggingInterface>, std::shared_ptr<rclcpp::node_interfaces::NodeTopicsInterface>, std::shared_ptr<rclcpp::node_interfaces::NodeServicesInterface>, std::shared_ptr<rclcpp::node_interfaces::NodeClockInterface>, std::vector<rclcpp::Parameter, std::allocator<rclcpp::Parameter> > const&, bool, bool, rclcpp::QoS const&, rclcpp::PublisherOptionsBase const&, bool, bool) (/opt/ros/iron/lib/librclcpp.so+0x13d547)
2:     #11 0x706b7472ea76 in rclcpp::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/opt/ros/iron/lib/librclcpp.so+0x12ea76)
2:     #12 0x706b7472f827 in rclcpp::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/opt/ros/iron/lib/librclcpp.so+0x12f827)
2:     #13 0x706b7ed29560 in rmf_rxcpp::Transport::Transport(rxcpp::schedulers::worker, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x420c560)
2:     #14 0x706b7ed24161 in rmf_fleet_adapter::agv::Node::Node(rxcpp::schedulers::worker, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x4207161)
2:     #15 0x706b7ed22618 in rmf_fleet_adapter::agv::Node::make(rxcpp::schedulers::worker, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x4205618)
2:     #16 0x706b7f2dd125 in rmf_fleet_adapter::agv::test::MockAdapter::Implementation::Implementation(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x47c0125)
2:     #17 0x706b7f2dfc79 in rmf_utils::unique_impl_ptr<rmf_fleet_adapter::agv::test::MockAdapter::Implementation, void (*)(rmf_fleet_adapter::agv::test::MockAdapter::Implementation*)> rmf_utils::make_unique_impl<rmf_fleet_adapter::agv::test::MockAdapter::Implementation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x47c2c79)
2:     #18 0x706b7f2d2808 in rmf_fleet_adapter::agv::test::MockAdapter::MockAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rclcpp::NodeOptions const&) (/root/iron_rmf_ws/install/rmf_fleet_adapter/lib/librmf_fleet_adapter.so+0x47b5808)
2:     #19 0x59d0f727a6e6 in void __gnu_cxx::new_allocator<rmf_fleet_adapter::agv::test::MockAdapter>::construct<rmf_fleet_adapter::agv::test::MockAdapter, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(rmf_fleet_adapter::agv::test::MockAdapter*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x14ae6e6)
2:     #20 0x59d0f72771a1 in void std::allocator_traits<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter> >::construct<rmf_fleet_adapter::agv::test::MockAdapter, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>&, rmf_fleet_adapter::agv::test::MockAdapter*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x14ab1a1)
2:     #21 0x59d0f726f24a in std::_Sp_counted_ptr_inplace<rmf_fleet_adapter::agv::test::MockAdapter, std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x14a324a)
2:     #22 0x59d0f726512c in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<rmf_fleet_adapter::agv::test::MockAdapter, std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(rmf_fleet_adapter::agv::test::MockAdapter*&, std::_Sp_alloc_shared_tag<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x149912c)
2:     #23 0x59d0f724d11b in std::__shared_ptr<rmf_fleet_adapter::agv::test::MockAdapter, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::_Sp_alloc_shared_tag<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x148111b)
2:     #24 0x59d0f723f164 in std::shared_ptr<rmf_fleet_adapter::agv::test::MockAdapter>::shared_ptr<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::_Sp_alloc_shared_tag<std::allocator<rmf_fleet_adapter::agv::test::MockAdapter> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x1473164)
2:     #25 0x59d0f722972c in std::shared_ptr<rmf_fleet_adapter::agv::test::MockAdapter> std::allocate_shared<rmf_fleet_adapter::agv::test::MockAdapter, std::allocator<rmf_fleet_adapter::agv::test::MockAdapter>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::allocator<rmf_fleet_adapter::agv::test::MockAdapter> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x145d72c)
2:     #26 0x59d0f721ee75 in std::shared_ptr<rmf_fleet_adapter::agv::test::MockAdapter> std::make_shared<rmf_fleet_adapter::agv::test::MockAdapter, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, rclcpp::NodeOptions&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, rclcpp::NodeOptions&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x1452e75)
2:     #27 0x59d0f7208ba8 in rmf_fleet_adapter::phases::test::MockAdapterFixture::MockAdapterFixture() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x143cba8)
2:     #28 0x59d0f72d81c5 in rmf_fleet_adapter::phases::test::(anonymous namespace)::C_A_T_C_H_T_E_S_T_0::C_A_T_C_H_T_E_S_T_0() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x150c1c5)
2:     #29 0x59d0f72d827f in Catch::TestInvokerAsMethod<rmf_fleet_adapter::phases::test::(anonymous namespace)::C_A_T_C_H_T_E_S_T_0>::invoke() const (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x150c27f)
2:     #30 0x59d0f7082089 in Catch::TestCase::invoke() const (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12b6089)
2:     #31 0x59d0f70744a5 in Catch::RunContext::invokeActiveTestCase() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12a84a5)
2:     #32 0x59d0f7073c4a in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12a7c4a)
2:     #33 0x59d0f7070436 in Catch::RunContext::runTest(Catch::TestCase const&) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12a4436)
2:     #34 0x59d0f70787b5 in Catch::(anonymous namespace)::TestGroup::execute() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12ac7b5)
2:     #35 0x59d0f707b42b in Catch::Session::runInternal() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12af42b)
2:     #36 0x59d0f707ad28 in Catch::Session::run() (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12aed28)
2:     #37 0x59d0f71102f1 in int Catch::Session::run<char>(int, char const* const*) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x13442f1)
2:     #38 0x59d0f70b2bdb in main (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x12e6bdb)
2:     #39 0x706b70ceed8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
2:     #40 0x706b70ceee3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
2:     #41 0x59d0f7048484 in _start (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x127c484)
2: 
2: 0x606000120e00 is located 0 bytes inside of 64-byte region [0x606000120e00,0x606000120e40)
2: allocated by thread T0 here:
2:     #0 0x706b7ff331e7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99
2:     #1 0x59d0f7159983 in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x138d983)
2:     #2 0x59d0f713728f in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x136b28f)
2:     #3 0x59d0f737bd4b in void* rclcpp::allocator::retyped_allocate<std::allocator<char> >(unsigned long, void*) (/root/iron_rmf_ws/build/rmf_fleet_adapter/test_rmf_fleet_adapter+0x15afd4b)
2:     #4 0x706b7459f3d2 in rcutils_string_map_init (/opt/ros/iron/lib/librcutils.so+0xe3d2)
2: 
2: SUMMARY: AddressSanitizer: new-delete-type-mismatch ../../../../src/libsanitizer/asan/asan_new_delete.cpp:172 in operator delete(void*, unsigned long)
2: ==43572==HINT: if you don't care about these errors you may set ASAN_OPTIONS=new_delete_type_mismatch=0
2: ==43572==ABORTING

Could you provide your complete log?

luca-della-vedova commented 3 weeks ago

I added the log to a gist here. Indeed there are a lot of new-delete-type-mismatch errors but I believe those have always been around.

We do set in our CI the ASAN_OPTIONS environment variable to avoid these spurious failures:

ASAN_OPTIONS: detect_leaks=0:new_delete_type_mismatch=0

This should hopefully let the test run long enough to trigger the issue.

Barry-Xu-2018 commented 3 weeks ago

> Indeed there are a lot of new-delete-type-mismatch errors but I believe those have always been around.

Yes. This is a known issue. (https://github.com/ros2/rclcpp/issues/2220)
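The "64 bytes allocated, 1 byte deallocated" pair in the report is consistent with a size-less C deallocate callback being bridged to a sized C++ allocator. The sketch below illustrates that mechanism only. `RecordingCharAllocator`, `stats`, `retyped_allocate_sketch`, and `retyped_deallocate_sketch` are hypothetical names that echo the frames in the trace; they are not rclcpp's implementation. The allocator frees with unsized `operator delete`, so the sketch itself runs cleanly and just records the sizes it was handed.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical bookkeeping for the sizes seen by the allocator.
struct Stats
{
  std::size_t allocated = 0;
  std::size_t deallocated = 0;
};

inline Stats & stats()
{
  static Stats s;
  return s;
}

// Hypothetical char allocator that records the counts passed to it.
// It ignores n when freeing, so a wrong n is recorded but harmless here.
struct RecordingCharAllocator
{
  using value_type = char;

  char * allocate(std::size_t n)
  {
    stats().allocated = n;
    return static_cast<char *>(::operator new(n));
  }

  void deallocate(char * p, std::size_t n)
  {
    stats().deallocated = n;
    ::operator delete(p);
  }
};

// C-style allocate callback: receives the byte count.
template<typename Alloc>
void * retyped_allocate_sketch(std::size_t size, void * untyped_allocator)
{
  auto * alloc = static_cast<Alloc *>(untyped_allocator);
  return std::allocator_traits<Alloc>::allocate(*alloc, size);
}

// C-style deallocate callback: receives no size at all.
template<typename Alloc>
void retyped_deallocate_sketch(void * ptr, void * untyped_allocator)
{
  using T = typename std::allocator_traits<Alloc>::value_type;
  auto * alloc = static_cast<Alloc *>(untyped_allocator);
  // The original size is unknown at this point, so this sketch passes 1.
  std::allocator_traits<Alloc>::deallocate(*alloc, static_cast<T *>(ptr), 1);
}
```

Allocating 64 bytes through `retyped_allocate_sketch` and releasing them through `retyped_deallocate_sketch` leaves `stats().allocated` at 64 and `stats().deallocated` at 1. With `std::allocator<char>` in place of the recording allocator, that second value is what reaches sized `operator delete`, which is the kind of mismatch ASan reports.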

> We do set in our CI the ASAN_OPTIONS environment variable to avoid these spurious failures

Thanks. With that setting, I get the same error:

2: [INFO] [1717746840.951931995] [test_Delivery]: Executing go_to_place [dropoff] for robot [test_fleet/T0]
2: cannot publish data, at ./src/rmw_publish.cpp:62 during '__function__'
2: Fail in delete datareader, at ./src/rmw_service.cpp:100 during '__function__'
2: AddressSanitizer:DEADLYSIGNAL
2: =================================================================
2: ==185==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000007f0 (pc 0x7c812d256ef4 bp 0x60c0006d9a00 sp 0x7ffc3f3cc5e8 T0)
2: ==185==The signal is caused by a READ memory access.
2: ==185==Hint: address points to the zero page.
2:     #0 0x7c812d256ef4 in pthread_mutex_lock (/lib/x86_64-linux-gnu/libc.so.6+0x97ef4)
2:     #1 0x7c812772f20d in eprosima::fastdds::dds::DomainParticipantImpl::find_type(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const (/opt/ros/iron/lib/libfastrtps.so.2.10+0x34820d)
2:     #2 0x7c8127731735 in eprosima::fastdds::dds::DomainParticipantImpl::unregister_type(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (/opt/ros/iron/lib/libfastrtps.so.2.10+0x34a735)
2:     #3 0x7c8127ce28ed in rmw_fastrtps_shared_cpp::__rmw_destroy_service(char const*, rmw_node_s*, rmw_service_s*) (/opt/ros/iron/lib/librmw_fastrtps_shared_cpp.so+0x4e8ed)
2:     #4 0x7c81309e889b in rcl_service_fini (/opt/ros/iron/lib/librcl.so+0x2b89b)
2:     #5 0x7c81309e8c52 in rcl_node_type_description_service_fini (/opt/ros/iron/lib/librcl.so+0x2bc52)
2:     #6 0x7c8130c461f3  (/opt/ros/iron/lib/librclcpp.so+0x14c1f3)
2:     #7 0x7c8130c274e7  (/opt/ros/iron/lib/librclcpp.so+0x12d4e7)
2:     #8 0x7c8130c25fe6  (/opt/ros/iron/lib/librclcpp.so+0x12bfe6)
2:     #9 0x7c812d204a55 in __cxa_finalize (/lib/x86_64-linux-gnu/libc.so.6+0x45a55)
2:     #10 0x7c8130be84d6  (/opt/ros/iron/lib/librclcpp.so+0xee4d6)
2: 
2: AddressSanitizer can not provide additional info.
2: SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0x97ef4) in pthread_mutex_lock
2: ==185==ABORTING
MiguelCompany commented 3 weeks ago

@luca-della-vedova @Barry-Xu-2018 since this is inside __cxa_finalize, I guess it has to do with the destruction order of static global objects. So I agree with https://github.com/ros2/rmw_fastrtps/issues/760#issuecomment-2153124168 that this is (in a way) similar to #761

Barry-Xu-2018 commented 3 weeks ago

@MiguelCompany

For #761, I found a bug in the Executor. It causes shared pointers to resources not to be released in time. Considering that the Jazzy version of the Executor has significant changes, I am uncertain whether the Iron version of the Executor has a similar issue.

Barry-Xu-2018 commented 3 weeks ago

The issue is related to the "Test Delivery" test case in rmf_fleet_adapter/test/tasks/test_Delivery.cpp.

For the message "cannot publish data, at ./src/rmw_publish.cpp:62 during '__function__'", the backtrace output is:

#0  rmw_fastrtps_shared_cpp::__rmw_publish (identifier=0x7fffe8c86192 "rmw_fastrtps_cpp", publisher=0x555556095420, ros_message=0x7fffffff5140, allocation=0x0)
    at /root/ros2_iron_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_publish.cpp:62
#1  0x00007fffe8b953c6 in rmw_fastrtps_shared_cpp::__rmw_destroy_service (identifier=0x7fffe8c86192 "rmw_fastrtps_cpp", node=0x5555560aa5d0, service=0x555556211690)
    at /root/ros2_iron_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_service.cpp:76
#2  0x00007fffe8c6e8d0 in rmw_destroy_service (node=0x5555560aa5d0, service=0x555556211690) at /root/ros2_iron_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/src/rmw_service.cpp:506
#3  0x00007fffeb44dc1f in rmw_destroy_service (v2=0x5555560aa5d0, v1=0x555556211690) at /root/ros2_iron_ws/src/ros2/rmw_implementation/rmw_implementation/src/functions.cpp:523
#4  0x00007fffeeda0996 in rcl_service_fini (service=0x555555f452e8, node=0x555555f45200) at /root/ros2_iron_ws/src/ros2/rcl/rcl/src/rcl/service.c:260
#5  0x00007fffeed99af3 in rcl_node_type_description_service_fini (node=0x555555f45200) at /root/ros2_iron_ws/src/ros2/rcl/rcl/src/rcl/node.c:673
#6  0x00007fffefa15127 in rclcpp::node_interfaces::NodeTypeDescriptions::NodeTypeDescriptionsImpl::~NodeTypeDescriptionsImpl (this=0x5555561f6630, __in_chrg=<optimized out>)
    at /root/ros2_iron_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/node_interfaces/node_type_descriptions.cpp:132
#7  0x00007fffefa159f4 in std::default_delete<rclcpp::node_interfaces::NodeTypeDescriptions::NodeTypeDescriptionsImpl>::operator() (this=0x5555561dda18, __ptr=0x5555561f6630)
    at /usr/include/c++/11/bits/unique_ptr.h:85
#8  0x00007fffefa156f0 in std::unique_ptr<rclcpp::node_interfaces::NodeTypeDescriptions::NodeTypeDescriptionsImpl, std::default_delete<rclcpp::node_interfaces::NodeTypeDescriptions::NodeTypeDescriptionsImpl> >::~unique_ptr (this=0x5555561dda18, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:361
#9  0x00007fffefa10d12 in rclcpp::node_interfaces::NodeTypeDescriptions::~NodeTypeDescriptions (this=0x5555561dda10, __in_chrg=<optimized out>)
    at /root/ros2_iron_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/node_interfaces/node_type_descriptions.cpp:154
#10 0x00007fffef9b108f in __gnu_cxx::new_allocator<rclcpp::node_interfaces::NodeTypeDescriptions>::destroy<rclcpp::node_interfaces::NodeTypeDescriptions> (this=0x5555561dda10, __p=0x5555561dda10)
    at /usr/include/c++/11/ext/new_allocator.h:168
#11 0x00007fffef9b1005 in std::allocator_traits<std::allocator<rclcpp::node_interfaces::NodeTypeDescriptions> >::destroy<rclcpp::node_interfaces::NodeTypeDescriptions> (__a=..., __p=0x5555561dda10)
    at /usr/include/c++/11/bits/alloc_traits.h:535
#12 0x00007fffef9b09a5 in std::_Sp_counted_ptr_inplace<rclcpp::node_interfaces::NodeTypeDescriptions, std::allocator<rclcpp::node_interfaces::NodeTypeDescriptions>, (__gnu_cxx::_Lock_policy)2>::_M_dispose
    (this=0x5555561dda00) at /usr/include/c++/11/bits/shared_ptr_base.h:528
#13 0x0000555555ad9495 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x5555561dda00) at /usr/include/c++/11/bits/shared_ptr_base.h:168
#14 0x0000555555ac8d7d in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x5555561f6538, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:705
#15 0x00007fffef9a5848 in std::__shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x5555561f6530, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1154
#16 0x00007fffef9a5892 in std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface>::~shared_ptr (this=0x5555561f6530, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr.h:122
#17 0x00007fffef9aee54 in std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >::~pair (this=0x5555561f6528, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/stl_pair.h:211
#18 0x00007fffef9aee78 in __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, false> >::destroy<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> > > (this=0x7fffefdab100 <rclcpp::Node::backport_members_>, __p=0x5555561f6528)
    at /usr/include/c++/11/ext/new_allocator.h:168
#19 0x00007fffef9adcaf in std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, false> > >::destroy<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> > > (__a=..., __p=0x5555561f6528) at /usr/include/c++/11/bits/alloc_traits.h:535
#20 0x00007fffef9ac7fb in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, false> > >::_M_deallocate_node (this=0x7fffefdab100 <rclcpp::Node::backport_members_>, __n=0x5555561f6520) at /usr/include/c++/11/bits/hashtable_policy.h:1894
#21 0x00007fffef9ab129 in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, false> > >::_M_deallocate_nodes (this=0x7fffefdab100 <rclcpp::Node::backport_members_>, __n=0x0) at /usr/include/c++/11/bits/hashtable_policy.h:1916
#22 0x00007fffef9a95b2 in std::_Hashtable<rclcpp::Node const*, std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, std::allocator<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> > >, std::__detail::_Select1st, std::equal_to<rclcpp::Node const*>, std::hash<rclcpp::Node const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::clear (
    this=0x7fffefdab100 <rclcpp::Node::backport_members_>) at /usr/include/c++/11/bits/hashtable.h:2320
#23 0x00007fffef9a6d70 in std::_Hashtable<rclcpp::Node const*, std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> >, std::allocator<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> > >, std::__detail::_Select1st, std::equal_to<rclcpp::Node const*>, std::hash<rclcpp::Node const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::~_Hashtable (
    this=0x7fffefdab100 <rclcpp::Node::backport_members_>, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/hashtable.h:1532
#24 0x00007fffef9a53c8 in std::unordered_map<rclcpp::Node const*, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface>, std::hash<rclcpp::Node const*>, std::equal_to<rclcpp::Node const*>, std::allocator<std::pair<rclcpp::Node const* const, std::shared_ptr<rclcpp::node_interfaces::NodeTypeDescriptionsInterface> > > >::~unordered_map (this=0x7fffefdab100 <rclcpp::Node::backport_members_>, 
    __in_chrg=<optimized out>) at /usr/include/c++/11/bits/unordered_map.h:102
#25 0x00007fffef9a54e7 in rclcpp::Node::BackportMembers::~BackportMembers (this=0x7fffefdab100 <rclcpp::Node::backport_members_>, __in_chrg=<optimized out>)
    at /root/ros2_iron_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/node.cpp:132
#26 0x00007fffecd92a56 in __cxa_finalize (d=0x7fffefda9a80) at ./stdlib/cxa_finalize.c:83
#27 0x00007fffef8eec57 in __do_global_dtors_aux () from /root/ros2_iron_ws/install/rclcpp/lib/librclcpp.so
#28 0x00007fffffff8490 in ?? ()
#29 0x00007ffff7fc924e in _dl_fini () at ./elf/dl-fini.c:142

Node::BackportMembers is the type of a static member of rclcpp::Node (backport_members_), so it is destroyed during __cxa_finalize. https://github.com/ros2/rclcpp/blob/c1a01fc08d728bcf06f3632939b1b6c06f1b99a2/rclcpp/include/rclcpp/node.hpp#L1604

When a Node is destroyed, its destructor calls backport_members_.remove(this).
The open question is why the Node test_Delivery was never destroyed.
Investigating further requires understanding the relevant test code first, but I'm not familiar with the rmf_fleet_adapter code. Can anyone provide assistance?
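
To make the lifetime problem above concrete, here is a minimal sketch of the pattern. It is not the rclcpp implementation: `Registry`, `Resource`, `Node` and `run_demo` are hypothetical stand-ins, and a scoped `Registry` object plays the role of the static `backport_members_` so that the release order can be observed. It assumes the registry keeps a `shared_ptr` per node and that a node removes its own entry in its destructor, as the backtrace and the comment above suggest.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical per-node resource (think: the get_type_description service).
struct Resource
{
  Resource(std::string name, Log & log) : name_(std::move(name)), log_(log) {}
  ~Resource() { log_.push_back("released " + name_); }
  std::string name_;
  Log & log_;
};

class Node;

// Hypothetical stand-in for the static Node::BackportMembers object.
class Registry
{
public:
  explicit Registry(Log & log) : log_(log) {}
  // The destructor body runs first; the map (and any entries left in it)
  // is destroyed afterwards, as part of member destruction.
  ~Registry() { log_.push_back("registry teardown"); }
  void add(const Node * key, std::shared_ptr<Resource> r) { map_[key] = std::move(r); }
  void remove(const Node * key) { map_.erase(key); }

private:
  Log & log_;
  std::unordered_map<const Node *, std::shared_ptr<Resource>> map_;
};

class Node
{
public:
  Node(const std::string & name, Registry & registry, Log & log)
  : registry_(registry)
  {
    registry_.add(this, std::make_shared<Resource>(name, log));
  }
  // A properly destroyed node removes its own entry, releasing the resource.
  ~Node() { registry_.remove(this); }

private:
  Registry & registry_;
};

Log run_demo()
{
  Log log;
  {
    Registry registry(log);  // plays the role of the static member
    {
      Node ok("ok_node", registry, log);
    }  // ~Node runs here, so ok_node's resource is released right away
    // Intentionally leaked to mimic a node whose destructor is never called.
    Node * leaked = new Node("leaked_node", registry, log);
    (void)leaked;
    log.push_back("end of main");
  }  // registry destroyed: only now is leaked_node's resource released
  return log;
}
```

Calling `run_demo()` from a `main` and printing the returned log shows the leaked node's resource being released only during registry teardown. In the real process that teardown happens at static destruction time, after `main` has returned.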

MiguelCompany commented 3 weeks ago

@Barry-Xu-2018 @luca-della-vedova Could you check if #770 fixes this?

Barry-Xu-2018 commented 2 weeks ago

@MiguelCompany

Thanks.

> Could you check if https://github.com/ros2/rmw_fastrtps/pull/770 fixes this?

After checking, the error messages below no longer occur.

2: cannot publish data, at ./src/rmw_publish.cpp:62 during '__function__'
2: Fail in delete datareader, at ./src/rmw_service.cpp:100 during '__function__'

In the Test Delivery test, the destructor of Node test_Delivery is still never called. The resources (service, publisher, etc.) of Node test_Delivery are released only after the destructor of Node::BackportMembers is called (note that it is a static variable in rclcpp::Node). So I think there is another issue.

fujitatomoya commented 2 weeks ago

@Barry-Xu-2018

I think this issue is originally generated by https://github.com/ros2/rclcpp/pull/2224.

Maybe https://github.com/ros2/rcl/pull/1112, https://github.com/ros2/rclcpp/pull/2344 and https://github.com/ros2/rclcpp/pull/2351 could address the issue, but I am not 100% sure.

Since https://github.com/ros2/rcl/pull/1112 requires an API change, backporting it is not straightforward.

@MiguelCompany I will try to take a look.

Barry-Xu-2018 commented 2 weeks ago

@fujitatomoya

> I think this issue is originally generated by https://github.com/ros2/rclcpp/pull/2224.

That could be it. The error occurred while releasing service /test_Delivery/get_type_description.