ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0
560 stars 423 forks source link

test_publisher run_event_handler test fails with CycloneDDS #1475

Closed sloretz closed 3 years ago

sloretz commented 3 years ago

Bug report

Required Info:

Steps to reproduce issue

RMW_IMPLEMENTATION=rmw_cyclonedds_cpp build/rclcpp/test/rclcpp/test_publisher

Expected behavior

All tests should pass

Actual behavior

[ RUN      ] TestPublisher.run_event_handlers
1606261998.534775 [0] test_publi: using network interface enp3s0 (udp/192.168.1.150) selected arbitrarily from: enp3s0, docker0
[WARN] [1606261998.541156630] [ns.my_node]: New subscription discovered on topic '/ns/topic', requesting incompatible QoS. No messages will be sent to it. Last incompatible policy: INVALID_QOS_POLICY
/home/sloretz/bigssd/ros2/src/ros2/rclcpp/rclcpp/test/rclcpp/test_publisher.cpp:509: Failure
Expected: handler->execute(data) throws an exception of type std::runtime_error.
  Actual: it throws nothing.
[  FAILED  ] TestPublisher.run_event_handlers (11 ms)

Additional information

The test started failing since ros2/rclcpp#1241, and reverting to the commit before resolves the test failure, but it's not clear to me if that PR introduced a bug or exposed an existing bug. Before the PR the test seemed to run the event handlers execute() method without any expectations. That PR changed the test to expect no events on a newly created publisher. When run with rmw_cyclonedds_cpp there is an event.

Stepping through rcl_take_event() shows RMW_EVENT_OFFERED_QOS_INCOMPATIBLE get's returned. Unfortunately I can't trace further. gdb refuses to step inside dds_get_offered_incompatible_qos_status(...). It appears to be defined by a macro STATUS_CB_IMPL in dds_writer.c.

It does not appear related to test_publisher.cpp reusing the same topic name between tests. The same error happens when the topic name is changed to something that does not exist elsewhere

[ RUN      ] TestPublisher.run_event_handlers
1606262628.746126 [0] test_publi: using network interface enp3s0 (udp/192.168.1.150) selected arbitrarily from: enp3s0, docker0
[WARN] [1606262628.752262631] [ns.my_node]: New subscription discovered on topic '/ns/topic_does_not_exist', requesting incompatible QoS. No messages will be sent to it. Last incompatible policy: INVALID_QOS_POLICY
/home/sloretz/bigssd/ros2/src/ros2/rclcpp/rclcpp/test/rclcpp/test_publisher.cpp:509: Failure
Expected: handler->execute(data) throws an exception of type std::runtime_error.
  Actual: it throws nothing.
[  FAILED  ] TestPublisher.run_event_handlers (11 ms)
sloretz commented 3 years ago

@ivanpauno Mind taking a look since you're familiar with ros2/rclcpp#1241 and also a maintainer of rmw_cyclonedds?

alsora commented 3 years ago

The same problem happens to me when running a pub-sub single process application that uses intra-process communication. I can reproduce with both cyclonedds and fastdds. Disabling IPC makes the error disappear.

From the backtrace

terminate called after throwing an instance of 'std::runtime_error'
  what():  'data' is empty

Thread 13 "irobot-benchmar" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe5a74700 (LWP 12231)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff6fed859 in __GI_abort () at abort.c:79
#2  0x00007ffff73e5951 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff73f147c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff73f14e7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff73f1799 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7e896fb in rclcpp::experimental::SubscriptionIntraProcess<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> >, std::allocator<void>, std::default_delete<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> > >, rcl_interfaces::msg::ParameterEvent_<std::allocator<void> > >::execute_impl<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> > > (this=0x5555558ac580, data=std::shared_ptr<void> (empty) = {...}) at /root/ros2_ws/src/ros2/rclcpp/rclcpp/include/rclcpp/experimental/subscription_intra_process.hpp:188
#7  0x00007ffff7e844d5 in rclcpp::experimental::SubscriptionIntraProcess<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> >, std::allocator<void>, std::default_delete<rcl_interfaces::msg::ParameterEvent_<std::allocator<void> > >, rcl_interfaces::msg::ParameterEvent_<std::allocator<void> > >::execute (this=0x5555558ac580, 
    data=std::shared_ptr<void> (empty) = {...}) at /root/ros2_ws/src/ros2/rclcpp/rclcpp/include/rclcpp/experimental/subscription_intra_process.hpp:144
#8  0x00007ffff7d4065e in rclcpp::executors::StaticSingleThreadedExecutor::execute_ready_executables (this=0x5555556b83c0)
ivanpauno commented 3 years ago

@alsora can you share a minimal reproducible example for that issue?

Though both have been probably introduced in the same PR, I don't think they are exactly the same issue.