ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0
562 stars 425 forks source link

Posible bug: `spin_some` only processes one message per topic even if multiple messages are in the queue #2655

Open RaphLins opened 1 month ago

RaphLins commented 1 month ago

Bug report

Required Info:

Steps to reproduce issue

A topic is being published at a high frequency while a subscribed node calls rclcpp::spin_some() at a lower frequency.

#include <rclcpp/node.hpp>
#include <rclcpp/executors.hpp>
#include <std_msgs/msg/string.hpp>

int main(int argc, char *argv[]) {
  rclcpp::init(argc, argv);

  rclcpp::Node::SharedPtr pub_node = std::make_shared<rclcpp::Node>("publisher");
  auto pub = pub_node->create_publisher<std_msgs::msg::String>("topic", 100);
  int count = 0;
  auto timer_pub = pub_node->create_wall_timer(std::chrono::milliseconds(100), [&]() -> void {
    std_msgs::msg::String msg;
    msg.data = std::to_string(count);
    count++;
    pub->publish(msg);
    std::cout << "Published: " << msg.data << std::endl;
  });
  std::thread pub_node_thread([&]() {
    rclcpp::spin(pub_node);
  });

  rclcpp::Node::SharedPtr sub_node = std::make_shared<rclcpp::Node>("subscriber");
  auto sub =
      sub_node->create_subscription<std_msgs::msg::String>("topic", 100, [](std_msgs::msg::String::ConstSharedPtr msg) {
        std::cout << "Received: " << msg->data << std::endl;
      });

  while (rclcpp::ok()) {
    std::cout << "Running spin_some" << std::endl;
    rclcpp::spin_some(sub_node);
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }

  pub_node_thread.join();
  rclcpp::shutdown();

  return 0;
}

Expected behavior

According to its description, rclcpp::spin_some should execute "any immediately available work", so I would expect it to process all the available messages in the subscriber queue.

Running spin_some
Published: 0
Published: 1
Published: 2
Published: 3
Published: 4
Published: 5
Published: 6
Published: 7
Published: 8
Published: 9
Running spin_some
Received: 0
Received: 1
Received: 2
Received: 3
Received: 4
Received: 5
Received: 6
Received: 7
Received: 8
Received: 9
Published: 10
Published: 11
...

Actual behavior

It only processes one message in the queue at a time, causing it to accumulate.

Running spin_some
Published: 0
Published: 1
Published: 2
Published: 3
Published: 4
Published: 5
Published: 6
Published: 7
Published: 8
Published: 9
Running spin_some
Received: 0
Published: 10
Published: 11
...

Additional information


fujitatomoya commented 1 month ago

@RaphLins thanks for creating issue. it looks like this is not expected behavior. I would like to ask you to check if the problem stays (the same behavior) with jazzy and humble as well. that would be useful information if some changes in rolling (or jazzy) if the problem is not observed in humble.

RaphLins commented 1 month ago

@fujitatomoya Thank you for the feedback.

that would be useful information if some changes in rolling (or jazzy) if the problem is not observed in humble.

I've checked and can confirm that the problem is also observed in both the latest humble (v16.0.10) and jazzy.

Barry-Xu-2018 commented 1 month ago

I found Iron version has the issue, and it's unrelated to the DDS type (Cyclone DDS).

After checking code, I found

Executor::get_next_ready_executable() call waitresult->next_ready_subscription(). If the subscription has messages, it will generate an any_executable. It doesn't matter how many messages there are to receive for that subscription.

https://github.com/ros2/rclcpp/blob/f12e3c69dcc5823b474d43ae1d7bf3c596154cbd/rclcpp/src/rclcpp/executor.cpp#L800-L813

https://github.com/ros2/rclcpp/blob/f12e3c69dcc5823b474d43ae1d7bf3c596154cbd/rclcpp/include/rclcpp/wait_result.hpp#L208-L225

Executor::execute_any_executable() call Executor::execute_subscription(). In the processing code, it will only retrieve one message.
spin_some ensures that different subscriptions, clients, services, etc., with data will each be processed once, but only one data will be processed for the same target.

This is my understanding; please correct me if I'm wrong.

fujitatomoya commented 3 weeks ago

although this is not really user expectation described in this issue report header, this is designed behavior as @Barry-Xu-2018 explains above. rclcpp::spin_some only waits on once to collect executables, and executes them once then exits. (it does not have any idea how many data are available from this wait.) i see this is not ideal for user expectation, but for supporting we probably need new rmw interface to get the information how many data are available on wait. right now we do not have bandwidth for this, so i will move this to backlog for now.

@RaphLins can you use rclcpp::spin_all instead? i think that is what you are looking for.

RaphLins commented 2 weeks ago

@fujitatomoya Yes, I believe that rclcpp::spin_all would work for me (or rather rclcpp::Executor::spin_all() before jazzy). Thank you; I was not aware of it.

In my opinion it would be worth considering an update of the description of spin_some since I assume that other people might be using it incorectly as I did and are experiencing delayed callbacks (corresponding to the subscriber's history depth) without realizing it. Although this is only an issue if the publisher's frequency is higher than the subscriber's loop.