ros2 / rclpy

rclpy (ROS Client Library for Python)
Apache License 2.0
304 stars 225 forks source link

count_publishers has inconsistent behavior #1046

Open skemp117 opened 1 year ago

skemp117 commented 1 year ago

Bug report

When trying to use the count_publishers function, it will occasionally return 0 subscribers when the cli commands ros2 topic list never fails to report the topics exist. I am using this method to determine if topics are being published correctly, but it returns false negatives ~50% of the time. Restarting the publishers sometimes seems to affect it, but most times it just needs to be re run a few times to work. Sorry, I am relatively new to ROS. Maybe if I rewrote this to check the list of topic names and see if any topics match what I am searching for it would work better, but I would still like to know why the count_publishers does not work as I'd expect.

Required Info:

Steps to reproduce issue

Just running the following node from a launch file.

`#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
from kromek_sigma50_interface.msg import Channels 
from vicon_receiver.msg import Position
from scipy import io
import numpy as np
from ament_index_python.packages import get_package_share_directory
from datetime import datetime
import os
import rosidl_runtime_py as ridl

class MinimalSubscriber(Node):
    def __init__(self):
        super().__init__('minimal_subscriber')
        self.done = False

        self.count = 1
        self.vicon_received = False

        # Setup parameters
        self.declare_parameter('final_count')
        self.declare_parameter('vicon_topic')
        self.declare_parameter('kromek_topic')
        self.declare_parameter('save_dir')

        self.final_count = self.get_parameter('final_count').value
        self.vicon_topic = "%s" % self.get_parameter('vicon_topic').value
        self.kromek_topic = "%s" % self.get_parameter('kromek_topic').value

        self.save_dir = '/home/megaton/megaton_ws/data/'+self.get_parameter('save_dir').value+'/'
        if not os.path.exists(self.save_dir):
            os.makedirs(self.save_dir)
        i = 1
        while os.path.exists(self.save_dir+"sample%s.mat" % i):
            i += 1
        self.filename = self.save_dir+"sample%s.mat" % i

        if self.count_publishers(self.vicon_topic) < 1:
            self.get_logger().info("\u274C NO VICON TOPIC FOUND!") 
            self.done = True
            return
        else:
            self.get_logger().info("\u2705 Vicon topic found...")

        if self.count_publishers(self.kromek_topic) < 1:
            self.get_logger().info("\u274C NO KROMEK TOPIC FOUND!") 
            self.done = True
            return
        else:
            self.get_logger().info("\u2705 Kromek topic found...")

        self.subscription1 = self.create_subscription(
            Channels,
            self.kromek_topic,
            self.kromek_callback,
            10)

        self.subscription2 = self.create_subscription(
            Position,
            self.vicon_topic,
            self.vicon_callback,
            1)

    def kromek_callback(self, msg):
        if msg.channels:
            self.get_logger().info("\u2705 Kromek %d.%s" % (msg.secs,"{0}".format(msg.nanosecs)[-9:]))
        else:
            self.get_logger().info("\u274C NO COUNT DATA: %d.%s" % (msg.secs,"{0}".format(msg.nanosecs)[-9:]))
        mdict = dict(ridl.message_to_ordereddict(msg))
        mdict['c_len'] = len(msg.channels)
        self.append_mat(mdict)
        self.count += 1
        if self.count > self.final_count:
            if not self.vicon_received:
                self.get_logger().info("\u274C NO VICON DATA!")
            self.done = True

    def vicon_callback(self, msg):
        self.get_logger().info("\u2705 Vicon x:%s y:%s z:%s" % (msg.x_trans, msg.y_trans, msg.z_trans))
        mdict = dict(ridl.message_to_ordereddict(msg))
        self.append_mat(mdict)
        self.vicon_received = True
        self.subscription2.destroy()

    def append_mat(self,mdict):
        if not os.path.exists(self.filename):
            io.savemat(self.filename,mdict)
        else:
            self.upload = io.loadmat(self.filename)
            for i in mdict.keys():
                try:
                    self.upload[i] = np.append(self.upload[i],mdict[i])
                except:
                    self.upload[i] = mdict[i]
            io.savemat(self.filename,self.upload)

def main(args=None):
    rclpy.init(args=args)

    minimal_subscriber = MinimalSubscriber()
    while not minimal_subscriber.done:
        rclpy.spin_once(minimal_subscriber)

    print("\a")

    # Destroy the node explicitly
    # (optional - otherwise it will be done automatically
    # when the garbage collector destroys the node object)
    minimal_subscriber.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()`

Expected behavior

count_publishers correctly reports the number of active publishers

Actual behavior

count_publishers returns < 1 when there are active publishers validated by ros2 topic list.

fujitatomoya commented 1 year ago

@skemp117 can you provide minimal reproducible example?

i cannot reproduce the issue with following sample.

### start count publishers and subscriber sample above
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclpy rclpy_1046 
[INFO] [1670014902.286522992] [testnode]: Publisher Count: "0"
[INFO] [1670014902.286877266] [testnode]: Subscriber Count: "0"
[INFO] [1670014902.777019518] [testnode]: Publisher Count: "0"

### start / stop demo talker and listener
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run demo_nodes_cpp talker
[INFO] [1670014916.360024558] [talker]: Publishing: 'Hello World: 1'
...<snip>

root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run demo_nodes_cpp listener
[INFO] [1670014907.781943546] [listener]: I heard: [Hello World: 3]
[INFO] [1670014908.781921019] [listener]: I heard: [Hello World: 4]
...<snip>
skemp117 commented 1 year ago

@fujitatomoya Thanks so much for replying! Here is the example. I did not have the timer callback, but that does not seem to make a difference. I am calling this from a launch file. Could that be the issue? I added the while loop and the counter which makes it more consistent, but not 100%

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node

class MinimalSubscriber(Node):
    def __init__(self):
        super().__init__('minimal_subscriber')
        self.done = False

        # Setup parameters
        self.declare_parameter('vicon_topic','/chatter')
        self.vicon_topic = "%s" % self.get_parameter('vicon_topic').value

        i = 1
        while self.count_publishers(self.vicon_topic) < 1:
            i+=1
            if i > 50000:
                self.get_logger().info("\u274C NO VICON TOPIC FOUND!") 
                self.done = True
                return
        self.get_logger().info("\u2705 Vicon topic found...")

def main(args=None):
    rclpy.init(args=args)

    minimal_subscriber = MinimalSubscriber()
    while not minimal_subscriber.done:
        rclpy.spin_once(minimal_subscriber)

    print("\a")

    # Destroy the node explicitly
    # (optional - otherwise it will be done automatically
    # when the garbage collector destroys the node object)
    minimal_subscriber.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()
llapx commented 1 year ago

I have do some modification to your code, but still can not reproduce the issue, can you recheck it ?

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node

class MinimalSubscriber(Node):
    def __init__(self):
        super().__init__('minimal_subscriber')
        self.done = False

        # Setup parameters
        self.declare_parameter('vicon_topic','/chatter')
        self.vicon_topic = "%s" % self.get_parameter('vicon_topic').value

        i = 1
        while rclpy.ok():
            i += 1
            if i % 5000 == 0:
                if self.count_publishers(self.vicon_topic) < 1:
                    self.get_logger().info("\u274C NO VICON TOPIC FOUND!") 
                    self.done = True
                    return
                else:
                    self.get_logger().info("\u2705 Vicon topic found...")

def main(args=None):
    rclpy.init(args=args)

    minimal_subscriber = MinimalSubscriber()
    while not minimal_subscriber.done:
        rclpy.spin_once(minimal_subscriber)

    print("\a")

    # Destroy the node explicitly
    # (optional - otherwise it will be done automatically
    # when the garbage collector destroys the node object)
    minimal_subscriber.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()
fujitatomoya commented 1 year ago

I am calling this from a launch file. Could that be the issue?

No i do not think so. the executable process space will be the same, it does not matter if it is executed via ros2 run or launch.

I added the while loop and the counter which makes it more consistent, but not 100%

as @llapx mentioned previously, I still cannot reproduce the issue either.

skemp117 commented 1 year ago

Thanks for the help everyone. I have changed the code to use a subscriber callback to set a flag when the first message is found instead of counting the subscribers. This works more reliably in my case, although I can understand why it might not work for others.

I'm not sure why you cannot reproduce the code, maybe it has something to do with my networking setup specifically, but I do not have time to debug any further. Thank you again for the help.