gramaziokohler / roslibpy

Python ROS Bridge library
MIT License
277 stars 57 forks source link

publish stucks in a multiple threading environment. #69

Closed obi-t4 closed 3 years ago

obi-t4 commented 3 years ago

Summary. publish stucks in a multiple threading environment. When use_lock = True (see below example), publish doesn't stuck.

Expected Result

What you expected. subscribe callback should be called without getting stuck.

Actual Result

What happened instead. publish stucks.

Reproduction Steps

import roslibpy
import time
import random

from threading import Thread, Lock

ros = roslibpy.Ros(host='localhost', port=9090)
ros.run()

use_lock = False
lock = Lock()
count = 0

topics = {}

# publish
def publish(payload, topic_name, topic_data_type, latch=False):
    global count

    if topic_name in topics:
        topic = topics[topic_name]
    else:
        topic = roslibpy.Topic(ros, topic_name, topic_data_type, latch=latch)
        topics[topic_name] = topic

    with lock:
        count += 1
        payload["data"]=f"{topic_name}:{count}"

    if use_lock:
        with lock:
            topic.publish(roslibpy.Message(payload))
    else:
        topic.publish(roslibpy.Message(payload))

# subscribe
def subscribe_callback(data):
    print(data)

def subscribe(topic_name, topic_data_type):
    topic = roslibpy.Topic(ros, topic_name, topic_data_type)
    topic.subscribe(subscribe_callback)

def publish_loop(payload, topic_name, topic_data_type, wait_in_sec):
    while True:
        publish(payload, topic_name, topic_data_type)
        time.sleep(wait_in_sec)

t_pub_a = Thread(target=publish_loop, args=({}, '/topic_a', 'std_msgs/String', random.random()/1000))
t_pub_b = Thread(target=publish_loop, args=({}, '/topic_b', 'std_msgs/String', random.random()/1000))
t_pub_c = Thread(target=publish_loop, args=({}, '/topic_c', 'std_msgs/String', random.random()/1000))
t_pub_d = Thread(target=publish_loop, args=({}, '/topic_d', 'std_msgs/String', random.random()/1000))

subscribe('/topic_a', 'std_msgs/String')
subscribe('/topic_b', 'std_msgs/String')
subscribe('/topic_c', 'std_msgs/String')
subscribe('/topic_d', 'std_msgs/String')

t_pub_a.start()
t_pub_b.start()
t_pub_c.start()
t_pub_d.start()

t_pub_d.join()
t_pub_c.join()
t_pub_b.join()
t_pub_a.join()

System Information

Operating System name and version, ROS version, etc. OS: ubuntu18.04 ROS: melodic

beverlylytle commented 3 years ago

Hi! Thanks for submitting an issue!

I've looked into this, and I'm wondering if I've been able to reproduce the problem or not. What I found is the following: First I start up some docker containers running ROS. I run the script you've provided. The expected messages are printed to the console. Next I flip the use_lock flag to True and run the script (without restarting the containers). The expected messages are not printed to the console (although the script is not "stuck" exactly, it does run, but the printed output is not appearing). If I then restart the docker containers, maintain use_lock = True and run the script again, the expected messages appear. Some logs within ROS seem to suggest that something is going wrong with the connection between the client and ROS, and I think it has to do with the publication rate. In fact, if I change wait_in_sec to something larger, I can run the script resulting the in the expected output multiple times in a row with use_lock as either True or False without restarting any containers. But these sorts of things are highly dependant on the machine the script and ROS is running on (https://answers.ros.org/question/318580/what-is-the-max-frequency-for-a-rospublisher-to-publish-message/). Does this help you at all?

gonzalocasas commented 3 years ago

Closing this due to inactivity, but feel free to re-open if you have additional findings/info.