fkie / multimaster_fkie

ROS stack with FKIE packages for multi-robot (discovering, synchronizing and management GUI)
BSD 3-Clause "New" or "Revised" License

Multimaster Discovery Connection Timed Out #193

Closed mericgeren closed 1 year ago

mericgeren commented 1 year ago

Hello,

I am new to ROS and robotics. I am trying to have a Linux PC connect and publish messages to a Linux VM running on Windows 10, using ROS and multimaster. One side runs ROS Melodic, the other Noetic. Both appear to be on the same network. However, when I launch the master, master_sync and master_discovery nodes in unicast mode from a launch file on each device (with the robot_hosts parameter set to the IP of the other computer), I get a connection timed out error on the Linux PC. Could you help me with this issue, please?

Thanks in advance,

Meriç Geren

mericgeren commented 1 year ago

P.S. I have also tried both versions of WSL. On WSL version 1, I managed to make the Linux PC see my host on Windows 10 once. But when I ran rosservice call /master_discovery/list_masters, it showed an error message which reads: ERROR: Service [/master_discovery/list_members] is not available. I get the same message on WSL version 2 too.

atiderko commented 1 year ago

Hi Meriç,

The first thing I would do is look at the log output from master_discovery and see if there are any warnings. Given "ERROR: Service [/master_discovery/list_members] is not available", it does not seem to have started properly.

Apart from that, ROS must first work without multimaster. To test this, set ROS_MASTER_URI to the same host on both machines. Only once talker (from the ROS examples) on one machine and listener on the other are working should you try to set up multimaster.
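Before trying talker and listener, it can help to confirm that each machine can reach the shared master at all. The following is only a minimal sketch, assuming Python 3 on both machines. The helper name `master_reachable`, the `TimeoutTransport` class and the caller id `/reachability_check` are hypothetical names chosen for illustration; `getUri` is a call from the ROS Master XMLRPC API.

```python
import http.client
import os
import xmlrpc.client


class TimeoutTransport(xmlrpc.client.Transport):
    """XMLRPC transport that gives up after `timeout` seconds (illustrative helper)."""

    def __init__(self, timeout):
        super().__init__()
        self._timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        conn.timeout = self._timeout  # applied when the socket is opened
        return conn


def master_reachable(uri, timeout=3.0):
    """Return True if a ROS master answers getUri at `uri` (hypothetical helper)."""
    proxy = xmlrpc.client.ServerProxy(uri, transport=TimeoutTransport(timeout))
    try:
        # The ROS Master API returns (code, statusMessage, masterURI).
        result = proxy.getUri("/reachability_check")
    except (OSError, xmlrpc.client.Error, http.client.HTTPException):
        # Covers refused connections, timeouts, name resolution failures
        # and malformed replies.
        return False
    return len(result) == 3 and result[0] == 1  # code 1 means success


if __name__ == "__main__":
    # Assumption: fall back to the default master port if the variable is unset.
    uri = os.environ.get("ROS_MASTER_URI", "http://localhost:11311")
    print(uri, "reachable" if master_reachable(uri) else "NOT reachable")
```

Running this on both machines with the same ROS_MASTER_URI exported should report the master as reachable on each. If one side fails, hostname resolution or a firewall is a more likely culprit than multimaster itself.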

PS: Feel free to post the output of master_discovery here if you get stuck.

regards Alex

mericgeren commented 1 year ago

Hi Alex,

Here is the output of my launch on the computer using Linux:

Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://pc-linux:45989/

SUMMARY
========

PARAMETERS
 * /master_discovery/listen_mcast: False
 * /master_discovery/robot_hosts: ['192.XXX.XXX.AB']
 * /master_discovery/send_mcast: False
 * /master_sync/robot_hosts: ['192.XXX.XXX.AB']
 * /rosdistro: melodic
 * /rosversion: 1.14.13

NODES
  /
    master_discovery (fkie_master_discovery/master_discovery)
    master_sync (fkie_master_sync/master_sync)

ROS_MASTER_URI=http://pc-linux:11311

process[master_discovery-1]: started with pid [6192]
process[master_sync-2]: started with pid [6193]
[WARN] [1685098823.919693]: Send multicast is disabled.
[WARN] [1685098823.920413]: Listen to multicast is disabled.
[WARN] [1685098823.936876]: Multicast disabled! This master is only by unicast reachable!
[WARN] [1685098879.123173]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out
[WARN] [1685098894.136265]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out
[WARN] [1685098909.155929]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out
[WARN] [1685098924.168734]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out
[WARN] [1685098939.186349]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out
[WARN] [1685098954.202926]: can't retrieve connection information using XMLRPC from [http://192.XXX.XXX.AB:11611], socket error: timed out

This is the output of the same launch on the computer using WSL:

Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://laptop-wsl:50889/

SUMMARY
========

PARAMETERS
 * /master_discovery/listen_mcast: False
 * /master_discovery/robot_hosts: ['192.XXX.XXX.CD']
 * /master_discovery/send_mcast: False
 * /master_sync/robot_hosts: ['192.XXX.XXX.CD']
 * /rosdistro: noetic
 * /rosversion: 1.16.0

NODES
  /
    master_discovery (fkie_master_discovery/master_discovery)
    master_sync (fkie_master_sync/master_sync)

ROS_MASTER_URI=http://laptop-wsl:11311

process[master_discovery-1]: started with pid [6192]
process[master_sync-2]: started with pid [6193]
[WARN] [1685098823.919693]: Send multicast is disabled.
[WARN] [1685098823.920413]: Listen to multicast is disabled.
[WARN] [1685098823.936876]: Multicast disabled! This master is only by unicast reachable!

This is the output of rosservice call /master_discovery/list_masters on pc-linux:


masters: 
  - 
    name: "pc-linux"
    uri: "http://pc-linux:11311/"
    last_change: 
      secs: 1685098137
      nsecs: 729595661
    last_change_local: 
      secs: 1685098137
      nsecs: 729595661
    online: True
    discoverer_name: "/master_discovery"
    monitoruri: "http://localhost:11611"

As you can see, it still can't see the master on laptop-wsl. (I am currently using WSL version 1.) Finally, this is the output of rosservice call /master_discovery/list_masters on laptop-wsl:

ERROR: Service [/master_discovery/list_masters] is not available.

Thank you for all the help you have offered,

Meriç

atiderko commented 1 year ago

It seems that the laptop-wsl blocks all incoming connections. This prevents the master on the pc-linux from fetching connection information.
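One way to check this from pc-linux is to probe the relevant TCP ports directly. This is only a rough sketch, assuming Python 3; the helper name `port_open` is hypothetical, the address is read from the command line, and the ports are the ones that appear in the logs above (11311 for the ROS master, 11611 for the master_discovery XMLRPC monitor).

```python
import socket
import sys


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False


if __name__ == "__main__":
    # Assumption: pass the IP that is set in robot_hosts as the first argument.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    for port in (11311, 11611):
        state = "open" if port_open(host, port) else "blocked or closed"
        print("%s:%d is %s" % (host, port, state))
```

If 11611 shows as blocked while master_discovery is running on the remote side, the timeouts in the log are most likely caused by a firewall or by the WSL network setup rather than by the multimaster configuration.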

The other problem is the service call. Maybe it has something to do with the first problem. Have you tried calling other services from other ROS nodes?

mericgeren commented 1 year ago

I have tried to call other services. This is the output when I call /rosout/get_loggers:

loggers:
  -
    name: "ros"
    level: "info"
  -
    name: "ros.roscpp"
    level: "info"
  -
    name: "ros.roscpp.roscpp_internal"
    level: "info"
  -
    name: "ros.roscpp.roscpp_internal.connections"
    level: "info"
  -
    name: "ros.roscpp.superdebug"
    level: "warn"
  -
    name: "ros.rosout"
    level: "info"

This is the output when I run rosservice call /master_discovery/list_masters after waiting for a while:

Traceback (most recent call last):
  File "/opt/ros/noetic/bin/rosservice", line 35, in <module>
    rosservice.rosservicemain()
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 767, in rosservicemain
    _rosservice_cmd_call(argv)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 614, in _rosservice_cmd_call
    service_class = get_service_class_by_name(service_name)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 373, in get_service_class_by_name
    service_type = get_service_type(service_name)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 147, in get_service_type
    return get_service_headers(service_name, service_uri).get('type', None)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 124, in get_service_headers
    return rosgraph.network.read_ros_handshake_header(s, BufferType(), 2048)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosgraph/network.py", line 359, in read_ros_handshake_header
    raise ROSHandshakeException("connection from sender terminated before handshake header received. %s bytes were received. Please check sender for additional details."%b.tell())
rosgraph.network.ROSHandshakeException: connection from sender terminated before handshake header received. 0 bytes were received. Please check sender for additional details.

This is the result of calling /master_discovery/get_loggers:

Traceback (most recent call last):
  File "/opt/ros/noetic/bin/rosservice", line 35, in <module>
    rosservice.rosservicemain()
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 767, in rosservicemain
    _rosservice_cmd_call(argv)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 614, in _rosservice_cmd_call
    service_class = get_service_class_by_name(service_name)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 373, in get_service_class_by_name
    service_type = get_service_type(service_name)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 147, in get_service_type
    return get_service_headers(service_name, service_uri).get('type', None)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosservice/__init__.py", line 124, in get_service_headers
    return rosgraph.network.read_ros_handshake_header(s, BufferType(), 2048)
  File "/opt/ros/noetic/lib/python3/dist-packages/rosgraph/network.py", line 359, in read_ros_handshake_header
    raise ROSHandshakeException("connection from sender terminated before handshake header received. %s bytes were received. Please check sender for additional details."%b.tell())
rosgraph.network.ROSHandshakeException: connection from sender terminated before handshake header received. 0 bytes were received. Please check sender for additional details.

mericgeren commented 1 year ago

Thanks for all the help you have offered. I am closing this issue.