fkie / multimaster_fkie

ROS stack with FKIE packages for multi-robot (discovering, synchronizing and management GUI)
BSD 3-Clause "New" or "Revised" License

Master_discovery node appear not to running #130

Closed (gzaidner closed 4 years ago)

gzaidner commented 4 years ago

Hey, I'm using ROS Melodic on Ubuntu 18.04 with fkie 0.8. I have it installed and running on one PC with no problems. However, on the other one I get this message:

[WARN] [1592421806.360600]: Master_discovery node appear not to running. Wait for topic with type 'MasterState.
[WARN] [1592421807.423181]: Master_discovery node appear not to running. Wait for topic with type 'MasterState.

I set my ROS_MASTER_URI to 192.168.1.78. This is the hosts file:

127.0.0.1   localhost
127.0.1.1   vaultbot-control
192.168.1.74   comp2
192.168.1.78   vaultbot-control
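
The hosts file above lists vaultbot-control twice, once for 127.0.1.1 and once for 192.168.1.78. Whether that duplication is related to the warning is not established in this thread, but it is easy to check for. A minimal sketch in Python that reports hostnames mapped to more than one address; the function name `duplicate_hostnames` is made up for illustration:

```python
from collections import defaultdict

# Copy of the hosts file posted above.
HOSTS_TEXT = """\
127.0.0.1   localhost
127.0.1.1   vaultbot-control
192.168.1.74   comp2
192.168.1.78   vaultbot-control
"""


def duplicate_hostnames(hosts_text):
    """Return {hostname: [addresses]} for names listed with more than one address."""
    seen = defaultdict(list)
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            if address not in seen[name]:
                seen[name].append(address)
    return {name: addrs for name, addrs in seen.items() if len(addrs) > 1}


print(duplicate_hostnames(HOSTS_TEXT))
# prints {'vaultbot-control': ['127.0.1.1', '192.168.1.78']}
```

To check the real file, pass the contents of /etc/hosts instead of HOSTS_TEXT.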

This is the rosrun command:

rosrun master_discovery_fkie master_discovery _mcast_group:=224.0.0.1

The output is:

[DEBUG] [1592422243.229692]: init_node, name[/master_discovery], pid[30839]
[DEBUG] [1592422243.230400]: binding to 0.0.0.0 0
[DEBUG] [1592422243.230962]: bound to 0.0.0.0 35625
[DEBUG] [1592422243.231642]: ... service URL is rosrpc://192.168.1.78:35625
[DEBUG] [1592422243.232217]: [/master_discovery/get_loggers]: new Service instance
[DEBUG] [1592422243.233466]: ... service URL is rosrpc://192.168.1.78:35625
[DEBUG] [1592422243.234021]: [/master_discovery/set_logger_level]: new Service instance
[INFO] [1592422243.238744]: ROS Master URI: http://192.168.1.78:11311
[INFO] [1592422243.245557]: Robot hosts: []
[INFO] [1592422243.246093]: Approx. mininum avg. network load: 1.36 bytes/s
[INFO] [1592422243.249708]: Start RPC-XML Server at ('0.0.0.0', 11611)
[INFO] [1592422243.250561]: Subscribe to parameter /roslaunch/uris
[INFO] [1592422243.255989]: + Bind to specified unicast socket @(192.168.1.78:11511)
[DEBUG] [1592422243.256940]: Ucast bind to: (192.168.1.78:11511)
[DEBUG] [1592422243.257437]: mgroup: 224.0.0.1
[DEBUG] [1592422243.257920]: interface : 192.168.1.78
[INFO] [1592422243.258887]: Create multicast socket at ('224.0.0.1', 11511)
[DEBUG] [1592422243.260384]: node[/master_discovery, http://192.168.1.78:38187/] entering spin(), pid[30839]
[DEBUG] [1592422243.398935]: MasterMonitor[/r2f_gripper]: can't get PID: [Errno 111] Connection refused
[DEBUG] [1592422243.412699]: Send current state to group 224.0.0.1:11511
[DEBUG] [1592422243.428641]: Received a NEW heartbeat from ('localhost', 11511) via MULTICAST socket
[INFO] [1592422243.429600]: Detected master discovery: http://localhost:11611
[DEBUG] [1592422243.530657]: Get additional connection info from http://localhost:11611
[DEBUG] [1592422243.533368]: Got [1592422243.360436916, http://vaultbot-control:11311/, vaultbot-control, /master_discovery] from http://192.168.1.78:11611/
[INFO] [1592422243.534401]: Added master with ROS_MASTER_URI=http://vaultbot-control:11311/
[DEBUG] [1592422243.535425]: ... service URL is rosrpc://192.168.1.78:35625
[DEBUG] [1592422243.536368]: [/master_discovery/list_masters]: new Service instance
[DEBUG] [1592422243.538509]: ... service URL is rosrpc://192.168.1.78:35625
[DEBUG] [1592422243.539401]: [/master_discovery/refresh]: new Service instance
[DEBUG] [1592422244.261251]: Send current state to group 224.0.0.1:11511
[DEBUG] [1592422244.263012]: Send requests while init 1/3
[DEBUG] [1592422244.264349]: Send request to mcast group 224.0.0.1:11511
[DEBUG] [1592422244.265523]: Set timer to send heartbeat in 50.00 sec
[DEBUG] [1592422244.294995]: Received a heartbeat from ('localhost', 11511) via MULTICAST socket
[DEBUG] [1592422244.296211]: Received a multicast request for a state update from localhost
[DEBUG] [1592422244.297223]: Send current state to group 224.0.0.1:11511
[DEBUG] [1592422244.305525]: Received a heartbeat from ('localhost', 11511) via MULTICAST socket
[DEBUG] [1592422244.421008]: Received a NEW heartbeat from ('192.168.1.74', 11511) via MULTICAST socket
[INFO] [1592422244.426397]: Detected master discovery: http://192.168.1.74:11611
[DEBUG] [1592422244.447618]: Send current state to group 224.0.0.1:11511
[DEBUG] [1592422244.498610]: Received a heartbeat from ('localhost', 11511) via MULTICAST socket
[DEBUG] [1592422244.528408]: Get additional connection info from http://192.168.1.74:11611
[DEBUG] [1592422244.532395]: Got [1592421893.200793028, http://192.168.1.74:11311/, 192.168.1.74, /master_discovery] from http://192.168.1.74:11611/
[INFO] [1592422244.533552]: Added master with ROS_MASTER_URI=http://192.168.1.74:11311/
[DEBUG] [1592422247.368242]: Received a heartbeat from ('192.168.1.74', 11511) via MULTICAST socket

rosrun master_sync_fkie master_sync

[INFO] [1592422701.417905]: ignore_hosts: []
[INFO] [1592422701.418821]: sync_hosts: []
[INFO] [1592422701.419678]: sync_topics_on_demand: False
[INFO] [1592422701.420509]: resync_on_reconnect: True
[INFO] [1592422701.421363]: resync_on_reconnect_timeout: 0
[WARN] [1592422701.427476]: Master_discovery node appear not to running. Wait for topic with type 'MasterState.
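
master_sync says it is waiting for a topic of type MasterState. One way to see whether the local ROS master knows of such a topic is to ask it directly through the Master API call getPublishedTopics. This is only a sketch: it matches on the type name alone and does not reproduce whatever host check master_sync performs. The helper names and the caller ID are made up for illustration.

```python
import xmlrpc.client


def connect(master_uri):
    """Create an XML-RPC proxy for a ROS master; no request is sent yet."""
    return xmlrpc.client.ServerProxy(master_uri)


def master_state_topics(master, caller_id="/topic_check"):
    """Return names of published topics whose type ends in '/MasterState'.

    `master` is any object exposing the Master API method
    getPublishedTopics(caller_id, subgraph), such as the proxy from connect().
    """
    # An empty subgraph string asks for all published topics.
    code, status, topics = master.getPublishedTopics(caller_id, "")
    if code != 1:  # 1 is the success code in the ROS Master API
        raise RuntimeError("getPublishedTopics failed: %s" % status)
    return [name for name, type_name in topics
            if type_name.endswith("/MasterState")]
```

For example, `master_state_topics(connect("http://192.168.1.78:11311"))` queries the master from the log above. An empty list would mean no publisher of that type is registered with that master.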

gzaidner commented 4 years ago

I also noticed that it adds the master with the computer name instead of the IP address, even though the IP address is set in the .bashrc file.

atiderko commented 4 years ago

I have heard of such a problem but have never been able to reproduce it. I've added more details to the warning. It would be great if you could take the current version from GitHub and try it out to find the cause of the problem. The version from GitHub also has a 'check_host' parameter for 'master_sync' that switches off the host check: <param name="check_host" value="false" />

Note: By default, ROS resolves the address to a hostname; see http://wiki.ros.org/ROS/NetworkSetup#Name_resolution. master_discovery uses getUri() of http://wiki.ros.org/ROS/Master_API to get the current masteruri/address.
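
getUri() is a plain XML-RPC method on the ROS master, so the value it returns can be inspected by hand. A minimal sketch in Python, assuming the master is reachable; the helper names `connect` and `master_uri_host` and the caller ID are made up for illustration:

```python
import xmlrpc.client
from urllib.parse import urlparse


def connect(master_uri):
    """Create an XML-RPC proxy for a ROS master; no request is sent yet."""
    return xmlrpc.client.ServerProxy(master_uri)


def master_uri_host(master, caller_id="/uri_check"):
    """Ask the master for its URI and return (uri, host part of that URI).

    `master` is any object exposing the Master API method getUri(caller_id),
    such as the proxy returned by connect().
    """
    code, status, uri = master.getUri(caller_id)
    if code != 1:  # 1 is the success code in the ROS Master API
        raise RuntimeError("getUri failed: %s" % status)
    return uri, urlparse(uri).hostname
```

For example, `master_uri_host(connect("http://192.168.1.78:11311"))` asks the master from this issue. A host part that comes back as a name rather than an IP address would be consistent with the "Added master with ROS_MASTER_URI=http://vaultbot-control:11311/" line in the log.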