fkie / multimaster_fkie

ROS stack with FKIE packages for multi-robot (discovering, synchronizing and management GUI)
BSD 3-Clause "New" or "Revised" License

zeroconf separate network problem #38

Closed andrejpan closed 8 years ago

andrejpan commented 8 years ago

Hey,

I am using Ubuntu 14.04 and ROS Indigo on an Intel-based PC (hostname atbeetz5). I run roscore and node_manager separately. The faculty IT department updated the firmware on the network switches, and master_discovery is not working anymore. I also have problems with zeroconf discovery, because I cannot get a separate network for my project.

I establish the connection like this: (screenshot: zeroconf1)

But I can immediately see other nodes and topics, which should be in network 0. (screenshot: zeroconf2)

Any idea what I am doing wrong, or could this be a bug?

atiderko commented 8 years ago

Hi @andrejpan,

zeroconf discovery does not support network separation in the current implementation. I will try to add this feature soon. In the meantime, you can try to use master_discovery and add the hosts to discover to the Robot hosts parameter. These hosts are then pinged using unicast communication. I think your multicast communication has been blocked since the firmware update of your switches.

andrejpan commented 8 years ago

Hey, thank you for the information. Did I miss somewhere that zeroconf discovery does not support network separation? It would be nice to disable the network number input, or to show a short message next to it, when the discovery type selection is changed.

I found out that I must put two IP addresses in the Robot hosts parameter (one for the PC that runs node_manager and one for the PC where I want to add a new ROS node).

Over wired LAN master_discovery works, but the wifi settings are a bit broken and apparently I can only wait for a new update.

atiderko commented 8 years ago

Hi,

you are right, it seems to be an undocumented feature.

It's strange that you need to define two addresses. What happens if you put only the address of the remote host?

What about wifi? Is it not configured? Or does it not work with master_discovery?

andrejpan commented 8 years ago

If I only put the address of the remote host, I do not see the remote host in node_manager. Remote host:

pixuser@pixdrone1:~$ rostopic echo /master_discovery/changes 
state: 'changed'
master: 
  name: pixdrone1
  uri: http://pixdrone1:11311/
  timestamp: 1467704163.24
  timestamp_local: 1467704163.24
  online: True
  discoverer_name: /master_discovery
  monitoruri: http://172.24.16.156:11611
---
pixuser@pixdrone1:~$ rostopic echo /master_discovery/linkstats 
header: 
  seq: 1
  stamp: 
    secs: 1467704186
    nsecs: 496687889
  frame_id: pixdrone1
links: 
  - 
    destination: pixdrone1
    quality: 100.0
---

Local host:

pangerca@atbeetz5:~$ rostopic echo /master_discovery/changes 
state: 'changed'
master: 
  name: atbeetz5
  uri: http://atbeetz5:11311/
  timestamp: 1467705390.97
  timestamp_local: 1467705390.97
  online: True
  discoverer_name: /master_discovery
  monitoruri: http://172.24.16.36:11611
---
pangerca@atbeetz5:~$ rostopic echo /master_discovery/linkstats 
header: 
  seq: 129
  stamp: 
    secs: 1467705406
    nsecs: 542614936
  frame_id: atbeetz5
links: 
  - 
    destination: atbeetz5
    quality: 100.0
---

When I put both IP addresses I see this:

pixuser@pixdrone1:~$ rostopic echo /master_discovery/changes 
state: 'changed'
master: 
  name: pixdrone1
  uri: http://pixdrone1:11311/
  timestamp: 1467704299.91
  timestamp_local: 1467704299.91
  online: True
  discoverer_name: /master_discovery
  monitoruri: http://172.24.16.156:11611
---
pixuser@pixdrone1:~$ rostopic echo /master_discovery/linkstats 
header: 
  seq: 1
  stamp: 
    secs: 1467704321
    nsecs: 522756099
  frame_id: pixdrone1
links: 
  - 
    destination: pixdrone1
    quality: 100.0
  - 
    destination: atbeetz5
    quality: 100.0
---

and local host:

pangerca@atbeetz5:~$ rostopic echo /master_discovery/changes 
state: 'changed'
master: 
  name: atbeetz5
  uri: http://atbeetz5:11311/
  timestamp: 1467706420.22
  timestamp_local: 1467706420.22
  online: True
  discoverer_name: /master_discovery
  monitoruri: http://172.24.16.36:11611
---
pangerca@atbeetz5:~$ rostopic echo /master_discovery/linkstats 
header: 
  seq: 1015
  stamp: 
    secs: 1467706294
    nsecs: 578988075
  frame_id: atbeetz5
links: 
  - 
    destination: atbeetz5
    quality: 100.0
---

The network is faculty-based (TUM informatics) and we have a virtual network on this infrastructure. master_discovery apparently worked until they updated the wifi firmware some months ago (that's what people told me). Is there a tool with which I could get all the parameters of the network and see what could be wrong? Otherwise, I tested UDP multicast with these two commands and nothing came through.

iperf -s -u -B 226.0.0.0 -i 1
iperf -c 226.0.0.0 -u -T 32 -t 3 -i 1
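
If iperf is not available on both machines, the same multicast check can be approximated with a short Python script. This is only a minimal sketch, not part of multimaster_fkie: the group 226.0.0.0 and the TTL of 32 are taken from the iperf commands above, while the port 5001 (iperf's default) and the function names are assumptions chosen for illustration.

```python
import socket
import struct
import sys

GROUP = "226.0.0.0"  # multicast group used in the iperf test above
PORT = 5001          # assumption: any free UDP port works (5001 is iperf's default)
TTL = 32             # same TTL as `iperf -T 32`


def membership_request(group, iface="0.0.0.0"):
    """Pack the ip_mreq structure: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def listen(group=GROUP, port=PORT):
    """Join the multicast group and print every datagram that arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    while True:
        data, sender = sock.recvfrom(1024)
        print("received %r from %s" % (data, sender[0]))


def send(message=b"ping", group=GROUP, port=PORT, ttl=TTL):
    """Send one datagram to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    sock.sendto(message, (group, port))
    sock.close()


if __name__ == "__main__":
    # Run with "listen" on one machine and "send" on the other.
    if len(sys.argv) > 1 and sys.argv[1] == "listen":
        listen()
    elif len(sys.argv) > 1 and sys.argv[1] == "send":
        send()
```

If the listener on one host prints nothing while the other host sends, multicast is probably being filtered somewhere between the two machines, which would match the iperf result.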

atiderko commented 8 years ago

Did you start master_discovery as follows: (screenshot: start_discovery)

Can you also add the console output of master_discovery?

In our department we also have problems with multicast in combination with virtual networks. For these cases we added the unicast solution with Robot hosts.

I know of no universal network diagnostic tool. There are many, each for a special case...

atiderko commented 8 years ago

I added the separation functionality to zeroconf. Please try it.

andrejpan commented 8 years ago

Your solution works, thank you! I did not put pixdrone1 in Robot hosts when I was setting up atbeetz5 (localhost). I guess if I had 3 computers, I would need to put the 2 other names in every discovery?

Regarding the new zeroconf functionality, it is also working: I am not seeing other computers anymore. Only in the upper left part it says: ROS Network[id: network number], and the network number is not correct. I have a feeling that it is the previous network number (if it was different from the current one). My previous network number was 99 and now it is 9, but at the top I still see [id: 99].

Here is also the console log from the updated package:

12:41 $ rosrun node_manager_fkie node_manager
[DEBUG] [WallTime: 1467801748.822195] init_node, name[/node_manager], pid[1740]
[DEBUG] [WallTime: 1467801748.828096] binding to 0.0.0.0 0
[DEBUG] [WallTime: 1467801748.832261] bound to 0.0.0.0 36241
[DEBUG] [WallTime: 1467801748.836896] ... service URL is rosrpc://atbeetz5:36241
[DEBUG] [WallTime: 1467801748.841462] [/node_manager/get_loggers]: new Service instance
[DEBUG] [WallTime: 1467801748.850990] ... service URL is rosrpc://atbeetz5:36241
[DEBUG] [WallTime: 1467801748.854655] [/node_manager/set_logger_level]: new Service instance
[INFO] [WallTime: 1467801749.114548] listen for logs on /rosout
[DEBUG] [WallTime: 1467801749.173459] connecting to atbeetz5 36241
[INFO] [WallTime: 1467801749.418955] Start RPC-XML Server at ('0.0.0.0', 22622)
[INFO] [WallTime: 1467801749.423946] Subscribe to parameter `/roslaunch/uris`
[DEBUG] [WallTime: 1467801751.567860] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
Bus::open: Can not get ibus-daemon's address. 
IBusInputContext::createInputContext: no connection to ibus-daemon 
[INFO] [WallTime: 1467801766.947919] Run without config: /usr/bin/screen -c /usr/stud/pangerca/.ros/log/_zeroconf.conf -L -dmS _zeroconf /work/pangerca/catkin_ws/src/multimaster_fkie/master_discovery_fkie/nodes/zeroconf _mcast_port:=11520 _mcast_group:=226.0.0.0 _robot_hosts:=[] _heartbeat_hz:=0.5 __name:=zeroconf
[DEBUG] [WallTime: 1467801767.524386] connecting to atbeetz5 36497
[DEBUG] [WallTime: 1467801768.418972] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[DEBUG] [WallTime: 1467801769.371841] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[DEBUG] [WallTime: 1467801769.388181] service 'list_masters' found on http://atbeetz5:11311/ as /zeroconf/list_masters
[DEBUG] [WallTime: 1467801769.393621] connecting to atbeetz5 36497
[INFO] [WallTime: 1467801769.407075] listen for updates on /zeroconf/changes
[INFO] [WallTime: 1467801769.426477] listen for connection statistics on /zeroconf/linkstats
[DEBUG] [WallTime: 1467801769.444429] connecting to atbeetz5 36497
[DEBUG] [WallTime: 1467801769.447500] connecting to atbeetz5 36497
[DEBUG] [WallTime: 1467801770.295037] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[DEBUG] [WallTime: 1467801771.319526] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[DEBUG] [WallTime: 1467801780.427255] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[INFO] [WallTime: 1467801788.704359] Run remote on pixdrone1: rosrun node_manager_fkie remote_nm.py --package master_discovery_fkie --node_type zeroconf --node_name /zeroconf _mcast_port:=11520 _mcast_group:=226.0.0.0 _robot_hosts:=[] _heartbeat_hz:=0.5 __name:=zeroconf
[INFO] [WallTime: 1467801789.019176] REMOTE execute on pixuser@pixdrone1: rosrun node_manager_fkie remote_nm.py --package master_discovery_fkie --node_type zeroconf --node_name /zeroconf _mcast_port:=11520 _mcast_group:=226.0.0.0 _robot_hosts:=[] _heartbeat_hz:=0.5 __name:=zeroconf
[DEBUG] [WallTime: 1467801794.146949] STDOUT while start 'zeroconf': run on remote host: /usr/bin/screen -c /home/pixuser/.ros/log/_zeroconf.conf -L -dmS _zeroconf   /home/pixuser/catkin_ws/src/multimaster_fkie/master_discovery_fkie/nodes/zeroconf _mcast_port:=11520 _mcast_group:=226.0.0.0 _robot_hosts:=[] _heartbeat_hz:=0.5 __name:=zeroconf

[DEBUG] [WallTime: 1467801797.387653] MASTERINFO from pixdrone1 (http://pixdrone1:11311/) received
[DEBUG] [WallTime: 1467801798.898873] MASTERINFO from pixdrone1 (http://pixdrone1:11311/) received
[INFO] [WallTime: 1467801800.731138] Run without config: /usr/bin/screen -c /usr/stud/pangerca/.ros/log/_master_sync.conf -L -dmS _master_sync /work/pangerca/catkin_ws/src/multimaster_fkie/master_sync_fkie/nodes/master_sync _interface_url:='.' _sync_topics_on_demand:=False _ignore_hosts:=[] _sync_hosts:=[] _ignore_nodes:=[] _sync_nodes:=[] _ignore_topics:=[] _sync_topics:=[] _ignore_services:=[] _sync_services:=[] _sync_remote_nodes:=False __name:=master_sync
[DEBUG] [WallTime: 1467801801.420422] connecting to atbeetz5 35410
[INFO] [WallTime: 1467801801.543449] Run remote on pixdrone1: rosrun node_manager_fkie remote_nm.py --package master_sync_fkie --node_type master_sync --node_name /master_sync _interface_url:='.' _sync_topics_on_demand:=False _ignore_hosts:=[] _sync_hosts:=[] _ignore_nodes:=[] _sync_nodes:=[] _ignore_topics:=[] _sync_topics:=[] _ignore_services:=[] _sync_services:=[] _sync_remote_nodes:=False __name:=master_sync --masteruri http://pixdrone1:11311/
[INFO] [WallTime: 1467801801.548350] REMOTE execute on pixuser@pixdrone1: rosrun node_manager_fkie remote_nm.py --package master_sync_fkie --node_type master_sync --node_name /master_sync _interface_url:='.' _sync_topics_on_demand:=False _ignore_hosts:=[] _sync_hosts:=[] _ignore_nodes:=[] _sync_nodes:=[] _ignore_topics:=[] _sync_topics:=[] _ignore_services:=[] _sync_services:=[] _sync_remote_nodes:=False __name:=master_sync --masteruri http://pixdrone1:11311/
[DEBUG] [WallTime: 1467801802.687191] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
[DEBUG] [WallTime: 1467801803.607610] STDOUT while start 'master_sync': run on remote host: /usr/bin/screen -c /home/pixuser/.ros/log/_master_sync.conf -L -dmS _master_sync   /home/pixuser/catkin_ws/src/multimaster_fkie/master_sync_fkie/nodes/master_sync _interface_url:=. _sync_topics_on_demand:=False _ignore_hosts:=[] _sync_hosts:=[] _ignore_nodes:=[] _sync_nodes:=[] _ignore_topics:=[] _sync_topics:=[] _ignore_services:=[] _sync_services:=[] _sync_remote_nodes:=False __name:=master_sync

[DEBUG] [WallTime: 1467801807.144997] MASTERINFO from pixdrone1 (http://pixdrone1:11311/) received
[DEBUG] [WallTime: 1467801811.930658] MASTERINFO from atbeetz5 (http://atbeetz5:11311/) received
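
A side observation on the log above: the discovery node is started with `_mcast_port:=11520` while the selected network number is 9. That is consistent with the port being derived as the default master_discovery port 11511 plus the network number, although the thread does not confirm this. The sketch below only illustrates that arithmetic under this assumption; the function names are hypothetical and not part of node_manager.

```python
DEFAULT_MCAST_PORT = 11511  # default multicast port of master_discovery


def mcast_port_for(network_id, base=DEFAULT_MCAST_PORT):
    """Assumed mapping: network number -> multicast port."""
    return base + network_id


def network_id_from(mcast_port, base=DEFAULT_MCAST_PORT):
    """Inverse of the assumed mapping: multicast port -> network number."""
    return mcast_port - base


if __name__ == "__main__":
    # Port seen in the log above
    print(network_id_from(11520))
```

Under this assumption, comparing the `_mcast_port` value in the log with the id shown in the GUI gives a quick way to see which of the two is stale.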

atiderko commented 8 years ago

Thanks for reporting the problem with the network number. It should be fixed now.

Exactly, if you have 3 computers, you need to put the other 2 into Robot hosts.
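
The rule confirmed here (every machine lists all the other machines in Robot hosts) can be sketched as a small helper that builds the list per host. This is an illustration only: `robot3` is a made-up third hostname, the function names are hypothetical, and the `rosrun` line assumes master_discovery accepts `_robot_hosts` in the same form that appears in the zeroconf log earlier in this thread.

```python
def robot_hosts_for(hosts):
    """Map each host to the list of all other hosts, preserving order."""
    return {host: [other for other in hosts if other != host] for host in hosts}


def discovery_command(others):
    """Build a start command for one machine.

    Assumption: master_discovery takes _robot_hosts as a private parameter,
    written like the `_robot_hosts:=[]` argument seen in the log.
    """
    return ("rosrun master_discovery_fkie master_discovery "
            "_robot_hosts:=[%s]" % ",".join(others))


if __name__ == "__main__":
    # atbeetz5 and pixdrone1 are from this thread; robot3 is hypothetical.
    hosts = ["atbeetz5", "pixdrone1", "robot3"]
    for host, others in robot_hosts_for(hosts).items():
        print("%s: %s" % (host, discovery_command(others)))
```

With unicast discovery the list has to be maintained on every machine, so generating it from one shared host list avoids forgetting an entry on one side, which is what happened with atbeetz5 above.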