fkie / multimaster_fkie

ROS stack with FKIE packages for multi-robot (discovering, synchronizing and management GUI)
BSD 3-Clause "New" or "Revised" License

Hostname resolution fails using zeroconf #99

Closed stertingen closed 5 years ago

stertingen commented 5 years ago

Problem

Using the zeroconf discovery, SyncThread on local host hostfoo throws errors about being unable to resolve the hostname of a remote host hostbar:

[INFO][rosout]: hostbar is now online
[INFO][rosout]: SyncThread[hostfoo] Requesting remote state from 'http://hostbar:11911/'
[ERROR][rosout]: SyncThread[hostfoo] ERROR: Traceback (most recent call last):
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/master_sync_fkie/sync_thread.py", line 264, in _request_remote_state
    remote_state = remote_monitor.masterInfo()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1311, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1459, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 1053, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 897, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 859, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 836, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 557, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno -2] Name or service not known

Notes:

Debugging

My /etc/nsswitch.conf (relevant line):

hosts:          files mdns [NOTFOUND=return] dns

The parameters for the zeroconf node are default; the zeroconf nodes discover each other perfectly.

Manual hostname resolution fails with:

host hostbar
host hostbar.local
avahi-resolve -n hostbar
getent hosts hostbar

Manual hostname resolution works with:

avahi-resolve -n hostbar.local
getent hosts hostbar.local

I'm not sure, but might it be better to use hostbar.local instead?
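The manual checks above can be reproduced from Python with the same getaddrinfo call that fails in the traceback. This is only a sketch in Python 3 (the traceback comes from Python 2.7); the helper names try_resolve and compare_names are made up for illustration and are not part of multimaster_fkie.

```python
import socket


def try_resolve(name, resolver=socket.getaddrinfo):
    """Return the sorted unique addresses for name, or [] if resolution fails."""
    try:
        infos = resolver(name, None, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        # Same error class as "[Errno -2] Name or service not known".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def compare_names(host, resolver=socket.getaddrinfo):
    """Resolve both the bare host name and its '.local' mDNS form."""
    return {name: try_resolve(name, resolver) for name in (host, host + ".local")}
```

Calling compare_names("hostbar") on hostfoo should return an empty list for hostbar and at least one address for hostbar.local if the situation matches the getent results above.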

atiderko commented 5 years ago

I used socket.gethostname() for hostname detection.

I have changed it to use the hostname from the ROS master URI. Please try again!

stertingen commented 5 years ago

Nope, does not help; still the same error.

stertingen commented 5 years ago

The output from avahi-browse -a -t -r is interesting:

+ enp0s3 IPv6 hostbar                            _ros-master._tcp     local
= enp0s3 IPv6 hostbar                            _ros-master._tcp     local
   hostname = [hostbar.local]
   address = [fd78:9d42:7594::56f]
   port = [11311]
   txt = ["network_id=0" "rpcuri=http://hostbar:11911" "zname=/master_discovery" "master_uri=http://hostbar:11311/" "timestamp_local=1554903303.730091095" "timestamp=1554903303.730091095"]

While the hostname is correctly set to hostbar.local, the values master_uri and rpcuri in the txt array are probably not.

Interesting: setting ROS_MASTER_URI=http://hostbar.local:11311 explicitly on hostbar (and analogously on hostfoo) does not help either.
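To illustrate the mismatch: the hostname field of the service record is resolvable, while the URIs in the txt array carry the bare name. A receiver could in principle swap the host part of those URIs for the advertised hostname. The following Python 3 sketch shows that idea only; replace_host is a hypothetical helper, not something the discovery node is known to do.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(uri, new_host):
    """Return uri with its host replaced by new_host, keeping scheme, port and path."""
    parts = urlsplit(uri)
    # Bracket IPv6 literals so the port separator stays unambiguous.
    host = "[%s]" % new_host if ":" in new_host else new_host
    netloc = host if parts.port is None else "%s:%d" % (host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

With the record above, replace_host("http://hostbar:11911", "hostbar.local") yields http://hostbar.local:11911, which matches the name that avahi-resolve can look up.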

stertingen commented 5 years ago

This might be a problem with the ROS Master API (http://wiki.ros.org/ROS/Master_API).

The discovery node calls getUri(), which returns the local hostname without the .local suffix.

I'm not sure whether this is intended behavior; if it is, the discovery node might have to do a workaround for this.
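For anyone who wants to check this themselves, getUri can be called directly over XML-RPC. The sketch below is Python 3 and assumes a reachable master; the function name master_hostname, the caller id /fqdn_check and the proxy_factory parameter are my own choices for illustration.

```python
from urllib.parse import urlsplit
from xmlrpc.client import ServerProxy


def master_hostname(master_uri, caller_id="/fqdn_check", proxy_factory=ServerProxy):
    """Ask the ROS master for its own URI and return the host name it contains."""
    # Master API calls return a (code, statusMessage, value) triple; code 1 means success.
    code, status, uri = proxy_factory(master_uri).getUri(caller_id)
    if code != 1:
        raise RuntimeError("getUri failed: %s" % status)
    return urlsplit(uri).hostname
```

Running master_hostname("http://localhost:11311") on hostbar should show whether the master reports hostbar or hostbar.local.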

atiderko commented 5 years ago

I added an fqdn parameter. You have to set this parameter to true.

rosrun master_discovery_fkie zeroconf _fqdn:=true

Regarding your post, I'm afraid that the ROS-topic communication will not work anyway. But you can try it out!

stertingen commented 5 years ago

Thank you very much!

Well, the discovery works fine now, but since the nodes themselves do not use the FQDN, I have to set ROS_HOSTNAME=$(hostname -f) explicitly (see https://github.com/ros/ros_comm/issues/138).
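When nodes are started from a Python script instead of a shell, the same workaround might look roughly like this. It is a sketch: ros_env_with_fqdn is a hypothetical helper, and socket.getfqdn() is only an approximate stand-in for hostname -f, so the result should be checked on the actual machine.

```python
import os
import socket


def ros_env_with_fqdn(base_env=None, fqdn=None):
    """Return a copy of the environment with ROS_HOSTNAME set to the FQDN."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: socket.getfqdn() gives the same name as `hostname -f` on this host.
    env["ROS_HOSTNAME"] = fqdn if fqdn is not None else socket.getfqdn()
    return env
```

The returned dict can be passed as the env argument of subprocess.Popen when launching a node.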

Since I'm bound to an IPv6-only network, there were still a few things missing:

For better IPv6 compatibility, it would be nice if the MasterMonitor were initialized in IPv6 mode.

I've got a setup running now with:

stertingen commented 5 years ago

Well, it's not that undocumented: http://www.ros.org/news/2012/06/ipv6-support-for-ros-nodes.html