analogdevicesinc / libiio

A cross platform library for interfacing with local and remote Linux IIO devices
http://analogdevicesinc.github.io/libiio/
GNU Lesser General Public License v2.1
493 stars 318 forks source link

IIOD DNS-SD: Only one iio service can be registered on the network #402

Closed matejk closed 4 years ago

matejk commented 4 years ago

With current implementation, where service name and type are hardcoded in iiod, only first service is successfully published with DNS-SD, subsequent fail because the type/name pair collides with the existing.

    if (group && !avahi_entry_group_add_service(group,
            AVAHI_IF_UNSPEC, AVAHI_PROTO_UNSPEC,
            0, "iio", "_iio._tcp", NULL, NULL, IIOD_PORT, NULL)) {
        avahi_entry_group_commit(group);
        INFO("Registered to ZeroConf server %s\n",
                avahi_client_get_version_string(client));
    }

"iio" service name shall be replaced with something that is unique on the network.

I propose using hostname.

rgetz commented 4 years ago

First thanks for the bug report - and the exact thing that is the solution.

Code that you are pointing to : https://github.com/analogdevicesinc/libiio/blob/master/iiod/iiod.c#L127

Upstream Doc says:

I think the issue is we are not checking the return code for (ret == AVAHI_ERR_COLLISION) and then using avahi_alternative_service_name(name);

Which is what the Upstream Example. That might be the more robust solution?

I was trying on two different subnets (192.168.1.x (ZedBoard) and 192.168.2.x (RNDIS on pluto)), which is why I didn't notice this. I will set something up, and see if I can replicate things, and fix it.

matejk commented 4 years ago

I can run at least two IIOD on the same network and can easily test the changes.

FYI: sshd uses hostname as service name.

rgetz commented 4 years ago

hostnames may not be unique on most IIO devices. All FPGA images running the same image - with the same hostname... Same for Pluto (I think).

matejk commented 4 years ago

What about selecting the name in this order?

And use avahi_alternative_service_namefor duplicates in any of the cases.

rgetz commented 4 years ago

I think combining things would quickly fail the string shorter than 63 characters

rgetz@brain:~/github/libini/build$ iio_attr -u usb:3.18.5 -C hw_model
hw_model: Analog Devices PlutoSDR Rev.B (Z7010-AD9364)
rgetz@brain:~/github/libini/build$ echo "Analog Devices PlutoSDR Rev.B (Z7010-AD9364)" | wc -c
45

Let me poke into things - there has to be a standard solution for unique host names for RNDIS devices.

matejk commented 4 years ago

An example of how HP printer publishes itself:

Screen Shot 2020-03-16 at 08 10 22

rgetz commented 4 years ago

Can you try out the branch rgetz-allow-more-than-one-iio-on-avahi pointed to in #406, and see if that fixes things for you?

matejk commented 4 years ago

Publishing in Avahi does not work properly.

When I start one iiod built from PR branch, I get this (expected) output:

$ avahi-browse -tr _iio._tcp
+   ens3 IPv6 iio                                           _iio._tcp            local
+   ens3 IPv4 iio                                           _iio._tcp            local
=   ens3 IPv6 iio                                           _iio._tcp            local
   hostname = [uberscope.local]
   address = [2a00:ee2:6b04:4600:5054:ff:feeb:7ffd]
   port = [30431]
   txt = []
=   ens3 IPv4 iio                                           _iio._tcp            local
   hostname = [uberscope.local]
   address = [192.168.178.53]
   port = [30431]
   txt = []

Then I stopped new iiod and started iiod (version 0.15) on another device and got (expected):

$ avahi-browse -tr _iio._tcp
+   ens3 IPv6 iio                                           _iio._tcp            local
=   ens3 IPv6 iio                                           _iio._tcp            local
   hostname = [trenz.local]
   address = [2a00:ee2:6b04:4600:20a:35ff:fe00:2201]
   port = [30431]
   txt = []

Then new iiod was started again. I should get output from both servers, but I did not:

$ avahi-browse -tr _iio._tcp
+   ens3 IPv6 iio                                           _iio._tcp            local
=   ens3 IPv6 iio                                           _iio._tcp            local
   hostname = [trenz.local]
   address = [2a00:ee2:6b04:4600:20a:35ff:fe00:2201]
   port = [30431]
   txt = []
matejk commented 4 years ago

When I changed the "iiod" service name to "iiod-abcd", I got:

$ avahi-browse -tr _iio._tcp
+   ens3 IPv6 iio-abcd                                      _iio._tcp            local
+   ens3 IPv6 iio                                           _iio._tcp            local
+   ens3 IPv4 iio-abcd                                      _iio._tcp            local
=   ens3 IPv6 iio-abcd                                      _iio._tcp            local
   hostname = [uberscope.local]
   address = [2a00:ee2:6b04:4600:5054:ff:feeb:7ffd]
   port = [30431]
   txt = []
=   ens3 IPv4 iio-abcd                                      _iio._tcp            local
   hostname = [uberscope.local]
   address = [192.168.178.53]
   port = [30431]
   txt = []
=   ens3 IPv6 iio                                           _iio._tcp            local
   hostname = [trenz.local]
   address = [2a00:ee2:6b04:4600:20a:35ff:fe00:2201]
   port = [30431]
   txt = []

Notice that both servers are now listed.

matejk commented 4 years ago

BTW: top level context IIO attributes (including those read from libiio.ini) could be published as TXT records (see example for HP printer above).

What do you think?

rgetz commented 4 years ago

When I changed the "iiod" service name to "iiod-abcd", I got:

just to check, the default service name in the PR is:

    avahi.name = avahi_strdup("iio");

not iiod

what did you change?

rgetz commented 4 years ago

and when I do a:

avahi-browse -tr _iio._tcp
+ enx00e0229bee20 IPv6 brain                                         _iio._tcp            local
+ enx00e0229bee20 IPv4 brain                                         _iio._tcp            local
+   eno1 IPv6 brain                                         _iio._tcp            local
+   eno1 IPv4 analog                                        _iio._tcp            local
+   eno1 IPv4 brain                                         _iio._tcp            local
= enx00e0229bee20 IPv6 brain                                         _iio._tcp            local
   hostname = [brain.local]
   address = [fe80::689c:66a7:fbaa:ce88]
   port = [30431]
   txt = []
=   eno1 IPv6 brain                                         _iio._tcp            local
   hostname = [brain.local]
   address = [fe80::9a90:96ff:feb5:acaa]
   port = [30431]
   txt = []
= enx00e0229bee20 IPv4 brain                                         _iio._tcp            local
   hostname = [brain.local]
   address = [192.168.2.10]
   port = [30431]
   txt = []
=   eno1 IPv4 brain                                         _iio._tcp            local
   hostname = [brain.local]
   address = [192.168.1.114]
   port = [30431]
   txt = []
=   eno1 IPv4 analog                                        _iio._tcp            local
   hostname = [analog.local]
   address = [192.168.1.120]
   port = [30431]
   txt = []

I get host at the end, + enx00e0229bee20 IPv6 brain , you get : + ens3 IPv6 iio service-name at the end?

https://github.com/lathiat/avahi/blob/master/avahi-utils/avahi-browse.c#L322

print_service_line(c, '+', interface, protocol, name, type, domain, 1);

should be printing service name... let me check a few more things....

matejk commented 4 years ago

I ran avahi-browse version 0.6.32-rc on Ubuntu 16.04.6. Maybe the output changed between versions.

matejk commented 4 years ago

Version 0.7 on Ubuntu 19.10 gives the same output.

rgetz commented 4 years ago

I think it's because I have the /etc/avahi/services/iio.service file installed with a:

<name replace-wildcards="yes">%h</name>

which is described in the avahi.service man page as:

 <name  replace-wildcards="yes|no"> The service name. 
         If replace-wildcards is "yes", any occurence of the string "%h" 
         will be replaced by the local host name.
         This can be used for service names like "Remote Terminal on %h".

I will see if I get get things running without that, and see what shows up.

matejk commented 4 years ago

When I changed the "iiod" service name to "iiod-abcd", I got:

just to check, the default service name in the PR is:

  avahi.name = avahi_strdup("iio");

not iiod

what did you change?

Sorry, I mistyped in my comment. It was iiio-abcd as can also be seen in the avahi-browse output.

rgetz commented 4 years ago

Ok - I was able to replicate things - let me know if the update to the PR branch helps.

Thanks for testing.

matejk commented 4 years ago

The implementation is missing one important piece: the poll loop. That's why the code to detect the collision is never called. When the loop is added for service name "iio" (to be able to detect collisions easily), output from avahi-browse looks like this:

$ avahi-browse -tr _iio._tcp
+   ens3 IPv6 iio #2                                        _iio._tcp            local
+   ens3 IPv6 iio                                           _iio._tcp            local
+   ens3 IPv4 iio #2                                        _iio._tcp            local
=   ens3 IPv6 iio #2                                        _iio._tcp            local
   hostname = [uberscope.local]
   address = [2a00:ee2:6b04:4600:5054:ff:feeb:7ffd]
   port = [30431]
   txt = []
=   ens3 IPv6 iio                                           _iio._tcp            local
   hostname = [uberscopes.local]
   address = [2a00:ee2:6b04:4600:26f5:a2ff:fef1:9544]
   port = [30431]
   txt = []
=   ens3 IPv4 iio #2                                        _iio._tcp            local
   hostname = [uberscope.local]
   address = [192.168.178.53]
   port = [30431]
   txt = []

And debugs from iiod:

$ ./iiod/iiod -a -D -d
Starting IIO Daemon version 0.19.v0.19
IPv6 support enabled
DEBUG: Avahi: create services
DEBUG: Avahi: Group uncommited
Registered 'iio' to ZeroConf server avahi 0.6.32-rc
Avahi: Started.
DEBUG: Avahi: Group registering
Avahi: Service name collision, renaming service to 'iio #2'
Registered 'iio #2' to ZeroConf server avahi 0.6.32-rc
DEBUG: Avahi: Group registering
Avahi: Service 'iio #2' successfully established.

I'll push required modifications to a branch created from rgetz-allow-more-than-one-iio-on-avahi. (I got some ideas from https://groups.io/g/pcp/attachment/6037/1/avahi.c).

matejk commented 4 years ago

Fixes pushed to https://github.com/matejk/libiio/tree/gh-402-iiod-avahi-publish-fixes.

You can pull them to your branch and try them.

rgetz commented 4 years ago

I think the latest updates work - thanks for the pointers. When getting collisions, I now get:

 ./iiod/iiod 
DEBUG: Found 3 usable i/o endpoint couples
DEBUG: Couple 0 with endpoints 0x86 / 0x4
DEBUG: Couple 1 with endpoints 0x87 / 0x5
DEBUG: Couple 2 with endpoints 0x88 / 0x6
Starting IIO Daemon version 0.19.5e58241
IPv6 support enabled
DEBUG: Avahi: create services
DEBUG: Avahi: Group uncommited
Avahi: Registered 'iiod' to ZeroConf server avahi 0.7
Avahi: Started.
ERROR: __avahi_group_cb group match
DEBUG: Avahi: Group registering
ERROR: __avahi_group_cb group match
Avahi: Service name collision, renaming service to 'iiod #2'
Avahi: Registered 'iiod #2' to ZeroConf server avahi 0.7
Avahi: Service 'iiod #2' successfully established.
DEBUG: Cleaning up
Avahi: Removing service 'iiod #2'
Avahi: Stopped
rgetz commented 4 years ago

BTW: top level context IIO attributes (including those read from libiio.ini) could be published as TXT records (see example for HP printer above).

What do you think?

Some Doc suggests two things: The intention of DNS-SD TXT records is convey a small amout of useful additional information about a service. Ideally it SHOULD NOT be necessary for a client to retrieve this additional information before it an usefully establish a connection to the service. For a well-designed TCP-based application protocol, it should be possible, knowing only the host name and port number, to open a connection to that listening process, and then perform version- or feature-negotiation to determine the capabilities of the service instance. and The total size of a typical DNS-SD TXT record is intended to be small — 100 bytes or less.

Since all the info is available when you do attach to a context, and even pluto & m2k are over 400+ chars...

rgetz@brain:~/github/libini/build$ iio_attr -u usb:3.24.5 -C | grep -v attributes
hw_model: Analog Devices PlutoSDR Rev.B (Z7010-AD9364)
hw_model_variant: 0
hw_serial: 104473222a87000618000600473ed57ae0
fw_version: v0.31
ad9361-phy,xo_correction: 40000000
ad9361-phy,model: ad9364
local,kernel: 4.14.0-42540-g387d584
usb,idVendor: 0456
usb,idProduct: b673
usb,release: 2.0
usb,vendor: Analog Devices Inc.
usb,product: PlutoSDR (ADALM-PLUTO)
usb,serial: 104473222a87000618000600473ed57ae0
usb,libusb: 1.0.22.11312
rgetz@brain:~/github/libini/build$ iio_attr -u usb:3.24.5 -C | grep -v attributes | wc -c
431

I guess I don't see the point in this much, when you can get this from other / standard ways.

matejk commented 4 years ago

Some Doc suggests two things: The total size of a typical DNS-SD TXT record is intended to be small — 100 bytes or less.

...

I guess I don't see the point in this much, when you can get this from other / standard ways.

I agree. I did not read that part of documentation.

matejk commented 4 years ago

Avahi file iio.service shall be also removed from installation because iiod now publishes itself properly.

rgetz commented 4 years ago

Avahi file iio.service shall be also removed from installation because iiod now publishes itself properly.

Yeah, now that I'm looking, Many application servers have support for zeroconf built in. This means that these services should be announcement automatically when avahi is installed. No static service announcement configuration should be needed then.

I will rm the service file and CMake that installs it as part of this pull request.

rgetz commented 4 years ago

I think that is it - I will close this now. If you see anything else - please let us know.

Thanks for the bug report, testing and code.

matejk commented 4 years ago

Thanks for the bug report, testing and code.

You're welcome. I'm glad I was able to help.