christgau / wsdd

A Web Service Discovery host daemon.
MIT License

-H argument doesn't seem to work #3

Closed moonbuggy closed 5 years ago

moonbuggy commented 5 years ago
~$ ./wsdd.py -H 2
Traceback (most recent call last):
  File "/usr/bin/wsdd", line 711, in <module>
    main()
  File "/usr/bin/wsdd", line 698, in main
    parse_args()
  File "/usr/bin/wsdd", line 585, in parse_args
    args = parser.parse_args(sys.argv[1:])
  File "/usr/lib/python3.5/argparse.py", line 1735, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib/python3.5/argparse.py", line 1767, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib/python3.5/argparse.py", line 1973, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/lib/python3.5/argparse.py", line 1913, in consume_optional
    take_action(action, args, option_string)
  File "/usr/lib/python3.5/argparse.py", line 1841, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/lib/python3.5/argparse.py", line 959, in __call__
    items.append(values)
AttributeError: 'int' object has no attribute 'append'

I don't know python well enough to easily see if this causes problems in functionality elsewhere (maybe you're wanting to append it so you can set a hop limit per interface or something?), but I can make it stop throwing errors at me:

--- wsdd.py
+++ wsdd-fixed.py
@@ -548,7 +548,7 @@
     parser.add_argument(
         '-H', '--hoplimit',
         help='hop limit for multicast packets (default = 1)',
-        action='append', default=1)
+        type=int, default=1)
     parser.add_argument(
         '-u', '--uuid',
         help='UUID for the target device',

If it's at all relevant:

~$ python3 -V
Python 3.7.1

christgau commented 5 years ago

You are right. The -H option is intended to specify the number of allowed hops. Appending does not make sense here (at least this is not the intention). Your proposed fix looks fine. I'll integrate that in one of the next commits. Thanks for looking into this.
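
For readers less familiar with argparse, the failure and the fix can be reproduced in isolation. The following is a minimal sketch, not the wsdd source: the `build_parser` and `parse_hoplimit` helpers and the `fixed` flag are made up for illustration, and only the `-H`/`--hoplimit` option is taken from the patch above.

```python
import argparse


def build_parser(fixed):
    """Return a parser that only knows -H (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog='wsdd-sketch')
    if fixed:
        # Patched variant: the option holds a single integer.
        parser.add_argument('-H', '--hoplimit', type=int, default=1)
    else:
        # Original variant: 'append' expects the stored value to be a list,
        # so argparse ends up calling .append() on the int default 1.
        parser.add_argument('-H', '--hoplimit', action='append', default=1)
    return parser


def parse_hoplimit(argv, fixed=True):
    """Parse argv and return the resulting hop limit."""
    return build_parser(fixed).parse_args(argv).hoplimit


if __name__ == '__main__':
    print(parse_hoplimit(['-H', '2']))
    try:
        parse_hoplimit(['-H', '2'], fixed=False)
    except AttributeError as exc:
        print('unpatched parser fails:', exc)
```

With `type=int` the value also arrives as an integer rather than a string, which is presumably what the code setting the hop limit on the socket expects.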

btw: Did the increased hop limit work in your network setup? My environment does not provide the possibility to test it, but I thought it would be a nice feature for users to have control over the number of allowed hops in case of a larger infrastructure with multiple levels of routers

moonbuggy commented 5 years ago

Nope, it didn't seem functional. I wasn't sure if that may have been because I broke it with my modification.

My setup might be the problem. I haven't had a chance to double check I'm not filtering the packets with iptables somewhere along the line. I have a ZeroTier network in between the physical LANs I'm trying to broadcast between as well, which doesn't help. Like this:

  192.168.1.0/24     <->     192.168.100.0/24     <->     192.168.2.0/24
  (physical LAN 1)        (virtual ZeroTier LAN)         (physical LAN 2)

LAN 1's router had your implementation of wsdd running with -H 3, LAN 2's router had another implementation (whatever the Entware package supplied) running, which didn't let me set a hoplimit, but I had a server on LAN 2 running your implementation with -H 3. Both LAN routers had wsdd listening on all appropriate interfaces (and I assume also broadcasting on those interfaces).

Nothing detected by Windows machines on either LAN. (Nothing from the other LAN, I mean. It works just fine if I don't want to cross subnets.)

That's not an ideal test setup though. I was planning to take another look at it (probably at the end of this week, otherwise maybe some time next week) with some logging rules set up in iptables to see if I can find out where the packets go missing. I might have a play with the code a bit and suppress the "ValueError: invalid resolve request" messages and maybe add some debug output specific to the hoplimit, to help me focus in on what I want to see. I was also contemplating throwing a temporary second physical LAN together here so I can remove ZeroTier from the mix and see if -H 2 is any better than -H 3.

Basically, I have a few ideas running around in my head and I'm hoping to try at least some of them out in the not too distant future. I can't guarantee I'll get around to it as soon as I hope to, but I'll let you know if and when I make any progress.

christgau commented 5 years ago

Well, simplifying the setup appears to be a good idea. Personally, I would start with a single router and set the hop limit to 2. Note that wsdd is intended to run on a network host that offers Samba (or a similar service), not on a router without such a service. Therefore I'd put wsdd on a host A in LAN 1 (say on 192.168.1.11) and a client on host B in LAN 2 (say on 192.168.2.22). On both A and B I would run tcpdump or wireshark and see if there is multicast traffic going out and in, respectively. If that is the case, that would be fine. The multicast address used for that is "site"-only, but the definition of "site" appears to be vague. Actually, I don't know what happens with multicast traffic when it crosses network boundaries on actual hardware.

Note that multicast traffic is only half of the story. To make wsdd work, B must be able to contact A via HTTP (on port 5357).
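
The HTTP half can be checked independently of multicast. A minimal sketch, to be run on host B: the helper name `http_port_reachable` is made up for illustration, and 192.168.1.11 is host A from the example addresses above. It only tests whether a TCP connection to the port can be opened, not whether the service behind it answers WSD requests correctly.

```python
import socket


def http_port_reachable(host, port=5357, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == '__main__':
    # Host A from the example above; adjust to the actual address.
    print(http_port_reachable('192.168.1.11'))
```

If this returns False while multicast traffic arrives, the firewall or routing rules for unicast TCP are the next thing to look at.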

christgau commented 5 years ago

0bf314c224c8a47d1993af1538e3008a2cc99cab actually fixes the issue with the command line option. The commit message unfortunately references the wrong issue.

moonbuggy commented 5 years ago

The routers are both running samba. I have wsdd running on a machine that isn't running samba, btw, and it happily announces its existence to Windows machines on the LAN. Windows just complains if you double click the icon in Network Neighbourhood because there's nothing for it to connect to.

I don't have a linux machine, other than the router, on the distant subnet so, short of setting up a second physical LAN here (which has some problems of its own), I'm obliged to test with wsdd on at least one of the routers. It makes sense to me to run it on routers for testing anyway, because I'm then broadcasting directly to the virtual LAN and removing a hop. (In theory -H 2 should have been adequate in that case, I kept it at -H 3 because I figured it wouldn't hurt.)

The fact that a response is required after the broadcast is one thing I was planning to double check in iptables. My understanding is that the UDP broadcast won't set up a RELATED association in iptables for the response. I've got looser than sensible rules on the virtual interfaces, so that shouldn't be an issue there, with wsdd on the router and bound to that interface, but I need to confirm I'm forwarding everything as necessary between the virtual and physical (in both directions).

christgau commented 5 years ago

The routers are both running samba.

Ok. no problem with that.

I have wsdd running on a machine that isn't running samba, btw, and it happily announces its existence to Windows machines on the LAN. Windows just complains if you double click the icon in Network Neighbourhood because there's nothing for it to connect to.

Yes, of course. Technically, you don't need Samba (or another CIFS/SMB implementation) but it makes sense to have it on the machine ;-)

The fact that a response is required after the broadcast is one thing I was planning to double check in iptables. My understanding is that the UDP broadcast won't set up a RELATED association in iptables for the response.

Nitpicking: As required by the spec, wsdd uses multicasts, not broadcasts. Aside from that, I'm with you that the UDP multicast traffic will not add any relation to the connection tracking of a packet-based firewall. But as you note, the firewall rules might be crucial here to get wsdd working across different (site-local) networks or routers. Looking forward to the results of your experiments. Please let us know about your progress.

moonbuggy commented 5 years ago

Nitpicking: As required by the spec, wsdd uses multicasts, not broadcasts.

Fair point. :) Unicast is normally what I'm playing with, I don't have the terminology down because it's rarely relevant for me.

Looking forward to the results of your experiments. Please let us know about your progress.

I don't play with Github much either, but I might fork your code and, if it turns out I need to make substantial changes along the way to get hoplimit working, I'm under the impression that a pull request is probably a convenient way to update you?

If I don't make a substantial change, or if you'd just prefer it, updating within this issue isn't a problem.

christgau commented 5 years ago

I'm under the impression that a pull request is probably a convenient way to update you?

As stated in the README, (pep8-compliant) contributions are welcome. But actually, I'm afraid that there is (maybe) little you can do about the multicast stuff besides setting the hop count and other socket options like interface binding (refer to man 7 ip/ipv6), but let's see. If my time allows, I may set up a multi-router environment as well and play around a little bit.
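
For reference, setting the hop count on an IPv4 socket comes down to a single socket option. This is a minimal sketch and not the wsdd code: the helper name `make_multicast_sender` is made up for illustration, and it only covers IPv4 (`IP_MULTICAST_TTL` from man 7 ip).

```python
import socket


def make_multicast_sender(hoplimit=1):
    """Create a UDP socket whose outgoing multicast packets carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # The kernel default for multicast is a TTL of 1, i.e. the local network only.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, hoplimit)
    return sock


if __name__ == '__main__':
    sender = make_multicast_sender(2)
    # Read the option back to confirm what the kernel will use.
    print(sender.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL))
    sender.close()
```

A higher TTL only allows a packet to survive more hops. Whether a router actually forwards the packet is a separate question.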

moonbuggy commented 5 years ago

I've done a little bit of testing today. At the moment it's looking like your implementation isn't binding to interfaces properly.

The setup is a router with LAN interface br0 (192.168.30.1), ZeroTier interface zt0 (192.168.250.30), WAN interface eth0 ().

I'm logging (and accepting) all UDP OUTPUT packets on port 3702 with iptables on that router.

So, start the daemon up:

$ ./wsdd.py -i zt0 -H 2 -v
2019-02-07 06:35:59,628:wsdd INFO(pid 8614): using pre-defined UUID 78cc342f-c269-5c41-99df-37dbb2c602e1
2019-02-07 06:35:59,659:wsdd INFO(pid 8614): joined multicast group ('239.255.255.250', 3702) on 192.168.250.30%zt0

iptables logs traffic on the WAN interface:

Feb  7 17:36:00 kernel: IPTABLESOutAccept: IN= OUT=eth0 SRC=<PUBLIC_IP> DST=239.255.255.250 LEN=1163 TOS=0x00 PREC=0x00 TTL=2 ID=0 DF PROTO=UDP SPT=47723 DPT=3702 LEN=1143

..with nothing on any other interface.

tcpdump shows no activity on the ZeroTier interface with the command:

$ tcpdump -i zt0 port 3702 -vv

When I explicitly tell your implementation to use the LAN interface (br0):

$ ./wsdd.py -i br0 -H 2 -v
2019-02-07 06:52:05,389:wsdd INFO(pid 9652): using pre-defined UUID 78cc342f-c269-5c41-99df-37dbb2c602e1
2019-02-07 06:52:05,418:wsdd INFO(pid 9652): joined multicast group ('239.255.255.250', 3702) on 192.168.30.1%br0

..it still tries to use the WAN interface:

Feb  7 17:52:05 kernel: IPTABLESOutAccept: IN= OUT=eth0 SRC=<PUBLIC_IP> DST=239.255.255.250 LEN=1120 TOS=0x00 PREC=0x00 TTL=1 ID=0 DF PROTO=UDP SPT=3702 DPT=3702 LEN=1100

..with tcpdump showing nothing on the br0 interface. So this does not appear to be a problem isolated to the ZeroTier interface.

In comparison the other implementation of wsdd I have at hand (this, I believe) doesn't let me specify an interface but appears to send data out the LAN interface as expected:

Feb  7 17:46:24 kernel: IPTABLESOutAccept: IN= OUT=br0 SRC=192.168.30.1 DST=192.168.30.30 LEN=1370 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=3702 DPT=64551 LEN=1350

Although that implementation doesn't appear to be making use of the zt0 interface either, so there's a common failure between both implementations on the ZeroTier virtual interface which I'll need to dig into further at some stage.

With your implementation running on another machine on the LAN with only the one physical interface:

$ ./wsdd-mine.py -i enp6s0 -H 2 -v
2019-02-07 18:16:08,528:wsdd INFO(pid 31686): using pre-defined UUID b53a1b52-c930-5cc6-89ec-23c8c25d033a
2019-02-07 18:16:08,534:wsdd INFO(pid 31686): joined multicast group ('239.255.255.250', 3702) on 192.168.30.50%enp6s0

..it is detected as I expect by the router running tcpdump (with a ttl matching the hoplimit setting):

$ tcpdump -i br0 port 3702 -vv
tcpdump: listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:16:08.556614 IP (tos 0x0, ttl 2, id 51984, offset 0, flags [DF], proto UDP (17), length 1166)
    192.168.30.50.56261 > 239.255.255.250.3702: [udp sum ok] UDP, length 1138

..although even with the ttl set to 2 I'm not seeing any packets forwarded to the ZeroTier LAN (I've had tcpdump running on the ZeroTier interface of the remote router through all of this, it's not seen a single incoming packet on port 3702).

This is about all the time I have to play with it today. I'll have to have a think about what my next steps are. My plan had been to start by broadcasting directly onto the ZeroTier virtual LAN and rule out packet filtering on that LAN or on the remote router as being the problem, then worry about the hoplimit, but I've not managed to get even that far.

At the moment I'm assuming I'm probably not forwarding packets to the ZeroTier LAN as I should be, because the failure to see packets arrive on the remote router in that last case (wsdd running on the machine on the LAN) should be independent of whichever wsdd implementation I'm running not binding to the zt0 interface when run on the router. Being unable to bind to that interface on the router complicates troubleshooting the packet forwarding issue though, because it forces me to confound it with an untested hoplimit feature possibly not working as expected.

christgau commented 5 years ago

Thanks for your research.

..it is detected as I expect by the router running tcpdump (with a ttl matching the hoplimit setting):

So, technically, the issue is solved, but multicast does not work as expected in your setup. So I'll keep that issue open.

Could you give the issue_3-multicast-if branch a try? It may solve the issue, but it is untested.

moonbuggy commented 5 years ago

Yeah, hoplimit seems to be working. I can open the interface binding as a different issue if you prefer. Whatever works best for you is fine.

The multicast-if branch isn't functional for me at all. It doesn't matter what arguments I use, result is the same:

$ ./wsdd-multicast-if.py
2019-02-07 10:25:47,964:wsdd WARNING(pid 22441): no interface given, using all interfaces
Traceback (most recent call last):
  File "./wsdd-multicast-if.py", line 716, in <module>
    main()
  File "./wsdd-multicast-if.py", line 711, in main
    serve_wsd_requests(addresses)
  File "./wsdd-multicast-if.py", line 662, in serve_wsd_requests
    interface = MulticastInterface(address[1], address[2], address[0])
  File "./wsdd-multicast-if.py", line 81, in __init__
    self.init_v4()
  File "./wsdd-multicast-if.py", line 131, in init_v4
    socket.IPPROTO_IP, socket.IP_MULTICAST_IF, mreq)
OSError: [Errno 99] Cannot assign requested address

christgau commented 5 years ago

Yeah, hoplimit seems to be working. I can open the interface binding as a different issue if you prefer.

It's ok to keep the discussion here.

The multicast-if branch isn't functional for me at all. [...]

What OS are you using (uname -a)? I have no problems running the version from the branch on Linux 4.12 and 4.20 with uclibc-ng 1.0.31 and glibc 2.28, respectively.

In addition, can you add a debug/print statement that outputs information of the interface object, like print(self.interface, self.address, mreq) just before line 131 and post the output if possible (output leaks your IPs)? Maybe it is an issue with your router's interfaces.

moonbuggy commented 5 years ago

The debug output is:

eth0:0 192.168.100.2 b'\xef\xff\xff\xfa\xc0\xa8d\x02'

eth0:0 is a virtual interface that lets me access the web GUI of the modem on the other side of the router, if that's relevant at all.
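
The byte string in that debug output can be decoded to see what was handed to setsockopt. A minimal sketch, assuming the 8 bytes follow the layout of `struct ip_mreq` (multicast group followed by the local interface address); the helper name `decode_mreq` is made up for illustration.

```python
import socket
import struct


def decode_mreq(mreq):
    """Split an 8-byte ip_mreq into (multicast group, interface address)."""
    group, address = struct.unpack('4s4s', mreq)
    return socket.inet_ntoa(group), socket.inet_ntoa(address)


if __name__ == '__main__':
    # Bytes copied from the debug output above.
    print(decode_mreq(b'\xef\xff\xff\xfa\xc0\xa8d\x02'))
    # prints ('239.255.255.250', '192.168.100.2')
```

So the request names the WSD multicast group and the address of eth0:0, and contains no interface index.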

$ uname -a
Linux moonlink 2.6.36.4brcmarm #1 SMP PREEMPT Sat Feb 2 13:26:57 EST 2019 armv7l ASUSWRT-Merlin

I don't have any leeway as far as the kernel goes. The firmware is a bare minimum sort of deal, to conserve resources on the router.

moonbuggy commented 5 years ago

A further update. It appears I'm not filtering any packets on the ZeroTier LAN. I've got a UDP multicast relay running now, and I'm seeing packets move across that subnet.

I'm now assuming that if I'm losing the packets with the increased ttl from '-H' due to iptables rules it's a forwarding issue between interfaces, rather than an explicit drop/deny firewall deal.

christgau commented 5 years ago

OSError: [Errno 99] Cannot assign requested address

I dug a little deeper into that matter. The error code EADDRNOTAVAIL (99) is raised by the Linux kernel in both old and recent versions only when the kernel is not able to find the interface that matches the address (or index) provided by the user-space application. In the branch, only the address was provided and an index was missing. I added the interface index in the recent commit just to be sure. With and even without that commit wsdd runs fine on my machines. I also ran the test with a virtual interface which I had added using ifconfig eth1:0 192.168.0.1. It all worked fine and I can't reproduce the error.

Notwithstanding, the missing index when using IP_MULTICAST_IF in setsockopt appears to be a bug to me. The FreeBSD ip man page, however, discourages the use of IP_MULTICAST_IF, which was introduced by the issue branch. So it might not be useful to use that socket option at all. What's missing in the IPv4 case is binding the sending socket to an interface. That may be the actual reason why you see multicast traffic on the wrong interface.
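
To illustrate what "address plus index" means here: on Linux, IP_MULTICAST_IF also accepts a `struct ip_mreqn`, which carries an interface index after the two addresses. The following is a sketch under that assumption and not necessarily how the commit builds the structure; the helper name `pack_mreqn` and the index value 3 are made up for illustration.

```python
import socket
import struct


def pack_mreqn(group, address, ifindex):
    """Pack group, local address and interface index like Linux's struct ip_mreqn."""
    return struct.pack('4s4si',
                       socket.inet_aton(group),
                       socket.inet_aton(address),
                       ifindex)


if __name__ == '__main__':
    # On a real system the index would come from socket.if_nametoindex('zt0');
    # 3 is only a placeholder value.
    mreqn = pack_mreqn('239.255.255.250', '192.168.250.30', 3)
    print(len(mreqn), mreqn)
```

The resulting bytes would be passed as the value for `socket.IP_MULTICAST_IF` in `setsockopt`, in place of the 8-byte structure without an index.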

christgau commented 5 years ago

@moonbuggy could you please check, if 384d61e works for you, i.e. if the multicast packets leave on the correct interface?

moonbuggy commented 5 years ago

Looks like it binds to interfaces properly now. Good stuff. :)

No arguments and I see traffic on the zt0 and br0 interfaces, '-i zt0' and I see traffic on the zt0 interface and none on the br0 interface, '-i br0' and I see traffic on the br0 interface and none on the zt0 interface.

It looks like the major issues I've had with your implementation of wsdd have been resolved. I still need to figure out why -H doesn't seem to let me cross subnets. I'm hoping it's something I can fix with iptables, but I'm wondering if the increased ttl value isn't enough on its own (without some sort of active forwarding or relaying) to achieve the result I'm after.

Now that I'm binding to interfaces nicely, the testing is simpler. I'll likely have time in the next day or two to remote in to the distant LAN and see how it looks, and let you know if you're interested.

christgau commented 5 years ago

Looks like it binds to interfaces properly now. Good stuff. :)

No arguments and I see traffic on the zt0 and br0 interfaces, '-i zt0' and I see traffic on the zt0 interface and none on the br0 interface, '-i br0' and I see traffic on the br0 interface and none on the zt0 interface.

Nice! However, I am not really happy with this commit because, as mentioned before, the FreeBSD man page states (emphasis mine):

The use of IP_MULTICAST_IF is not recommended, as multicast memberships are scoped to each individual interface. It is supported for legacy use [...]

Yesterday, I was able to reproduce your issue with a multi-homed Linux device, i.e. multicast traffic comes out of the wrong interface. I modified the code in another way and, as I suggested in https://github.com/christgau/wsdd/issues/3#issuecomment-461905703, it seems that a missing bind for IPv4 interfaces caused the problem. Multicast traffic is then emitted on the default (multicast-capable) interface, whichever that is. Binding solves the problem on Linux and I assume the same applies for FreeBSD. However, I am going to test this.
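
As a rough illustration of the idea, binding the sending socket to the interface's own IPv4 address could look like the sketch below. This assumes that "binding" refers to a plain bind() on the local address; the actual change in wsdd may differ. The helper name `make_bound_sender` is made up for illustration.

```python
import socket


def make_bound_sender(interface_address, hoplimit=1):
    """Create a UDP sender bound to one local IPv4 address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, hoplimit)
    # Port 0 lets the kernel pick an ephemeral source port.
    sock.bind((interface_address, 0))
    return sock


if __name__ == '__main__':
    # The loopback address is used here only so the sketch runs anywhere;
    # on the router it would be the address of br0 or zt0.
    sender = make_bound_sender('127.0.0.1', hoplimit=2)
    print(sender.getsockname())
    sender.close()
```

Packets sent from such a socket carry the bound address as their source, which should be visible in the SRC field of the iptables log lines shown earlier.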

Concerning your real problem (having the multicast traffic on different networks) I assume you need a (software) multicast router. You may find http://troglobit.com/howto/multicast/ helpful.

moonbuggy commented 5 years ago

I'd be happy to run your new code on my device when it is available to confirm it works for me on kernel 2.6 as well, although I assume it will from what you've said.

Concerning your real problem (having the multicast traffic on different networks) I assume you need a (software) multicast router. You may find http://troglobit.com/howto/multicast/ helpful.

Thanks for the link. I'll have a look at it in a day or two. I've been looking at tcpdump and wsdd.py -vv output too long today already and it's turned my brain to mush. :) I've got a few other things I need to get back to, I've been putting too much time into making icons appear in Network Neighbourhood, really just because I'm annoyed Microsoft broke it, not due to any sort of critical importance.

Still not having any luck with the -H argument in wsdd getting things across subnets, btw. I am getting HELLO messages originating from Windows across subnets via a UDP multicast relay, but I seem to be losing responses (possibly related to invalid resolve requests). Packets originating from the two wsdd implementations I've been fiddling with don't seem to want to be relayed at all, regardless of TTL.

I need to understand the protocol better, when I have more time to look at it and a clearer head. That link you provided looks like it will help in that regard. Thanks again.

christgau commented 5 years ago

The recent commit in master solves the issue for both FreeBSD and Linux. Tested on FreeBSD 12, Gentoo and Fedora 29 (workstation).

@moonbuggy As a remark on your routing problem: From what you have posted before, I assume you use IPv4 only. If you are intending to use IPv6 you may need to change the multicast address in the code to a site-local (or "higher") scope.
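
The scope of an IPv6 multicast address is encoded in the low four bits of its second byte, so it can be read directly off the address. A minimal sketch, assuming wsdd uses the address from the WS-Discovery specification (ff02::c); ff05::c appears only as an example of a site-local counterpart, and the helper name `multicast_scope` is made up for illustration.

```python
import ipaddress

# Scope values as defined for IPv6 multicast addresses (RFC 4291).
SCOPES = {
    0x1: 'interface-local',
    0x2: 'link-local',
    0x4: 'admin-local',
    0x5: 'site-local',
    0x8: 'organization-local',
    0xe: 'global',
}


def multicast_scope(address):
    """Return the scope name of an IPv6 multicast address."""
    addr = ipaddress.IPv6Address(address)
    if not addr.is_multicast:
        raise ValueError('not a multicast address: %s' % address)
    # The second byte holds the flags (high nibble) and the scope (low nibble).
    return SCOPES.get(addr.packed[1] & 0x0f, 'reserved or unassigned')


if __name__ == '__main__':
    for candidate in ('ff02::c', 'ff05::c'):
        print(candidate, multicast_scope(candidate))
```

Routers are not supposed to forward link-local multicast at all, so raising the hop limit alone would not help in the IPv6 case.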