christgau / wsdd

A Web Service Discovery host daemon.
MIT License
848 stars 100 forks

Errors if any interface has no addresses #1

Closed SerialVelocity closed 5 years ago

SerialVelocity commented 5 years ago

My eth0 has no address so your enumerate_host_interfaces errored out because it dereferenced a null pointer.

The hack I used to fix this was:

         deref = ptr[0]
+        if deref.addr:
           family = deref.addr[0].family

christgau commented 5 years ago

Thanks for reporting this issue.

However, I am not able to reproduce it based on your description. On a (slow) system with three Ethernet NICs, wsdd runs as expected. When restricted to the interface without an IP address (eth2), wsdd refuses to run, but as long as at least one interface has an IP address, there is no problem.

~/src/wsdd/src $ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.x.x.x/24 brd 192.x.x.x scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet6 fe80: [...]
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx  brd ff:ff:ff:ff:ff:ff
    inet 10.x.x.x/16 scope global eth1
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff

~/src/wsdd/src $ ./wsdd.py -i eth2 -vvv
2018-12-05 14:07:33,864:wsdd INFO(pid 15312): using pre-defined UUID [removed]
2018-12-05 14:07:36,428:wsdd ERROR(pid 15312): No multicast addresses available. Exiting.

~/src/wsdd/src $ ./wsdd.py -i eth2 -i eth1 -vvv
2018-12-05 14:17:46,050:wsdd INFO(pid 15485): using pre-defined UUID [removed]
2018-12-05 14:17:50,289:wsdd INFO(pid 15485): joined multicast group ('239.255.255.250', 3702) on 10.x.x.x%eth1
2018-12-05 14:17:50,316:wsdd DEBUG(pid 15485): transport address on eth1 is 10.x.x.x
2018-12-05 14:17:50,348:wsdd DEBUG(pid 15485): will listen for HTTP traffic on address ('10.x.x.x', 5357)
^C
[cleanup stuff removed]

~/src/wsdd/src $ ./wsdd.py -v  
2018-12-05 14:28:59,543:wsdd WARNING(pid 15701): no interface given, using all interfaces
2018-12-05 14:28:59,570:wsdd INFO(pid 15701): using pre-defined UUID [removed]
2018-12-05 14:29:04,474:wsdd INFO(pid 15701): joined multicast group ('239.255.255.250', 3702) on 192.x.x.x%eth0
2018-12-05 14:29:05,925:wsdd INFO(pid 15701): joined multicast group ('239.255.255.250', 3702) on 10.x.x.x%eth1
2018-12-05 14:29:05,995:wsdd INFO(pid 15701): joined multicast group ('ff02::c', 3702, 22364, 2) on fe80::..%eth0
2018-12-05 14:29:15,002:wsdd INFO(pid 15701): handling WSD....

All done on a Linux box with Python 3.6.5, Linux 4.12.12 and uclibc-ng 1.0.30-r1

Could you provide steps to reproduce the issue along with a system description?

SerialVelocity commented 5 years ago

Ah, sorry. This isn't actually caused by an interface having no IP address. It is triggered by my tun0 device.

Here is a repro:

# ip tuntap add dev tun0 mode tun
# ip link set dev tun0 up
# ./wsdd.py -vvv
2018-12-05 13:50:30,538:wsdd WARNING(pid 19118): no interface given, using all interfaces
2018-12-05 13:50:30,539:wsdd INFO(pid 19118): using pre-defined UUID <redacted>
Traceback (most recent call last):
  File "./wsdd.py", line 711, in <module>
    main()
  File "./wsdd.py", line 700, in main
    addresses = enumerate_host_interfaces()
  File "./wsdd.py", line 502, in enumerate_host_interfaces
    family = deref.addr[0].family
ValueError: NULL pointer access
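
The `ValueError: NULL pointer access` in the traceback above is what ctypes raises when a NULL pointer is indexed. The following is a minimal sketch of that behaviour and of the guard from the original report, not wsdd's actual code: `SockAddr`, `IfAddr` and `address_family` are hypothetical, heavily simplified stand-ins for the structures that `enumerate_host_interfaces` walks.

```python
import ctypes


class SockAddr(ctypes.Structure):
    # Hypothetical, simplified stand-in for struct sockaddr.
    _fields_ = [("family", ctypes.c_ushort)]


class IfAddr(ctypes.Structure):
    # Hypothetical, simplified stand-in for struct ifaddrs.
    _fields_ = [("name", ctypes.c_char_p),
                ("addr", ctypes.POINTER(SockAddr))]


def address_family(entry):
    """Return the address family of an entry, or None if it has no address."""
    # A NULL ctypes pointer is falsy, so test it before dereferencing.
    if not entry.addr:
        return None
    return entry.addr[0].family


# An entry shaped like tun0 in the repro: a name, but a NULL address pointer.
no_addr = IfAddr(name=b"tun0")

# An entry with an address attached (2 is AF_INET on Linux).
sa = SockAddr(family=2)
with_addr = IfAddr(name=b"eth0", addr=ctypes.pointer(sa))

# Without the guard, indexing the NULL pointer raises ValueError.
try:
    no_addr.addr[0].family
except ValueError as exc:
    unguarded_error = str(exc)
```

With the guard in place, entries without an address can simply be skipped instead of aborting the whole enumeration.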

christgau commented 5 years ago

Thanks for the clarification. I'll look into this asap.

Nodens- commented 4 years ago

@christgau I think I am hitting the same issue with a wireguard vpn interface as well.

Feb  4 02:39:58 wintermute systemd[1]: wsdd.service: Succeeded.
Feb  4 02:39:58 wintermute systemd[1]: wsdd.service: Consumed 2.267s CPU time.
Feb  4 02:42:11 wintermute wsdd.py[1733]: ERROR: No multicast addresses available. Exiting.
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Main process exited, code=exited, status=1/FAILURE
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Failed with result 'exit-code'.
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Scheduled restart job, restart counter is at 1.
Feb  4 02:42:11 wintermute python3[1869]: detected unhandled Python exception in '/usr/local/bin/wsdd.py'
Feb  4 02:42:11 wintermute wsdd.py[1869]: Traceback (most recent call last):
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/local/bin/wsdd.py", line 757, in <module>
Feb  4 02:42:11 wintermute wsdd.py[1869]:    sys.exit(main())
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/local/bin/wsdd.py", line 751, in main
Feb  4 02:42:11 wintermute wsdd.py[1869]:    serve_wsd_requests(addresses)
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/local/bin/wsdd.py", line 712, in serve_wsd_requests
Feb  4 02:42:11 wintermute wsdd.py[1869]:    http_srv = klass(interface.listen_address, WSDHttpRequestHandler)
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/lib64/python3.7/socketserver.py", line 452, in __init__
Feb  4 02:42:11 wintermute wsdd.py[1869]:    self.server_bind()
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/local/bin/wsdd.py", line 61, in server_bind
Feb  4 02:42:11 wintermute wsdd.py[1869]:    super().server_bind()
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/lib64/python3.7/http/server.py", line 137, in server_bind
Feb  4 02:42:11 wintermute wsdd.py[1869]:    socketserver.TCPServer.server_bind(self)
Feb  4 02:42:11 wintermute wsdd.py[1869]:  File "/usr/lib64/python3.7/socketserver.py", line 466, in server_bind
Feb  4 02:42:11 wintermute wsdd.py[1869]:    self.socket.bind(self.server_address)
Feb  4 02:42:11 wintermute wsdd.py[1869]: OSError: [Errno 99] Cannot assign requested address
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Main process exited, code=exited, status=1/FAILURE
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Failed with result 'exit-code'.
Feb  4 02:42:11 wintermute systemd[1]: wsdd.service: Scheduled restart job, restart counter is at 2.

SerialVelocity commented 4 years ago

@Nodens- That looks like something is already bound to the port

christgau commented 4 years ago

@SerialVelocity Yes and no. @Nodens- This is actually more related to #18.

I assume that the journal excerpt is from system startup. It appears that two situations occur here:

  1. wsdd is started but no interface is usable, either because none is available or because the requested interfaces have no usable addresses. In that case, wsdd terminates because there is nothing it can do. This is what happened to process 1733, so it is only slightly related to this issue (#1).
  2. For process 1869, it is very likely that the situation described in https://github.com/christgau/wsdd/issues/18#issuecomment-574785772 applies. As discussed there, a workaround is to delay the start of wsdd in the unit file. If wsdd can be started manually after boot, i.e. once the interface in question has its addresses configured, that would support the theory. The actual fix for the underlying problem is discussed in #25.
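
For reference, errno 99 on Linux is `EADDRNOTAVAIL` (the address cannot be assigned at this moment), whereas a port that is already taken would be reported as `EADDRINUSE` (errno 98). The following is a minimal sketch of retrying a bind while the address is not yet assignable. It is not wsdd's code and not the fix discussed in #25; the helper name `bind_when_available` and its parameters are hypothetical.

```python
import errno
import socket
import time


def bind_when_available(sock, address, attempts=5, delay=1.0, sleep=time.sleep):
    """Bind sock to address, retrying while the address is not assignable.

    Returns the number of attempts used. Any other error (for example
    EADDRINUSE, i.e. the port is taken) is re-raised immediately, as is
    EADDRNOTAVAIL once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            sock.bind(address)
            return attempt
        except OSError as exc:
            if exc.errno != errno.EADDRNOTAVAIL or attempt == attempts:
                raise
            sleep(delay)


if __name__ == "__main__":
    # Loopback is always assignable, so this succeeds on the first attempt.
    # Port 0 lets the OS pick a free port for the demonstration.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(bind_when_available(s, ("127.0.0.1", 0)))
```

Retrying inside the process is only one option. Delaying or restarting the service from the unit file, as discussed in this thread, achieves a similar effect without code changes.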

Nodens- commented 4 years ago

@christgau Sorry for the late response, I was out of the city. After examining the data and testing, I believe you are spot on. I am using Fedora as well and was not aware of the DAD mechanism at all. I am running dual stack, and indeed I have set my systemd unit file to wait on network-online, yet wsdd only starts properly if started manually after boot.

For now I am handling it by letting systemd restart the unit at a 5 second interval until it works. I think that is a safer workaround than a sleep timer hack, which may be hit or miss because the required duration depends on other factors. The only side effect is log pollution until it starts.
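
A restart-based workaround like this could look roughly as follows. This is a sketch rather than the exact configuration used here: the drop-in path is hypothetical, and it assumes the unit is named wsdd.service as in the journal excerpt above.

```ini
# /etc/systemd/system/wsdd.service.d/restart.conf (hypothetical drop-in path)
[Service]
# Restart wsdd whenever it exits with a failure status ...
Restart=on-failure
# ... and wait 5 seconds between attempts.
RestartSec=5
```

After adding a drop-in, `systemctl daemon-reload` is needed for systemd to pick it up.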

Looking forward to https://github.com/christgau/wsdd/issues/25 as well!