Using various source ports makes firewall rules difficult

NathanC commented 6 years ago

First off, awesome library! Thanks for writing it.

tl;dr; Response to discovery from a lifx device is unicast to the source port, which is assigned by the OS. This makes it difficult to create a firewall rule allowing the response.

lan discovery doesn't work for me, due to my deny-all-incoming-by-default firewall rules. I allowed incoming UDP messages on 56700, but it still didn't work-- I had to allow a large range of incoming UDP ports, at which point discovery worked.

I captured my traffic on wireshark to investigate, and noticed that every discovery broadcast came from a different source port, and that my lightbulb ended up replying to that source port, instead of 56700. Investigating why that was, I stumbled across this from the lifx-protocol-docs:

As of LIFX Protocol V2, LIFX devices sending unicast message responses will reply to whichever port number the client has bound. See message header frame source identifier regarding the conditions under which unicast messages are sent.

Prior to LIFX Protocol V2, i.e the Original LIFX A21 lightbulb running firmware version 1.x, LIFX devices send broadcast message responses to port 56700.

To get the best compatibility across both newer and older LIFX devices, clients should bind to UDP port 56700.

Looking through the code, it seems like lifxlan allows the OS to assign a source port for the broadcast in the initialize_socket method here.

I'd love to be able to specify which port is bound to for the broadcast (or a range), so that I can allow incoming UDP messages more selectively. Does that sound like a reasonable solution to you? If so, I could open a PR to expose those arguments to any consumers who wish to specify (with a default argument that leads to the existing behavior of an OS-selected port).

mclarkk commented 6 years ago

Originally the LifxLAN object only bound to 56700. However, at some point I ran into conflicts where that socket was already bound, so I ended up incrementing the port.

When I wrote that function, it was written like this. If 56700 was in use it would simply increment and retry until it found a free port. It looks like a commit from @exking changed that code to allow the OS to automatically select the port.

So I have three questions:

Should the library just fail if it tries to run discovery but 56700 is already bound? Seems like that would give it the best backwards compatibility with the LAN protocol, and the most firewall-friendly behavior, but sometimes then the library just wouldn't be able to run discovery/broadcast stuff using the LifxLAN() object.
@NathanC: If the socket code was changed back so that it would open port 56700 OR try the next ports, would that work better for your firewall rules? Can you allow a range of ports, say 56700 - 56720, or something?
@exking Is there some bug you fixed by letting the OS choose the socket, or did you change it because you thought it was cleaner?

exking commented 6 years ago

In multi-threaded application (for example group command) - using a fixed port is not acceptable. Previous logic was just incrementing the port number, but 2 threads running simultaneous could be competing incrementing the port as well.

I believe @NathanC's problem only happens during the initial discovery when broadcast packet is sent from a random port. Replies would come back from each individual bulb as a unicast, but his firewall is not letting them thru. Initial discovery should not really involve threading so using a fixed port for discovery only would be acceptable. (Another solution would be just to whitelist bulb IPs).

Later when unicast packets are sent to the bulbs - firewall should track the requests and let the replies back as "established" connection so using random OS assigned port for each request should not be a problem.

NathanC commented 6 years ago

Yeah, the issue is just during the unicast response to a broadcast during discovery, since firewalls don't seem to see that as an established connection. I think letting the OS select the port for the direct unicast -> unicast flows makes sense and firewalls generally allow it as established as @exking pointed out.

I think the default behavior should stay as it is; if it's working for people overall now, there's no reason to change it. I'm personally in favor of just exposing an argument for discovery that allows a consumer to specify which port to try to bind to, if they so choose.

For my personal scripts, I'd just bind it to port 56700, which shouldn't conflict with anything on my system. For more robust code that still binds to a set of ports I've opened, I could recover from binding errors and keep looping until I find a port that works for me.

e.g.,

def get_lights(self, specified_port = None):
     # propagate this down to the `initialize_socket` method.
     # if specified_port is None, then use a random port-- otherwise, 
     # attempt to bind to the specified port and raise an error if unable to

Or something like that. I'm pretty busy for the next week or so, but can open a PR when things settle down if the solution seems reasonable to you. There are other potential solutions, but this seems like a pretty non-invasive one; if people want to leverage it they can, but they don't have to.

(@mclarkk I think that the old flow of starting at 56700 would mainly solve this for me, but be a bit less clean. For users who don't care what the port is, it would cause delays with failed bindings. And for users who really care about the port used, it would still be a bit frustrating, since it wouldn't give full control over it. Adding an optional specification of port seems to cover both demographics of consumers pretty well. Another option could be a port range, with the default range being very large, allowing the OS to select a port in the range-- but at that point, it's potentially getting overcomplicated.)

tacaswell commented 6 years ago

I have a possibly silly protocol question: why keep creating new sockets? Could the each {context, group, device} create a persistent socket that it re-uses?

mclarkk / lifxlan

Using various source ports makes firewall rules difficult #114