troglobit / mdnsd

Jeremie Miller's original mdnsd
BSD 3-Clause "New" or "Revised" License

Run mDNS over IPv6, too #69

Open · fzs opened 1 year ago

fzs commented 1 year ago

This PR should be the final step towards adding support for IPv6. (#10)

Two specific changes should be reviewed with respect to whether they are acceptable this way. First, I decided to use sockaddr_storage in the function signatures of mdnsd_in and mdnsd_out, instead of the generic sockaddr. This ensures that a structure of the correct size is passed, without having to resort to the usual additional socklen_t addrlen parameter to check against. The drawback is that it (kind of) requires the calling code to use a struct sockaddr_storage, which is larger than what the calling code may actually need. (The calling code could still decide to use a smaller sockaddr_in and cast to sockaddr_storage, but that looks, well, ew.)
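For illustration, a minimal sketch of that trade-off; the parameter lists here are hypothetical stand-ins, not the PR's actual signatures:

```c
/*
 * Hypothetical sketch only: parameter lists are illustrative, not the
 * library's exact API.  mdns_daemon_t and struct message stand in for
 * the library's opaque types.
 */
#include <sys/socket.h>    /* struct sockaddr_storage */

struct message;                        /* mDNS packet, opaque here */
typedef struct mdns_daemon mdns_daemon_t;

/*
 * Taking a sockaddr_storage pointer guarantees the caller allocated a
 * buffer large enough for any address family, so no socklen_t addrlen
 * parameter is needed to check against.
 */
void mdnsd_in (mdns_daemon_t *d, struct message *m,
               const struct sockaddr_storage *from);
int  mdnsd_out(mdns_daemon_t *d, struct message *m,
               struct sockaddr_storage *to);
```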

Second, in process_out no additional IP version information is passed in, only the socket descriptor. The function uses the getsockname system call to find out whether the socket is IPv4 or IPv6. I don't know if this could cause problems with portability, i.e. how widely available that system call is. It is part of POSIX.1-2001. As is getnameinfo, but that is only used in mdnsd, not in the library.
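Something like this self-contained sketch (not the actual process_out, just the getsockname part):

```c
#include <sys/socket.h>

/*
 * Sketch of the getsockname() approach: return the address family of a
 * bound socket (AF_INET or AF_INET6), or -1 with errno set on failure.
 */
static int socket_family(int sd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    if (getsockname(sd, (struct sockaddr *)&ss, &len) == -1)
        return -1;

    return ss.ss_family;
}
```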


This PR depends on #70 being merged first.

fzs commented 1 year ago

Now for the real reason why this PR is in draft state: implementation is not yet finished, because I stumbled upon a problem when creating IPv6 multicast sockets that I'd like to discuss.

The mdnsd_socket function is replaced by an IPv4 and an IPv6 variant. When I test this in a short test program, which opens both types of sockets and just listens for incoming packets, I see the following peculiar behaviour. I test this on a Linux VM with two interfaces, enp0s3 and enp0s8. The first one only does IPv4, the second one IPv4 and IPv6.
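For reference, roughly how the IPv6 side of such a test socket is set up; this is only a sketch approximating the test program, with error handling omitted:

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Open an IPv6 UDP socket on port 5353 and join ff02::fb, optionally
 * pinned to an interface.  SO_BINDTODEVICE needs root, which is why the
 * sudo and non-sudo cases below behave differently.
 */
static int open_mdns6(const char *ifname)
{
    struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6, .sin6_port = htons(5353) };
    struct ipv6_mreq mreq = { 0 };
    int sd, on = 1;

    sd = socket(AF_INET6, SOCK_DGRAM, 0);
    setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    if (ifname)        /* fails silently here when not run as root */
        setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, ifname, strlen(ifname));

    bind(sd, (struct sockaddr *)&sin6, sizeof(sin6));

    inet_pton(AF_INET6, "ff02::fb", &mreq.ipv6mr_multiaddr);
    mreq.ipv6mr_interface = ifname ? if_nametoindex(ifname) : 0;  /* 0: kernel picks */
    setsockopt(sd, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq));

    return sd;
}
```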

When specifying enp0s8 as the interface but not running with sudo (i.e. SO_BINDTODEVICE will fail), I see the program receive IPv4 and IPv6 mDNS messages. Sending a unicast UDP IPv4 packet to port 5353 on that interface is also received. A unicast IPv6 packet is not received by the program, even though I can see it in the tcpdump trace of that interface.

When specifying enp0s8 as the interface and running with sudo (i.e. SO_BINDTODEVICE will succeed), all packets to port 5353 are received: IPv4, IPv6, unicast and multicast.

When specifying no interface, i.e. not binding to any interface, the program receives unicast IPv4 packets sent to enp0s8, but no IPv6 unicast packets sent to enp0s8. It will not receive multicast IPv4 messages seen on enp0s8, but it will receive IPv6 multicast messages seen on enp0s8.

When properly bound to enp0s3, i.e. run with sudo, it will receive no packets coming in through enp0s8, neither unicast nor multicast.

When specifying enp0s3 but not running with sudo (i.e. SO_BINDTODEVICE will fail), I see IPv6 multicast messages coming in on enp0s8, but not IPv4 multicast packets. I do see IPv4 unicast packets to port 5353 sent to enp0s8, but not IPv6 unicast packets sent to that port and interface.

I am at a loss as to why that is and what makes the behaviour differ in such a weird way between IPv4 and IPv6. Maybe I am missing something obvious to someone else.

Mapping run mode against packets coming in on interface enp0s8:

| Run as | IPv4 multicast | IPv6 multicast | IPv4 unicast | IPv6 unicast |
|---|:---:|:---:|:---:|:---:|
| `sudo mcrcv enp0s8` | ✔️ | ✔️ | ✔️ | ✔️ |
| `mcrcv enp0s8` | ✔️ | ✔️ | ✔️ | |
| `mcrcv` | | ✔️ | ✔️ | |
| `sudo mcrcv enp0s3` | | | | |
| `mcrcv enp0s3` | | ✔️ | ✔️ | |
fzs commented 1 year ago

Turns out there are two different mechanisms at play causing the result above.

The first explains the "IPv6 unicast" column. I had tested this with avahi-daemon running at the same time, which means that another process was already bound to port 5353 and the ANY address, with SO_REUSEADDR and SO_REUSEPORT set. The effect under Linux is that the process started later gets the IPv4 unicast packets, while the process started earlier gets all the IPv6 unicast packets. This is the case when the two processes do not run under the same effective UID; otherwise reception would alternate between the two. That was the case here, and thus my process got no IPv6 unicast packets.
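For context, a sketch of the option pair in question, applied to a hypothetical, already-created UDP socket:

```c
#include <sys/socket.h>

/*
 * Sketch: the socket options avahi-daemon sets on its 5353 socket,
 * which is what lets a second mDNS process bind the same port at all.
 * Both must be set before bind().
 */
static void allow_port_sharing(int sd)
{
    int on = 1;

    setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    setsockopt(sd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
}
```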

The other issue is due to a misconception on my part, and to how Linux behaves differently for IPv4 and IPv6 multicast packets in that case; it explains the "IPv4 multicast" column. I refactored the socket code out into a separate file. The code allows for no interface to be selected and will then use the "default" value for the interface when joining the multicast group. That is, it is up to the kernel to decide on which interface to join the multicast group. The kernel will use the one with the smaller index (in my case enp0s3). This means that when not specifying enp0s8, the multicast group is joined on interface enp0s3 (the case for the last three rows in the table).

This does not matter for IPv6 multicast packets, because any incoming multicast packets for that group will still be delivered to the socket. But for IPv4, only those packets are delivered to the socket that come in on the interface on which the socket joined the multicast group. In my case that means that when joining only on interface enp0s3, the IPv4 packets coming in on interface enp0s8 are not received on the socket.
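A sketch of the IPv4 join in question, using the Linux-specific ip_mreqn; passing an ifindex of 0 reproduces the kernel-picks-the-interface case:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Join the mDNS IPv4 group on a given interface.  With ifindex 0 the
 * kernel picks the interface, and IPv4 multicast is then only delivered
 * for that one interface, which is the effect described above.
 */
static int join_mdns4(int sd, int ifindex)
{
    struct ip_mreqn mreq;

    memset(&mreq, 0, sizeof(mreq));
    inet_pton(AF_INET, "224.0.0.251", &mreq.imr_multiaddr);
    mreq.imr_ifindex = ifindex;      /* 0: let the kernel choose */

    return setsockopt(sd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
}
```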

I had kind of been working on the assumption that, when not specifying an interface, all mDNS multicast packets should be received. (Which is true for the unicast ones.) But I see how this would be a problem when sending a packet, because the outgoing interface would have to be selected somehow. This is currently not really a problem: for mdnsd the interface is specified anyway, and for mquery a "default interface" is chosen by the program and specified when creating the socket, instead of leaving it up to the kernel. I am not entirely sure how useful this is, but at least it works.

troglobit commented 1 year ago

Huh, what a ride! I did not know about that behavior when running as different UIDs ...

The selection of interface sounds a bit weird though. I don't think the kernel actually goes by smaller ifindex; are you sure it didn't go by the routing table? It's IP networking, after all.

I mentioned SMCRoute and pim6sd before, but maybe my little mcjoin program is a better match for this particular case. Have a look at receiver.c and sender.c in https://github.com/troglobit/mcjoin/tree/master/src. I believe you may find useful IPv6 and IPv4 tips there (it doesn't run as root).

fzs commented 1 year ago

Oh, this is purely based on my observation. I did not go into the kernel code to check the actual decision behaviour; the manual pages for the socket options are not really helpful here. Can you merge PR #70 for me so that I can continue on this one?

troglobit commented 1 year ago

Ah, then I understand, that makes more sense. Sorry for missing PR #70; merged now.

troglobit commented 1 year ago

Any progress on this, @fzs?

fzs commented 1 year ago

No, unfortunately this had to take a backseat behind other issues for a while. But it is not forgotten.

troglobit commented 1 year ago

OK, great to hear! No worries, just contemplating whether I should do a release with what we have right now and postpone this for the next release after that ... any input from you on this would be great.

fzs commented 1 year ago

In the spirit of "release early, release often" I think a release with what is implemented now is fine. I would not hold back any release waiting for this PR to be finished. For one, because right now I cannot say with any certainty when I will be able to finish this. And also, I tried to split all changes into independent PRs so that a release would be possible at any time.

troglobit commented 1 year ago

Awesome, thank you for the reply! I'll start preparing for the release then :smiley: