troglobit / mdnsd

Jeremie Miller's original mdnsd
BSD 3-Clause "New" or "Revised" License
55 stars 35 forks source link

Fix access to freed iface #74

Closed MonkeyBreaker closed 1 year ago

MonkeyBreaker commented 1 year ago

Hi,

I had an issue on my system that from time to time mdnsd was crashing. After debugging, I found out that the reason was because performing a mdnsd_step on a freed iface was dereferencing the iface->mdns->disco.

The reason I observed this on my system is because from time to time one of my network interfaces goes down and is up again after a couple of minutes.

And, when the interface went down, the following mdnsd_step returned an error that then calls free_iface(iface).

When this function is called, it frees the memory allocated to the iface, but not the iface itself. When not freed, this iface is also still present in the internal list used to store the ifaces, see here.

This fix calls iface_freein function free_iface, I don't know if some renaming should be put in place here, can be a bit confusing.

One additional point I did not yet add because I wanted some feedback, is to reload = 1; after calling free_iface here. The reason for me would be to not remove forever an interface that we only had a network issue, we therefore add it again if reload=1.

PS: Not able to run the test, always get a FAIL after calling mquery.

BTW, I run mdnsd -n usually on my system.

troglobit commented 1 year ago

Hi, this looks like an obvious fix, but I'm curious exactly how to recreate the crashes you see. What version of mdnsd are you running, and what does "interface went down" mean? Because I cannot reproduce this locally with the latest git master, tried both link down and ifconfig eth0 down.

MonkeyBreaker commented 1 year ago

Hi,

I am using latest master, ea16d67.

When I observed it on my system, the network interface is up e.g. eth0, then I start mdnsd, mdnsd -n, and after like a couple of minutes, the network interface is deleted (it is some C code, but should be like ip link delete eth0), here I am not sure how much it should wait before adding again the interface, but at least 2-3 min.

Then the interface is add it again, I think the important point is to have the same name, ip link add eth0 (something similar to this, as I said, those interfaces are added from some C code). And then, I observe that mdnsd_step returns an error, and this creates the issue mentioned previously.

Hope this helps, at least this is what I observe on my system.

troglobit commented 1 year ago

Ah, of course, deleting the interface. That makes sense. Thank you!

I'll write a test for this case, verify your patch and look into the reload mech.

If I get some time over the holidays I'll also look into issue #48 to see if we can handle interface add/del better than using the reload flag.

troglobit commented 1 year ago

Reproduced! :partying_face:

MonkeyBreaker commented 1 year ago

Hey,

Good news then, I see that on my first description, it was not straightforward that I mean delete and not shutting down the interface.

If I get some time over the holidays I'll also look into issue https://github.com/troglobit/mdnsd/issues/48 to see if we can handle interface add/del better than using the reload flag.

I don't know if I have the time to look on that, but it would definitely be a nice idea.

About the regression test, I can have a look to implement it, or of course if you feel of doing it, I won't hold back 😄 .

troglobit commented 1 year ago

No worries, it's really hard to do bug reports! :-)

I'll do the regression test, and when I get it working (failing that is), I'll merge your PR.

Issue #48 is more of an optimization, so not really needed, just nice to have. So you don't have to spend time on it, I was just connecting the dots. If you do have time, however, I'd really appreciate your retesting all this when it's been merged.