Closed obdevel closed 1 year ago
I never looked into the MDNS code so this might be a tricky one. It's taken verbatim from the ESP8266 core where I think I remember a similar issue.
@d-a-v, sorry to bring you in, but didn't you just look into something similar and found it was actually per-spec?
I tried a couple of previous versions to see if I could find a regression. Both 2.7.0 and 2.7.3 work correctly for at least 30 mins, so it seems that the regression occurred in 3.0.0, perhaps in the underlying network code or elsewhere. I'll leave the 2.7.3 test running overnight.
There is nothing in 3.0.0 that I urgently need, so I can stick with 2.7.3 for now.
There was just a change to the LWIP stack which may have an impact here (but I'm not equipped to test). I'd recommend trying 3.1.0 when it comes out. I'll close this and we can revisit if needed.
In my case (running version 3.0.0) LEAMDNS doesn't stop responding. It didn't respond in the first place.
When LEAMDNS starts, it announces itself to the network ("Hey, I am here"). Other MDNS hosts on your network cache this information for 2 minutes. In my case it was Bonjour on Windows doing this. You can check by opening a Windows command window and running dns-sd -Q yourhostname.local. It will keep running and show the domains being added, and after 2 minutes you can see the entry being removed. With version 2.7.3 the cache duration seems much longer.
Either way, Android will not find "yourhostname.local" because that reply must come from the matching host or from the local MDNS cache on the phone. Since the host can't hear the multicast messages and the phone has no local cache entry, the IP address is not found.
All your tests were probably done on the same machine, which made it appear that LEAMDNS replied. It didn't; it was Bonjour returning cached data.
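The caching behaviour described above can be sketched as a small time-to-live cache. This is only an illustration, not Bonjour's actual implementation: the class name MdnsCache, its methods, and the injectable clock are hypothetical, and the 120-second lifetime is taken from the 2 minutes observed in this thread. Real resolvers also re-query before a record expires, which this sketch leaves out.

```python
import time

# Assumption: 120 s matches the "2 minutes" observed with dns-sd in this thread.
HOST_RECORD_TTL = 120


class MdnsCache:
    """Hypothetical model of a resolver cache fed only by announcements."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be simulated
        self._entries = {}       # name -> (address, expiry time)

    def announce(self, name, address, ttl=HOST_RECORD_TTL):
        """Store what a device announced about itself at start-up."""
        self._entries[name] = (address, self._clock() + ttl)

    def lookup(self, name):
        """Return the cached address, or None once the record has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            # Expired: from here on only a live reply from the device helps.
            del self._entries[name]
            return None
        return address
```

If the device announces once and then never hears a query, lookups on the machine that saw the announcement succeed until the record expires and fail afterwards, while a machine that joined later never has an entry at all.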
You can also run sudo tcpdump dst 224.0.0.251 and udp and ip and port 5353 on a Linux machine to see what is flying around the network in terms of MDNS.
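To check whether the device itself answers, rather than a local cache, one option is to send a raw query straight to the multicast group. The following is a minimal sketch using only the Python standard library; the function names build_mdns_query and query_once are made up for illustration. It assumes the responder answers queries sent from an ephemeral source port with a unicast reply, so a timeout is a hint, not proof, that the device is not hearing multicast traffic.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"  # IPv4 mDNS multicast group, as in the tcpdump filter
MDNS_PORT = 5353


def build_mdns_query(hostname: str) -> bytes:
    """Build a single-question DNS query for the A record of hostname."""
    # Header: ID=0, flags=0 (standard query), QDCOUNT=1, other counts 0.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    qname = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("utf-8")
        if not 0 < len(raw) < 64:
            raise ValueError(f"invalid DNS label: {label!r}")
        qname += bytes([len(raw)]) + raw  # length-prefixed label
    qname += b"\x00"                      # root label ends the name
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!2H", 1, 1)


def query_once(hostname: str, timeout: float = 3.0):
    """Send one query to the mDNS group; return the replying IP or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 255)
        sock.settimeout(timeout)
        sock.sendto(build_mdns_query(hostname), (MDNS_GROUP, MDNS_PORT))
        try:
            _data, (sender_ip, _port) = sock.recvfrom(1500)
        except socket.timeout:
            return None  # nobody answered within the timeout
        return sender_ip
    finally:
        sock.close()
```

Calling query_once("yourhostname.local") a few minutes after the device has booted should return its IP address if it is still receiving multicast queries, and None otherwise.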
I believe the Raspberry Pi Pico W cannot receive multicast packets due to an issue that is way beyond my comprehension. I am so glad I found this post. My entire project depends on the ability to use MDNS.
Earle, is there a way I can test the change to the LWIP stack now?
I had a similar experience to @dinther. I happened to update to 3.0.0 and then started adding LEAMDNS for the first time, and got no response. After downgrading to 2.7.3 I started getting a response.
So why does name resolution continue to work after 120 secs in 2.7.3 (and earlier) but not in 3.0.0? The clients are identical in all test cases: macOS and iOS. I don't have access to Windows or Android.
@dinther thanks for the detailed explanation. I think this is all related to the new CYW43 binary blob. It appears to block all multicast MACs by default now (whereas SDK 1.4's passed them thru). LWIP isn't even made aware of them, they're just dropped inside the WIFI chip.
I've gone thru Wikipedia and found the 2 MACs that MDNS seems to use for m-cast and added manual calls to the new APIs in #1290. Can you give that a try? I've had the PicoW up for 800 seconds now and avahi-discover is still showing cbusserver for me, so AFAICT it's now working, but I am by no means an AVAHI (or Ethernet or LWIP) expert so someone else's validation would be much appreciated.
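For reference, here is a small sketch of how a multicast IP address maps to the Ethernet MAC a WiFi chip would need to let through. It assumes the two MACs mentioned above are the standard mappings of the mDNS groups 224.0.0.251 (IPv4) and ff02::fb (IPv6); the thread does not list them, and the function name multicast_mac is hypothetical. It does not show the CYW43 calls used in #1290.

```python
import ipaddress


def multicast_mac(addr: str) -> str:
    """Map a multicast IP address to its Ethernet multicast MAC address."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:
        raise ValueError(f"{addr} is not a multicast address")
    if ip.version == 4:
        # IPv4: fixed prefix 01:00:5e followed by the low 23 bits of the IP.
        low = int(ip) & 0x7FFFFF
        octets = [0x01, 0x00, 0x5E,
                  (low >> 16) & 0xFF, (low >> 8) & 0xFF, low & 0xFF]
    else:
        # IPv6: fixed prefix 33:33 followed by the low 32 bits of the IP.
        low = int(ip) & 0xFFFFFFFF
        octets = [0x33, 0x33,
                  (low >> 24) & 0xFF, (low >> 16) & 0xFF,
                  (low >> 8) & 0xFF, low & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)
```

With this mapping, multicast_mac("224.0.0.251") gives 01:00:5e:00:00:fb and multicast_mac("ff02::fb") gives 33:33:00:00:00:fb. A chip that filters multicast MACs would drop frames to those destinations before LWIP ever sees them unless they are explicitly allowed.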
(Also, if you pull #1290 then you will get the stability-improved LWIP stack automatically, since that was merged to master. That's probably not related to this, as it was causing a complete crash or hang in certain cases, not the odd "seems like it's working but doing nothing" behavior described here.)
I'll try, Earle, but I am incredibly out of my depth here. I know how to use the Arduino IDE and that is the extent of it.
No worries, @dinther. Like I said, you explained very well what was going on at the low level. Made it easy to figure out the underlying problem!
I have just done a test where I ran the test case in the initial post with the name "oldcbusserver", waited 10 mins, then turned on a PC and tried to ping "oldcbusserver.local" and failed (like you saw). Turned off the PC. Then I checked out #1290, changed the name to "newcbusserver" and waited 10 more mins. Turned on the PC and tried to ping "newcbusserver.local" and it worked.
So I think we're good. I'll merge it and 3.1 will be good to go.
MDNS seems to stop responding after a couple of minutes. The sketch continues to run and responds to pings to the IP address. MDNS.isRunning() continues to return true. I've tried from both my MacBook and iPhone (on a larger program with a webserver).
My minimum test sketch:
I'm testing from the macOS command line with:
Anything I'm doing wrong? Any debug I can try?
Core version 3.0.0 Arduino IDE 1.8.19
Thanks.