nanoframework / Home

:house: The landing page for .NET nanoFramework repositories.
https://www.nanoframework.net

DNS lookup fails with SocketException #1008

Open jonmill opened 2 years ago

jonmill commented 2 years ago

Target name(s)

ESP32_REV0

Firmware version

1.8.0.6

Was working before? On which version?

N/A

Device capabilities

ESP32 (ESP32-D0WDQ6 (revision 1))
Features WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Flash size 4MB unknown from (manufacturer 0x216 device 0x16406)
PSRAM: not available
Crystal 40MHz

Description

I'm getting a SocketException during Dns.GetHostEntry that is traced to the NativeSocket::getaddrinfo call (call stack below). The DNS entry does exist and can be retrieved using this call in a normal .NET Core app.

++++ Exception System.Net.Sockets.SocketException - CLR_E_FAIL (1) ++++
    ++++ Message: 
    ++++ System.Net.Sockets.NativeSocket::getaddrinfo [IP: 0000] ++++
    ++++ System.Net.Dns::GetHostEntry [IP: 0008] ++++
    ++++ Testing.Program::Main [IP: 013f] ++++
Exception thrown: 'System.Net.Sockets.SocketException' in System.Net.dll
An unhandled exception of type 'System.Net.Sockets.SocketException' occurred in System.Net.dll

WiFi is connected with a valid IP and DateTime, and the address being looked up is infrastructure.local. A .NET Core app can look up the address fine, but the ESP32 cannot. There is no trace of the request hitting the DNS server.

// Connect to WiFi over DHCP and wait for a valid date/time before continuing.
bool connected = nanoFramework.Networking.WiFiNetworkHelper.ConnectDhcp(
    WIFI_NAME,
    WIFI_PWD,
    WiFiReconnectionKind.Automatic,
    requiresDateTime: true,
    wifiAdapter: 0);
if (!connected)
{
    throw new InvalidOperationException("Could not connect to WiFi");
}

// `now` was not defined in the posted snippet; reading the clock here is an assumption.
DateTime now = DateTime.UtcNow;
Console.WriteLine($"Current Time Post-Connect: {now.Month}/{now.Day}/{now.Year} {now.Hour}:{now.Minute}:{now.Second}");

How to reproduce

  1. Have a DNS server (e.g. Pi-hole) resolve a .local address as an A record pointing to a local server
  2. Have an ESP32 attempt to resolve the .local address via DNS

Expected behaviour

The DNS entry resolves to an IP address

Screenshots

No response

Additional information

No response

jonmill commented 2 years ago

Looks like this is a problem with .local domains, where the lookup is handled as mDNS and never reaches the DNS server.

Ellerbach commented 2 years ago

If my understanding is correct, mDNS is a bit different from DNS. It seems to have issues on most platforms. I found an "old" discussion on Stack Overflow: https://stackoverflow.com/questions/10244117/how-can-i-find-the-ip-address-of-a-host-using-mdns/35853322#35853322 It does point out issues with the ESP implementation.

The mDNS packet structure is described here: https://en.wikipedia.org/wiki/Multicast_DNS

torbacz commented 2 years ago

Not long ago, I was working with mDNS. I tried to set up a Home Assistant server with mDNS on board. Everything worked just fine from a Windows machine, but an Android phone was not able to resolve the address. I switched to normal DNS and it has been working since. Maybe it's not a nanoFramework issue? Maybe it's an ESP issue?

BTW, I doubt that an mDNS lookup will be visible on a normal DNS server. From Wikipedia: "When an mDNS client needs to resolve a hostname, it sends an [IP multicast](https://en.wikipedia.org/wiki/IP_multicast) query message that asks the host having that name to identify itself. That target machine then multicasts a message that includes its IP address."
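To make the multicast query in that quote concrete, here is a minimal sketch of the question packet an mDNS client would build for an A record. The class and method names (`MdnsQuery`, `BuildAQuery`) are hypothetical, and the code assumes an ASCII host name without a trailing dot. It only builds the bytes; a client would then send them over UDP to the mDNS multicast group 224.0.0.251 on port 5353, which is why a unicast DNS server such as Pi-hole would not see the request.

```csharp
using System;
using System.Text;

public static class MdnsQuery
{
    // Builds a DNS message with a single question: A record, class IN.
    // Assumes hostName is ASCII and has no trailing dot.
    public static byte[] BuildAQuery(string hostName)
    {
        string[] labels = hostName.Split('.');

        // 12-byte header + (1 length byte + text) per label
        // + 1 root terminator + 2 bytes QTYPE + 2 bytes QCLASS.
        int length = 12 + 1 + 4;
        foreach (string label in labels)
        {
            length += 1 + label.Length;
        }

        byte[] packet = new byte[length];

        // Header: ID = 0 and flags = 0 (standard query); QDCOUNT = 1.
        packet[5] = 1;

        int offset = 12;
        foreach (string label in labels)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(label);
            packet[offset++] = (byte)bytes.Length;
            Array.Copy(bytes, 0, packet, offset, bytes.Length);
            offset += bytes.Length;
        }

        packet[offset++] = 0; // root label ends the name
        packet[offset++] = 0; // QTYPE high byte
        packet[offset++] = 1; // QTYPE = 1 (A)
        packet[offset++] = 0; // QCLASS high byte
        packet[offset++] = 1; // QCLASS = 1 (IN)

        return packet;
    }
}
```

For example, `MdnsQuery.BuildAQuery("infrastructure.local")` encodes the name as the length-prefixed labels `infrastructure` and `local`, followed by a zero byte.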

Ellerbach commented 2 years ago

> Maybe it's not nanoFramework issue? Maybe it's ESP issue?

It is most likely an ESP issue, judging by the comments from other sources. Still, let's keep this issue open until the root cause is properly identified or there is no more interest. And thanks for the additional info as well!

DaveSchmid commented 1 year ago

I've had exactly the same problem. It has nothing to do with .local; it also happens with .com. I believe it is due to a DNS request timeout that is too short, and it depends on the hardware. I have several ESP32 DevKitC V4 boards. The same code works on some and throws an exception on others.

josesimoes commented 1 year ago

Currently we don't expose any setting related to the DNS execution timeout, nor do we change any of the lwIP defaults for it.

DaveSchmid commented 1 year ago

The timeout idea is just a feeling of mine. A DNS query can take a certain amount of time depending on the DNS server, but the exception comes very quickly, too quickly. I would like to record the network traffic, but currently it works without an error.
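One way to gather evidence for or against the timeout theory is to time each lookup and retry after a pause. The following is only a sketch: `DnsRetry`, `Lookup`, and the attempt and delay parameters are hypothetical names introduced here, not part of nanoFramework. A failure reported after only a few milliseconds would suggest something other than a network timeout.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

public static class DnsRetry
{
    // Hypothetical delegate so the lookup can be replaced, e.g. by a fake in a test.
    public delegate IPHostEntry Lookup(string hostName);

    // Calls lookup up to `attempts` times, logging how long each failed call took.
    // Rethrows the last SocketException if every attempt fails.
    public static IPHostEntry Resolve(string hostName, int attempts, int delayMs, Lookup lookup)
    {
        for (int attempt = 1; ; attempt++)
        {
            DateTime start = DateTime.UtcNow;
            try
            {
                return lookup(hostName);
            }
            catch (SocketException)
            {
                TimeSpan elapsed = DateTime.UtcNow - start;
                Console.WriteLine("Attempt " + attempt + " failed after "
                    + elapsed.TotalMilliseconds + " ms");

                if (attempt >= attempts)
                {
                    throw;
                }

                // Pause before the next attempt.
                Thread.Sleep(delayMs);
            }
        }
    }
}
```

A call such as `DnsRetry.Resolve("infrastructure.local", 3, 500, Dns.GetHostEntry)` would try three times with half a second between attempts.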

josesimoes commented 1 year ago

I can't reproduce this... I always get the resolved address if it exists and an exception if it doesn't.

Worth pointing out that a SocketException is the expected outcome when pretty much anything fails, so it may be hard to tell whether this is caused by a timeout or by the resolution failing for a particular address. Until we have a consistent way of reproducing this, it will be hard to even start looking at what could be wrong. Considering that we have no other reports about this, I'm considering filing this under "can't reproduce: won't fix"...

https://learn.microsoft.com/en-us/dotnet/api/system.net.dns.gethostentry?view=net-7.0#system-net-dns-gethostentry(system-string)
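Since the same exception type covers every failure, recording its `ErrorCode` alongside the host name may help tell cases apart when someone does manage to reproduce this. A minimal sketch follows; `DnsProbe` and `Describe` are hypothetical names, and whether the firmware fills in a meaningful code for this failure is an assumption (the trace above shows an empty message).

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class DnsProbe
{
    // Resolves hostName and returns a one-line summary of the outcome.
    public static string Describe(string hostName)
    {
        try
        {
            IPHostEntry entry = Dns.GetHostEntry(hostName);
            return hostName + " -> " + entry.AddressList.Length + " address(es)";
        }
        catch (SocketException ex)
        {
            // ErrorCode carries the underlying socket error number, if one was set.
            return hostName + " failed, ErrorCode=" + ex.ErrorCode + ", Message=" + ex.Message;
        }
    }
}
```

Printing `DnsProbe.Describe("infrastructure.local")` on both a working and a failing board would give two lines that can be compared directly.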