microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

DNS Resolution stops working on a machine having a local cluster #610

Open anantshankar17 opened 5 years ago

anantshankar17 commented 5 years ago

This article describes the underlying problem and possible workarounds. We are working to fix this issue. Developers have been reporting broken DNS resolution on machines that have SF clusters deployed. This only affects wifi adapters, because they cache the network settings for each SSID.

  1. The network adapter is connected to Wifi-A. Both IPAddress and DNSServer are auto-assigned by DHCP.
  2. In the registry, under this adapter (Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces), the network stack (DHCP Client) caches the connection settings for this wifi network.
  3. When the SF dev cluster is set up, we save the DHCP IP address in the SF registry and plumb the same value as the DNSAddress, so that all resolutions go to the DNS service in the SF cluster.
  4. The DHCP client caches the network settings for this wifi network in the registry, so the NameServer value set by SF is saved along with them.

User moves to Wifi-B (or hibernates the laptop and reconnects).

  1. The user removes or stops the cluster, or a reboot causes the cluster to stop. This restores the adapter to its original state by clearing the NameServer added by SF. The cluster is now no longer monitoring the adapter for change notifications, and the cluster DNSService is not running either.
  2. The user reconnects to Wifi-A, and the previously cached settings from the registry are plumbed onto the adapter again. This NameServer may no longer be a valid address, and with no DNSService running on this machine, DNS resolution stops working.
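
The sequence above can be replayed with a small toy model. This is only a sketch to make the ordering of events easier to follow: it does not touch Windows or Service Fabric, the `Adapter` class, `dns_works` and `replay_issue` are hypothetical names, the IP addresses are documentation-only placeholders, and the caching rule (one cached NameServer per SSID) is a simplification of what the DHCP client does.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Adapter:
    """Toy wifi adapter with a per-SSID settings cache (hypothetical model)."""
    ssid: Optional[str] = None
    ip: Optional[str] = None
    # None means "DNS server is auto-assigned by DHCP".
    name_server: Optional[str] = None
    # SSID -> NameServer value that was cached for that network.
    cache: Dict[str, Optional[str]] = field(default_factory=dict)

    def connect(self, ssid: str, dhcp_ip: str) -> None:
        """Join a network; any NameServer cached for this SSID is re-applied."""
        self.ssid = ssid
        self.ip = dhcp_ip
        self.name_server = self.cache.get(ssid)

    def set_name_server(self, value: Optional[str]) -> None:
        """Set or clear NameServer; only the current SSID's cache is updated."""
        self.name_server = value
        if self.ssid is not None:
            self.cache[self.ssid] = value


def dns_works(adapter: Adapter, cluster_running: bool) -> bool:
    """Assumption: a static NameServer only answers if it points at this
    machine's current IP and the cluster DNS service is running."""
    if adapter.name_server is None:
        return True
    return cluster_running and adapter.name_server == adapter.ip


def replay_issue() -> Adapter:
    adapter = Adapter()
    adapter.connect("Wifi-A", "192.0.2.10")    # steps 1-2: DHCP on Wifi-A
    adapter.set_name_server(adapter.ip)        # steps 3-4: cluster setup, value cached
    adapter.connect("Wifi-B", "198.51.100.5")  # user moves to Wifi-B
    adapter.set_name_server(None)              # cluster stopped: NameServer cleared
    adapter.connect("Wifi-A", "192.0.2.23")    # back on Wifi-A with a new DHCP lease
    return adapter


if __name__ == "__main__":
    a = replay_issue()
    print("NameServer:", a.name_server, "current IP:", a.ip)
    print("DNS works:", dns_works(a, cluster_running=False))
```

In this model the clean-up on Wifi-B only clears the entry cached for Wifi-B, so the old value cached for Wifi-A comes back on reconnect and points at an address where nothing is listening.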

Workaround:

  - Remove the cached entries under the key for the wifi adapter under this path, as shown in the image below: Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces.
  - Reset the DNS address to auto-assign in the network adapter settings in ncpa.cpl.

Capture (2)
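
To find which entries to look at before editing anything by hand, a read-only helper along these lines may be useful. It is a minimal sketch, not an official tool: `find_static_name_servers` and `read_entries` are hypothetical names, and walking every subkey under the Interfaces path is an assumption about where the cached per-network settings live. It only lists keys with a non-empty NameServer value and does not modify the registry.

```python
import sys
from typing import Dict

# Path from the workaround above, relative to HKEY_LOCAL_MACHINE.
INTERFACES_PATH = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"


def find_static_name_servers(entries: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Return {key path: NameServer} for entries with a non-empty NameServer."""
    return {
        path: values["NameServer"].strip()
        for path, values in entries.items()
        if values.get("NameServer", "").strip()
    }


def read_entries(path: str = INTERFACES_PATH) -> Dict[str, Dict[str, str]]:
    """Windows only: collect NameServer values from `path` and all its subkeys."""
    import winreg  # imported here so the module still loads on other platforms

    entries: Dict[str, Dict[str, str]] = {}

    def walk(subpath: str) -> None:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subpath) as key:
            try:
                value, _value_type = winreg.QueryValueEx(key, "NameServer")
                entries[subpath] = {"NameServer": str(value)}
            except FileNotFoundError:
                pass  # this key has no NameServer value
            subkey_count = winreg.QueryInfoKey(key)[0]
            children = [winreg.EnumKey(key, i) for i in range(subkey_count)]
        for child in children:
            walk(subpath + "\\" + child)

    walk(path)
    return entries


if __name__ == "__main__" and sys.platform == "win32":
    for key_path, name_server in find_static_name_servers(read_entries()).items():
        print(f"{key_path}: NameServer={name_server}")
```

Any key it prints is a candidate for the clean-up described in the workaround; whether a given value is actually stale still has to be judged by comparing it with the machine's current IP and DNS settings.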

bkonicek-cs commented 4 years ago

Are there any updates on this? I saw that it appeared to be fixed in 6.5 CU3, but after completely removing my previous installation and cleanly installing 7.0, it is still hardcoding two DNS servers onto my network adapters. Whenever I leave work and try to use my laptop at home, I have to manually set its configuration back to automatic.

adamvanaken commented 4 years ago

Also curious for an update here; I ran into the problem today. The NameServer and ServiceFabricNameServer values had my machine's previous local IP saved, so when my network assigned me a new IP, nothing was responding and the DNSClient took off.

zmhh commented 3 years ago

Just adding another report: sometimes the internet (DNS resolution) on my machine stops working, and I have to shut down the local SF cluster and then reset the adapter settings to obtain DNS automatically, at which point the internet works again.

YairoR commented 2 years ago

Same here - as soon as I start the SF Emulator, my internet stops working and I need to reset the DNS configuration.