microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

DNS issues in WSL2 #8365

Open OneBlue opened 2 years ago

OneBlue commented 2 years ago

Version

Multiple Windows builds are affected

WSL Version

This issue is here to merge DNS related issues in WSL2.

Symptoms include:

This issue does not cover scenarios where /etc/resolv.conf is manually edited.

If you're hitting this, please upvote / comment and upload logs

stijnherreman commented 2 years ago

@OneBlue I've posted repro steps in #8236 for one of the causes.

lbarbaglia commented 2 years ago

Hi, I'm having the exact same issue so I've collected some logs in case it can help: WslLogs-2022-05-10_16-27-14.zip

Even modifying the /etc/resolv.conf is not working anymore.

CraigHutchinson commented 2 years ago

I am getting this issue on a fresh installation of Windows 11 with the WSL2 Ubuntu image. Really annoying issue!

[WSL] sudo apt update = ... Temporary failure resolving 'archive.ubuntu.com' ...
[WSL] cat /etc/resolv.conf = ... nameserver 172.23.48.1
[WSL] ping 172.23.48.1 = From 172.23.62.236 icmp_seq=3 Destination Host Unreachable
[WSL] ping google.com = ping: google.com: Temporary failure in name resolution
[Windows] ping 172.23.48.1 = Reply from 172.23.48.1: bytes=32 time<1ms TTL=128
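The checks above can be scripted from inside the distribution. The following is a minimal sketch, not an official diagnostic: the helper names `parse_nameservers`, `can_resolve`, and `report` are hypothetical, and it assumes Python 3 is installed in WSL. It lists the nameserver entries found in /etc/resolv.conf and tries to resolve a host name.

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (resolv.conf allows '#' and ';').
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def can_resolve(hostname):
    """True if the system resolver returns at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def report(path="/etc/resolv.conf", hostname="archive.ubuntu.com"):
    """Print the configured nameservers and whether `hostname` resolves."""
    with open(path) as handle:
        print("nameservers:", parse_nameservers(handle.read()))
    print(hostname, "resolves:", can_resolve(hostname))
```

Calling `report()` inside WSL prints the same two pieces of information as the `cat /etc/resolv.conf` and `ping google.com` steps above; it does not test whether the nameserver itself is reachable.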

Attached are the logs. WslLogs-2022-05-17_10-17-13.zip

NOTE: On Windows 11 I got this error when running the capture, so the logs may be incomplete? [image]

r2evans commented 2 years ago

@CraigHutchinson, your comment appears to mimic what I'm seeing, where the problem is somehow in the routing and not just the name resolution. Have you found any workarounds?

MikaelUmaN commented 2 years ago

#4285 was already tracking this. I consider this issue a /dupe of #4285

BtbN commented 2 years ago

There were multiple open issues, all about the functionally same issue. Hence, as the initial description says, this exists to merge and declutter them.

dlaudams commented 2 years ago

There were multiple open issues, all about the functionally same issue. Hence, as the initial description says, this exists to merge and declutter them.

If this leads to a fix, this is a great outcome.

However, the way it was handled may alienate the community, i.e., closing all the related issues without discussion or a clear reason provided in those issues.

unowiz commented 2 years ago

It might be to do with Windows Defender settings. resolv.conf and wsl.conf based approach didn't work for me. sudo apt update && sudo apt upgrade worked immediately after I turned off the Private network firewall. Once the update completed, I've put the firewall for private network back on.

On Windows 11, go to Windows Security (from the system tray, right-click the Windows Security icon and select "View security dashboard", or simply search for "Firewall and network protection" after you press the Windows key). Within the Firewall and network protection page, you should see Domain network (if domain connected), Private network, Public network. Go to the private network and turn it off temporarily as a workaround. Hope this helps.

Shellishack commented 2 years ago

I may have found another way to fix this. Originally I had this problem after using a proxy software. I just edited resolv.conf. It worked well until I realized that I also couldn't ping to Windows from WSL.

For some reason, the vEthernet (WSL) adapter on my PC was treated as a public network. Disabling public firewall or turning off the option "block all incoming connections, including those in the list of allowed applications" in Control Panel fixed everything. I also attempted to change its connection profile to private using PowerShell, but Get-NetConnectionProfile can't even find it while both ipconfig and Get-NetIPconfiguration can display some limited info about it.

zugazagoitia commented 2 years ago

It might be to do with Windows Defender settings. resolv.conf and wsl.conf based approach didn't work for me. sudo apt update && sudo apt upgrade worked immediately after I turned off the Private network firewall. Once the update completed, I've put the firewall for private network back on.

On Windows 11, go to Windows Security (from the system tray, right-click the Windows Security icon and select "View security dashboard", or simply search for "Firewall and network protection" after you press the Windows key). Within the Firewall and network protection page, you should see Domain network (if domain connected), Private network, Public network. Go to the private network and turn it off temporarily as a workaround. Hope this helps.

This seems to be a fix for me too, Windows Firewall must be blocking DNS queries originating inside the WSL VM from reaching the DNS server at the host.

Ray-Barker commented 2 years ago

Tried to disable Windows Defender Firewall on Windows 10, doesn't help. Tried manually editing /etc/resolv.conf in my Ubuntu 20.04 WSL2 by adding 8.8.8.8 and 1.1.1.1, it helps, but these servers don't work in our VPN. What helped me as a workaround was adding my router's IP as a nameserver to resolv.conf since it has DNS server capability. But I would like a more generalized solution.

mbwhite commented 2 years ago

Windows 10 with Ubuntu 20 in WSL2: got some reproducible failures today for the first time, and it's confirmed something I've suspected but never been able to prove: that there might be a connection with running the docker daemon.

Everything is working correctly (as far as DNS goes); after I start the docker daemon (just a plain sudo dockerd), the 'temporary failure' error occurs.

Logs attached. WslLogs-2022-06-08_16-56-39.zip

jikuja commented 2 years ago

For me https://github.com/microsoft/WSL/issues/7555 gave really good pointers for fixing the issue.

Fixes that work for me:

I cannot recommend either of those to anyone because the first solution just breaks security and the second one might open some vulnerabilities.

AlexHunterCodes commented 2 years ago

My vEthernet (WSL) connection on a fresh Windows 11 install came with a Public profile too. I normally have "Blocks all incoming connections, including those in the list of allowed apps" enabled in the Windows Defender Firewall for untrusted networks, but I had to disable it to fix DNS resolution in WSL2.

The WSL2 Hyper-V virtual switch is an internal one and is not shared with your host adapter, so theoretically it shouldn't be a security issue for this network to be assigned a Private profile instead of a Public one.

That said, I don't see how I can change it since the adapter doesn't show up in Network and Sharing Centre or Settings, and it doesn't show up in the registry (Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\NetworkList\Profiles) either.

BtbN commented 2 years ago

Can you change it via Set-NetConnectionProfile in an elevated PowerShell prompt?

MatthewSkingley commented 2 years ago

Turning off Bluetooth and Wi-Fi hotspot on my laptop worked this time, sometimes restarting LxssManager works

AlexHunterCodes commented 2 years ago

Can you change it via Set-NetConnectionProfile in an elevated PowerShell prompt?

No, networks attached to vEthernet interfaces for Hyper-V internal virtual switches don't appear as valid networks in Get-NetConnectionProfile. Also doesn't work if you specify the vEthernet adapter by name.

Mithras commented 2 years ago

I've decided to give Docker 4.x another try to see if WSL2 is still busted and tried to do what was suggested in https://github.com/microsoft/WSL/issues/4285#issuecomment-1180567785 I've been using it for a couple of days already and everything seems to be fine so far. So for me the issue was not firewalls, VPNs, etc. It was because WSL2 doesn't like the default Docker network bridge. See https://docs.docker.com/network/bridge/#configure-the-default-bridge-network

BtbN commented 2 years ago

If docker really uses the same subnet as WSL2 does, it's not surprising that it breaks stuff. Though for me that was never an issue, so maybe there's some randomness or auto-detection to the network WSL2 and/or Docker uses?

r2evans commented 2 years ago

@mateusz91t, just curious ... why use tee there? You are explicitly discarding its sole purpose, which is to send to both the output and a file, by redirecting output to /dev/null, so why do that when echo "nameserver 8.8.8.8" > /etc/resolv.conf would be far simpler and more direct? (FYI, this hasn't worked for me when a wifi change caused WSL2 to stop working correctly. For me it is a routing problem, not a name-resolution problem. That is, name resolution will report failing when routing doesn't work, but that doesn't mean that DNS is the core problem.)
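The routing-versus-DNS distinction made above can be sketched as a two-step probe. This is an illustrative sketch only: the function names are hypothetical, and probing TCP port 53 on a public resolver such as 8.8.8.8 is an assumption that may not hold behind a VPN or a restrictive firewall. The idea is to first test reachability of an IP literal (no name lookup involved) and only then test name resolution.

```python
import socket


def tcp_reachable(ip, port=53, timeout=2.0):
    """True if a TCP connection to an IP literal succeeds (no DNS lookup needed)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def name_resolves(hostname):
    """True if the system resolver returns at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def diagnose(ip_reachable, resolves):
    """Classify the failure: routing problems mask DNS, so check routing first."""
    if not ip_reachable:
        return "routing"
    if not resolves:
        return "dns"
    return "ok"
```

For example, `diagnose(tcp_reachable("8.8.8.8"), name_resolves("archive.ubuntu.com"))` returns "routing" when nothing outside the VM can be reached, in which case editing /etc/resolv.conf alone would not be expected to help.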

mateusz91t commented 2 years ago

@r2evans, it is a workaround that helps me. If you use VPN too, try it please. https://github.com/sakai135/wsl-vpnkit Found in this issue: https://github.com/microsoft/WSL/issues/5068

jlukic commented 2 years ago

@mithras How did you configure the docker bridge network to get things to play nice with WSL2?

jikuja commented 2 years ago

The problem solved itself for me without changing any IPs.

I remember seeing that docker's default bridge interface was overlapping with vEthernet (WSL) device subnet. After a few reboots and weeks of waiting subnets did not overlap anymore and I could remove the firewall rule changes I described here earlier.

Sadly I did not save logs/screenshots of the IP allocation when firewall was dropping DNS requests.

If the subnet of vEthernet (WSL) is randomly changing, that might explain why only some Docker users have this problem.

The next step probably would be getting someone to check how vEthernet (WSL) subnet allocation works/is supposed to work. @OneBlue do you know anyone?

isaac-infotrend commented 2 years ago

My fix was to run the stock Windows 10 network reset feature, found by searching for it in the start menu. After that, run wsl and it will reinitialize the virtual switch for you and work just fine.

Noteworthy:

ecourtial commented 1 year ago

Same issue.

WSL2 worked perfectly and on the first attempt on 2 of our 4 PCs running Windows 11. On the two others, we have this DNS issue. We tried everything among the billions of various solutions described on the Web... No result.

keith-horton commented 1 year ago

A couple of comments for some of the issues being described here.

There's a known issue where the Firewall Rules necessary to allow the DNS requests to be proxied are incorrect, and thus block DNS requests from the WSL container. We have put a fix for this in the next WSL release: https://github.com/microsoft/WSL/releases/tag/0.70.5

Secondly, there's a known Firewall configuration which will always block proxied DNS requests from the WSL container: this is the "BlockAllInbound" setting on network profiles. You can see if this is set by opening "Windows Security" - clicking "Firewall & network protection" -- then clicking on one of the 3 Network Profiles which is being applied to your connected network adapters (Domain, Public, or Private). That page lists a setting, "Blocks all incoming connections, including those in the list of allowed apps." ---> if this is checked, then DNS requests will not be proxied from the WSL container.

MarcoGorelli commented 1 year ago

In my case, this was due to Norton's Smart Firewall - turning that off resolved the issue

jikuja commented 1 year ago

@keith-horton Did anyone investigate how many DNS problems have been caused by usage of Docker? There have been a few reports that the issue is triggered as soon as the docker daemon has been started:

Is it even a sane use case to have overlapping subnets on vEthernet (WSL) and the docker default bridge?

cr2007 commented 1 year ago

Update: Now the issue seems to have been resolved in the latest Windows update under the 2022-10 Cumulative Update for Windows 11 Version 22H2 for x64-based Systems (KB5018496)

I am now able to do sudo apt-get update and code . to open VS Code via the server.

vbrozik commented 1 year ago

Will fixes like this and updates be released for WSL on Windows 10? Many enterprises will be using Windows 10 for a long time to come.

keith-horton commented 1 year ago

Is it even a sane use case to have overlapping subnets on vEthernet (WSL) and the docker default bridge?

I'm not super familiar with Docker. Having overlapping private NAT'd subnets that both need to get NAT'd (probably both by the WinNAT driver in Windows) is likely going to create issues. I'll need to follow up with folks on Docker.

chrisjsmith commented 1 year ago

After battling with this for over two years, I have dumped WSL entirely and gone back to VirtualBox + Debian + PuTTY.

We have no chance of getting Windows 11 and we don't really want it anyway so the fix target is a joke.

I'm not sure why I even bothered at this point. It's just been pain and friction, for what?!?

Edit: also the handling, as per MSFT policy, is terrible. This was raised in #4285 which was closed in favor of this.

fredxia commented 1 year ago

See if this link helps. https://github.com/microsoft/WSL/issues/4285#issuecomment-1180567785

Basically a change to /etc/docker/daemon.json:

{
    "bip" : "10.10.0.1/16"
}

jordansissel commented 1 year ago

From https://github.com/microsoft/WSL/issues/8365#issuecomment-1298001836,

seems to have been resolved in ... 2022-10 Cumulative Update for Windows 11 Version 22H2 for x64-based Systems (KB5018496)

@cr2007 I wonder what changed on your system to make things work again. A bug fix in the firewall rules hasn't been released yet. I just received this update (KB5018496) and the problem is not yet resolved.

I see WSL 0.70.5 is not yet available, so I'll wait until then to try again.

noraab commented 1 year ago

Can anyone give me a hint how to check the WSL version currently installed?

stijnherreman commented 1 year ago

@noraab run wsl --version in PowerShell.

PS C:\Users\stijn> wsl --version
WSL version: 0.70.5.0
...

noraab commented 1 year ago

@stijnherreman thank you for your reply. Unfortunately, that prints the help-text on my system. Could it be, that wsl --version only works on Windows 11? I'm on Windows 10.

BtbN commented 1 year ago

I think you need the Windows Store version of WSL2. Not sure if that's available on Windows 10.

fredxia commented 1 year ago

I think you need the Windows Store version of WSL2. Not sure if that's available on Windows 10.

I'm running WSL2 on Windows 10 Pro.

noraab commented 1 year ago

I think you need the Windows Store version of WSL2. Not sure if that's available on Windows 10.

Thanks, @BtbN. You're right, the Windows Store version is required to run wsl --version. It is only available on Windows 11: although at first it looks like one can install it on Windows 10, it fails when you try to run it after installation.

LordVeovis commented 1 year ago

In my case, this was due to a network overlap between the WSL network and the Docker subnets.

Fixed it by editing /etc/docker/daemon.json like this and choosing an address pool that was outside of what has been set for the WSL subnet on my workstation:

{
        "default-address-pools": [
                {
                        "base": "172.28.128.0/20",
                        "size": 26
                }
        ],
        "userland-proxy": false
}

Killed wsl (wsl --shutdown) to reset virtual interfaces created by dockerd and voilà
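Before picking an address pool like the one above, it can help to confirm that it does not overlap the subnet currently assigned to vEthernet (WSL). The sketch below uses Python's standard `ipaddress` module; the function names are hypothetical, and the WSL subnet `172.24.16.0/20` in the usage note is only an assumed example value, so substitute whatever `ipconfig` reports on your machine.

```python
import ipaddress
import json


def overlaps(cidr_a, cidr_b):
    """True if the two CIDR ranges share any address (host bits are ignored)."""
    net_a = ipaddress.ip_network(cidr_a, strict=False)
    net_b = ipaddress.ip_network(cidr_b, strict=False)
    return net_a.overlaps(net_b)


def daemon_config(pool_base, pool_size, wsl_subnet):
    """Return daemon.json text for one address pool, refusing overlapping pools."""
    if overlaps(pool_base, wsl_subnet):
        raise ValueError(f"{pool_base} overlaps the WSL subnet {wsl_subnet}")
    config = {"default-address-pools": [{"base": pool_base, "size": pool_size}]}
    return json.dumps(config, indent=4)
```

For instance, `daemon_config("172.28.128.0/20", 26, "172.24.16.0/20")` returns the JSON text, while passing a pool that overlaps the WSL subnet raises `ValueError`. The check only reflects the subnet at the time it is run; if WSL is assigned a different subnet later, it would need to be repeated.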

jikuja commented 1 year ago

In my case, this was due to a network overlap between the WSL network and the Docker subnets.

Fixed it by editing /etc/docker/daemon.json

@keith-horton do you have information on which IP address ranges are allocated for the WSL network by default? It would be easier to select docker's default address pool if WSL's random(?) address range were documented.

keith-horton commented 1 year ago

Hi there.

WSL IP allocations come from HNS - which finds an available IP prefix range from 172.17.* to 172.32.*, though it can use 192.168.* if need be. (Basically, the "class b" and "class c" IP prefix ranges.)
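Taking the description above at face value (prefixes 172.17 through 172.32, plus 192.168), a candidate Docker subnet can be checked against the ranges WSL might be assigned. This is a sketch under that reading only: the prefix list is derived from the comment, not from HNS documentation, the /16 granularity is an assumption, and the function name is hypothetical.

```python
import ipaddress

# Prefixes HNS may choose from, as read from the comment above (assumption).
HNS_CANDIDATES = [ipaddress.ip_network(f"172.{n}.0.0/16") for n in range(17, 33)]
HNS_CANDIDATES.append(ipaddress.ip_network("192.168.0.0/16"))


def may_collide_with_wsl(cidr):
    """True if `cidr` overlaps any prefix that WSL could be assigned."""
    candidate = ipaddress.ip_network(cidr, strict=False)
    return any(candidate.overlaps(prefix) for prefix in HNS_CANDIDATES)
```

By this reading, the `10.10.0.1/16` bridge address suggested earlier in the thread falls outside the listed prefixes, whereas a pool such as `172.28.128.0/20` falls inside them and avoids a clash only while WSL happens to be assigned a different prefix.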

domattioli commented 1 year ago

None of the suggestions I've tried on this (or other Github threads) worked for me. I tried everything except disabling my firewall, as many suggested.

This solution, however, did work: echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf > /dev/null

I'm not sure how this is different than the other solutions that suggested something similar. But it worked nonetheless.

I'm using WSL2, Ubuntu 22.04.

jikuja commented 1 year ago

There's a known issue where the Firewall Rules necessary to allow the DNS requests to be proxied are incorrect, and thus block DNS requests from the WSL container. We have put a fix for this in the next WSL release: https://github.com/microsoft/WSL/releases/tag/0.70.5

Will fixes like this and updates be released for WSL on Windows 10? Many enterprises will be using Windows 10 for a long time to come.

I'll mention this in here too: WSL 2 is now available on Microsoft Store for both Windows 10 and 11: https://devblogs.microsoft.com/commandline/the-windows-subsystem-for-linux-in-the-microsoft-store-is-now-generally-available-on-windows-10-and-11/

The article does not mention whether Windows 10 and Windows 11 now have full feature parity, but it looks like some of the Windows 11-exclusive things are now available for Windows 10.

tgross35 commented 1 year ago

Unfortunately it seems like there is a chance that the 2022-08 security update might break this, at least on Windows 11. I have never had an issue, but KB5012170 was installed this morning, and now it doesn't work (nor does the /etc/resolv.conf workaround).

Disabling the private network firewall (as mentioned here) allows internet connection, but updating (mentioned in the comment) didn't persist the fix when I re-enabled private firewall.

Edit: and now with cumulative update 2022-11 KB5020044, it's working again. Go figure 🤷

jabbera commented 1 year ago

I'm still getting blocks in my firewall log that I can't seem to get rid of. It's making me sad.

slonopotamus commented 1 year ago

Either configure your firewall so it no longer blocks WSL requests or contact your system administrator.

jabbera commented 1 year ago

@slonopotamus It seems unpossible. I put a wide open rule in both directions and they are still getting dropped.

2022-12-11 19:59:19 DROP UDP 172.24.23.144 172.24.16.1 46692 53 78 - - - - - - - RECEIVE 4792
2022-12-11 19:59:19 DROP UDP 172.24.23.144 172.24.16.1 46692 53 78 - - - - - - - RECEIVE 4792
2022-12-11 19:59:20 DROP UDP 172.24.23.144 172.24.16.1 40978 53 60 - - - - - - - RECEIVE 4792
2022-12-11 19:59:24 DROP UDP 172.24.23.144 172.24.16.1 46692 53 78 - - - - - - - RECEIVE 4792
2022-12-11 19:59:24 DROP UDP 172.24.23.144 172.24.16.1 46692 53 78 - - - - - - - RECEIVE 4792

jabbera commented 1 year ago

@slonopotamus I had to do this: Set-NetFirewallProfile -DisabledInterfaceAliases "vEthernet (WSL)"
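To confirm whether a change like this stops the drops, the firewall log lines posted earlier can be tallied. The sketch below is illustrative: the function names are hypothetical, and the field order (date, time, action, protocol, source IP, destination IP, source port, destination port, size) is assumed from the lines shown in this thread, so check it against the `#Fields` header of your own log. It counts dropped UDP packets destined for port 53, grouped by source and destination address.

```python
from collections import Counter

# Leading fields of each log line, in the order assumed from the thread.
FIELDS = ["date", "time", "action", "protocol",
          "src_ip", "dst_ip", "src_port", "dst_port", "size"]


def parse_line(line):
    """Return a dict of the leading fields, or None for headers and short lines."""
    parts = line.split()
    if len(parts) < len(FIELDS) or parts[0].startswith("#"):
        return None
    return dict(zip(FIELDS, parts))


def dropped_dns(lines):
    """Count dropped UDP packets to port 53 per (source, destination) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if (record["action"] == "DROP" and record["protocol"] == "UDP"
                and record["dst_port"] == "53"):
            counts[(record["src_ip"], record["dst_ip"])] += 1
    return counts
```

Feeding the function the lines of the log before and after the change shows whether DNS requests from the WSL address to the host address are still being dropped.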