microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

WSL no internet connection / DNS issues #11693

Open cyberjj999 opened 5 months ago

cyberjj999 commented 5 months ago

Windows Version

Microsoft Windows [Version 10.0.22621.3737]

WSL Version

2.2.4.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.153.1-microsoft-standard-WSL2

Distro Version

No response

Other Software

No response

Repro Steps

This indicates a clear network problem.

Expected Behavior

No network problems: ping and pip install <package> should both work.

Actual Behavior

Clear Network/Internet Connection Problem:

My host machine (Windows 11) has no internet issues at all.

What I've Tried

  1. Disabled Windows Firewall entirely, shut down WSL, and pinged google.com again: doesn't work

  2. Ran the following commands to reset the network stack and flush the DNS cache on Windows:

    netsh winsock reset 
    netsh int ip reset all
    netsh winhttp reset proxy
    ipconfig /flushdns

    then restarted my computer: doesn't work

  3. Updated /etc/resolv.conf and /etc/wsl.conf to use nameservers 8.8.8.8 and 8.8.8.4, and even made /etc/resolv.conf immutable... and it doesn't work.

    sudo rm /etc/resolv.conf
    sudo bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
    sudo bash -c 'echo "[network]" > /etc/wsl.conf'
    sudo bash -c 'echo "generateResolvConf = false" >> /etc/wsl.conf'
    sudo chattr +i /etc/resolv.conf

  4. Disabled the "fast start-up" option in Power Options, then restarted my computer... still doesn't work

  5. Changed from my company WiFi to my personal mobile hotspot: doesn't work
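Step 3 only sticks if /etc/wsl.conf really ends up with generateResolvConf = false under [network]; otherwise WSL regenerates /etc/resolv.conf on the next start. A minimal Python sketch for checking that, assuming the file is plain INI text that configparser can read (the helper name resolv_conf_is_static is made up for illustration):

```python
import configparser


def resolv_conf_is_static(wsl_conf_text: str) -> bool:
    """Return True if [network] generateResolvConf is set to a false value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(wsl_conf_text)
    if not parser.has_section("network"):
        return False
    # WSL generates resolv.conf by default, so a missing key means "not static".
    return not parser.getboolean("network", "generateResolvConf", fallback=True)


if __name__ == "__main__":
    # Same content that the two echo commands in step 3 write to /etc/wsl.conf.
    sample = "[network]\ngenerateResolvConf = false\n"
    print(resolv_conf_is_static(sample))
```

Inside the distro, the text could come from open("/etc/wsl.conf").read(). This only checks the configuration; it says nothing about whether the nameserver itself is reachable.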

Suspected Reasons

  1. Change of network (VPN?) but disabling VPN doesn't yield a meaningful difference

  2. Windows Update (including Quality Updates)

Somehow WSL updated automatically with an old kernel version, even though my WSL Ubuntu was installed from the Microsoft Store?

But my WSL version seems to be okay

WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2

Appreciate Any Help

Diagnostic Logs

Added WSL Logs

WslLogs-2024-06-16_22-35-16.zip

github-actions[bot] commented 5 months ago

Logs are required for review from WSL team

If this is a feature request, please reply with '/feature'. If this is a question, reply with '/question'. Otherwise please attach logs by following the instructions below; your issue will not be reviewed unless they are added. These logs will help us understand what is going on in your machine.

How to collect WSL logs

Download and execute [collect-wsl-logs.ps1](https://github.com/Microsoft/WSL/blob/master/diagnostics/collect-wsl-logs.ps1) in an **administrative powershell prompt**:

```
Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1
```

The script will output the path of the log file once done.

Once completed please upload the output files to this Github issue.

[Click here for more info on logging](https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#8-collect-wsl-logs-recommended-method)

If you choose to email these logs instead of attaching to the bug, please send them to wsl-gh-logs@microsoft.com with the number of the github issue in the subject, and in the message a link to your comment in the github issue and reply with '/emailed-logs'.

View similar issues

Please view the issues below to see if they solve your problem, and if the issue describes your problem please consider closing this one and thumbs upping the other issue to help us prioritize it!

Open similar issues:

Closed similar issues:

Note: You can give me feedback by thumbs upping or thumbs downing this comment.

kohlerdominik commented 5 months ago

I have the exact same issue, started today as well.

WslLogs-2024-06-14_17-43-54.zip

cyberjj999 commented 5 months ago

@kohlerdominik It's a really annoying problem.

I found a quick fix in case it really bothers you. It doesn't seem to affect my project code, but things feel slower in general.

I just downgraded to WSL1 and it works:

# Open Powershell
C:\WINDOWS\system32>wsl --list

Windows Subsystem for Linux Distributions:
Ubuntu (Default)

# Set WSL to Version 1
C:\Users\priv>wsl --set-version Ubuntu 1

It says it takes a few minutes but in reality it took almost an hour for me.

I'm hoping a proper fix will be here so I can switch back to WSL2.

Cheers and hope it helps!

P.S. I tried to switch back to WSL2 afterwards and the network issue still persisted, so I guess switching back and forth doesn't work 👎

kohlerdominik commented 5 months ago

@cyberjj999 after investigating, I found that /etc/resolv.conf is a symbolic link to a non-existent file. After creating that file with the following content, name resolution works again:

nameserver [our company dns server ip - don't add this if you only require internet dns]
nameserver 1.1.1.1

This is quite an ugly workaround, though, because it works only statically, for our work environment and the internet.

I guess resolv.conf should normally be derived from the Windows network adapter (so that the DHCP configuration is properly forwarded), but this broke.
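The dangling-symlink situation described here can be detected from inside the distro. A small Python sketch, not an official diagnostic; the function name and the four state labels are invented for this example:

```python
import os


def resolv_conf_state(path: str = "/etc/resolv.conf") -> str:
    """Classify path as 'missing', 'dangling-symlink', 'symlink' or 'regular'.

    'regular' here means anything that exists and is not a symlink.
    """
    if os.path.islink(path):
        # os.path.exists follows the link, so it is False when the target is gone.
        return "symlink" if os.path.exists(path) else "dangling-symlink"
    return "regular" if os.path.exists(path) else "missing"


if __name__ == "__main__":
    path = "/etc/resolv.conf"
    state = resolv_conf_state(path)
    print(path, "->", state)
    if state.endswith("symlink"):
        # Show where the link points, e.g. a file under /mnt/wsl.
        print("target:", os.readlink(path))
```

A result of dangling-symlink would match the case above, where creating the target file by hand restored name resolution.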

dcasota commented 5 months ago

Hi, There are quite a few differences between WSL1 and WSL2. See https://github.com/microsoft/WSL/issues/4150#issuecomment-1926022445. The thread is mentioned in the weblink above.

I'm using NAT mode, and this works perfectly for Photon OS on WSL2.

Wrong-Code commented 5 months ago

Yes, there is a difference between version 1 and 2, but currently version 2 doesn't work anymore, while version 1 still performs as expected. Thanks to @cyberjj999 for the hint to temporarily switch to version 1.

Running 10.0.22631.3737 FWIW

dcasota commented 5 months ago

On x86_64, WSL 2.1.5 works flawlessly with version 2. Regular installations are not affected, because wsl --update does not update to 2.2.4. WSL with version 1 is not an option for distributions with systemd requirements.

cyberjj999 commented 5 months ago

@dcasota

> On x86_64, WSL 2.1.5 works flawlessly with version 2. Regular installations are not affected, because wsl --update does not update to 2.2.4. WSL with version 1 is not an option for distributions with systemd requirements.

I'm still deeply lost with what's happening. I happen to need to use ollama which requires wsl2 so my current approach of reverting to wsl1 will not work.

I have uploaded the logs above.

Incidentally, I saw in another issue (https://github.com/microsoft/WSL/issues/11675) mentioning similar things, and your suggestion for WSL reinstallation.

# see https://learn.microsoft.com/en-us/windows/wsl/install-manual#step-1---enable-the-windows-subsystem-for-linux

# Open a Powershell Terminal (Administrator) 
dism /online /disable-feature /FeatureName:Microsoft-Windows-Subsystem-Linux
dism /online /disable-feature /featurename:VirtualMachinePlatform
dism /online /disable-feature /FeatureName:Microsoft-Hyper-V
rm "$env:userprofile\.wslconfig"

# reboot

# Open a Powershell Terminal (Administrator) 
dism /online /enable-Feature /All /FeatureName:Microsoft-Windows-Subsystem-Linux /norestart
dism /online /enable-feature /All /Featurename:VirtualMachinePlatform /norestart
dism /online /enable-Feature /All /FeatureName:Microsoft-Hyper-V /norestart
bcdedit /set hypervisorlaunchtype auto

# reboot

# Open a Powershell Terminal (Administrator) 
wsl --install

I'll give it a shot tomorrow if there's no alternative solution.

Can I ask regarding the backup for WSL which you suggested: wsl --export <distribution-name> <path\filename.tar>, if WSL2 still doesn't have connection after I do your WSL reinstallation step, how can I re-import these backup contents into my new WSL2?

(I have custom bash scripts/commands + other installations like ollama and it'd be nice if i can easily re-import them to my new WSL2...)
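On the re-import question: the counterpart of wsl --export is wsl --import <Distro> <InstallLocation> <FileName>. The sketch below only builds the two command lines and executes nothing; the distro names, install folder, and tar path are placeholders chosen for illustration:

```python
def export_command(distro: str, tar_path: str) -> list[str]:
    """Build the argument list for backing up a distro to a tar file."""
    return ["wsl", "--export", distro, tar_path]


def import_command(distro: str, install_dir: str, tar_path: str) -> list[str]:
    """Build the argument list for restoring a tar backup as a WSL 2 distro."""
    return ["wsl", "--import", distro, install_dir, tar_path, "--version", "2"]


if __name__ == "__main__":
    # Placeholder names and paths; adjust before use.
    backup = r"C:\backup\ubuntu.tar"
    print(" ".join(export_command("Ubuntu", backup)))
    print(" ".join(import_command("Ubuntu-restored", r"C:\wsl\ubuntu", backup)))
```

On Windows, either list can be passed to subprocess.run(..., check=True). The tar file holds the distro's whole filesystem, so scripts and installed programs come back with it, but a network problem that lives on the Windows side would not be affected by the round trip.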

github-actions[bot] commented 5 months ago

Diagnostic information

```
Issue was edited and new log file was found: https://github.com/user-attachments/files/15858698/WslLogs-2024-06-16_22-35-16.zip
Detected appx version: 2.2.4.0
```

dcasota commented 5 months ago

@cyberjj999

From my perspective, Ollama, security-hardened by VMware Photon OS (one of the origins of CBL-Mariner), runs best in VMware Workstation. But the built-in WSL in Windows 11 has become good enough for a homelab use case. Not all functions are fail-safe yet.

Clarification: wsl --unregister <distribution-name> deletes a custom distribution, but wsl --uninstall followed by a WSL reinstallation does not.

cyberjj999 commented 5 months ago

@craigloewen-msft @dcasota

Thanks for contributing. As an update, I've tried even more steps:

  1. registering another Ubuntu instance (didn't work)
  2. follow the whole process of deregistering WSL, uninstalling it + uninstalling Ubuntu completely + reimporting the profile back (didn't work)
  3. instead of reimporting, I created an entire new profile (still didn't work)

The network issue still persists despite all the attempts.

I sincerely seek your support, especially since I need WSL2 to run certain programs... thanks

dcasota commented 5 months ago

Which constellation does not work?

In my homelab,

All tests were on a VMware By Broadcom Photon OS 5.0 guest, with findings from March to June 2024. I haven't yet started testing the new possibilities described in https://devblogs.microsoft.com/commandline/whats-new-in-the-windows-subsystem-for-linux-in-may-2024/.

cyberjj999 commented 5 months ago

@dcasota

I'm using the default WSL configuration which is utilizing NAT by default, I presume.

I tried uninstalling the kb5039212 windows update as you mentioned: doesn't work with WSL v2.2.4

Downgraded to WSL v2.1.5 using the .msi installer you shared and confirmed the downgrade with wsl --version, but the internet connection still didn't work.

I'm starting to think it could be a network configuration issue, but I really have no clue what happened 1-2 weeks ago that caused this.

^ If the issue is VPN, wouldn't it work once I switch network? Because that didn't work... I'm rather clueless now...

dcasota commented 4 months ago

Troubleshooting WSL lists quite a few known issues, e.g. constellations with Cisco AnyConnect VPN, antivirus software that prevents WSL internet access, etc. Log files are needed for further investigation. Simply go step by step through the collecting WSL logs recipe.

Edited: From the logs you provided, and using Windows Performance Analyzer for the first time, I'm afraid the table details view in System activity > 'regions of interest' is mostly empty, and in WPP Trace the Process (Name) entries are displayed as unknown. A Microsoft WSL support engineer may help by providing a curated methodology to analyze log files.

Otherwise I would collect the information the classic way: 1) Windows eventlogs, 2) in .wslconfig set debugConsole=true, 3) in the Linux distro collect journalctl content e.g. show errors since last reboot with journalctl -p 3 -xb.

cyberjj999 commented 4 months ago

@craigloewen-msft are there any updates? Thank you @dcasota I already included my log as you've mentioned.

Currently working with raw windows and there's quite a number of packages that I can't even install to get my dev work going.

dcasota commented 4 months ago

@cyberjj999 From what I've seen, it matches to your observations. The service goal to safely deliver package updates at granular level from Windows to distros seems to work, but hiding the complexity wasn't possible in that time.

In registry, there are a bunch of entries with the naming schema <package>~<guid>~<architecture>~~<fileversionraw>, e.g. in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\Packages\ . To get an idea, the following lists all .mum packages in correlation with KB5039212:

Get-Childitem -Path Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\"Component Based Servicing"\Packages -recurse -ErrorAction SilentlyContinue | get-itemproperty | foreach-object{ if ($_.InstallLocation -ilike "*KB5039212*") {$_.InstallName}}

I finished the delay of KB5039212. It seems that the kb has been updated at granular level. Now I get

get-childitem c:\windows\system32\wsl* -include *.dll,*.exe | foreach-object { "{0}`t{1}" -f $_.Name, [System.Diagnostics.FileVersionInfo]::GetVersionInfo($_).FileVersionRaw }

wsl.exe         10.0.22621.3672
wslapi.dll      10.0.22621.3672
wslconfig.exe   10.0.22621.3672
wslg.exe        10.0.22621.3672

and not the .3737-files anymore. Wsl still is on version 2.1.5. I didn't update for the moment as it works flawlessly.

AnonymousWP commented 4 months ago

I also figured out internet isn't working for me anymore at all (Ubuntu 22.04, WSL 2.2.4). I first thought it was Docker or a temporary bug that could be fixed by a wsl --shutdown, but it isn't. Pinging IP addresses doesn't work, and changing the DNS server in resolv.conf doesn't work either. It seems a new Windows update messed this up.

AnonymousWP commented 4 months ago

I fixed the issue by setting networkingMode to mirrored, so: networkingMode=mirrored in the wslconfig.conf file. Make sure you restart WSL after that: wsl --shutdown. The default for networkingMode is nat, which seems to be broken now.
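For reference, a later comment in this thread places the file at %userprofile%\.wslconfig. Here is a hedged Python sketch of setting the key without losing other [wsl2] entries; set_networking_mode is a made-up helper, and it works on text only, so writing the result back to the file is left to the caller:

```python
import configparser
import io


def set_networking_mode(wslconfig_text: str, mode: str) -> str:
    """Return .wslconfig text with [wsl2] networkingMode set to `mode`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case such as networkingMode
    parser.read_string(wslconfig_text)
    if not parser.has_section("wsl2"):
        parser.add_section("wsl2")
    parser.set("wsl2", "networkingMode", mode)
    out = io.StringIO()
    # Write key=value without spaces, matching the style used in this thread.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    existing = "[wsl2]\ndebugConsole=false\n"
    print(set_networking_mode(existing, "mirrored"))
```

Note that configparser drops comments when rewriting, so keep a copy of a hand-edited file. As stated above, run wsl --shutdown afterwards so the change is picked up.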

dcasota commented 4 months ago

@AnonymousWP From your description

Wrong or Right?

Yes, resolv.conf, docker and distro-specific issues can be excluded. Actually wsl 2.1.5 with networkingMode=nat(default) works flawlessly, too.

AnonymousWP commented 4 months ago

Yes, correct. Now the question is: what's causing this? I got an update for WSL from the Microsoft Store some days (almost a week) ago, or is it the Windows update?

dcasota commented 4 months ago

I don't know the root cause.

The history was

Where is wslconfig.conf placed? In my environment, the configuration is in %userprofile%\.wslconfig and inside the distro the configuration file is in /etc/wsl.conf, but there is no wslconfig.conf (?)

cyberjj999 commented 4 months ago

@AnonymousWP thanks for your feedback: I added a .wslconfig file in my %userprofile% (C:/Users/MyUser/.wslconfig) with the following content

[wsl2]
networkingMode=mirrored

and I still suffer from a lack of internet connection.

@dcasota I think the logs above might have been collected when my WSL was 2.2.4, but regardless, my current version is 2.1.5 (after I ran the .msi file you sent to downgrade my WSL). Afterwards, I also uninstalled the security updates you mentioned and... it still doesn't work.

Not quite sure what's left to try.

I'm considering switching to an entire dual-boot set up at the moment, but admittedly it'd be quite inconvenient...

dcasota commented 4 months ago

@cyberjj999 wsl 2.1.5 with networkingMode=nat and Windows settings ipv4 only should work. wsl 2.1.5 with networkingMode=mirrored didn't work in my home lab, either.

Wrong-Code commented 4 months ago

I think I have found what's going on with this issue, at least on Windows 11. With version 22H2, Microsoft introduced the Hyper-V firewall, and depending on your Windows Defender Firewall configuration, the default settings for the Hyper-V firewall may negatively impact WSL2 connections.

TL;DR solution if you are stuck with this issue (at least it works for me)

Note: The Hyper-V firewall can only be configured with PowerShell. Currently, there is no specific support for configuring it via GPOs, nor is there a GUI.

From an elevated PowerShell prompt, run this cmdlet:

Get-NetFirewallHyperVVMCreator

You should get the following:

VMCreatorId  : {40E0AC32-46A5-438A-A0B2-2B479E8F2E90}
FriendlyName : WSL

Now get the default configuration for WSL. Run:

Get-NetFirewallHyperVVMSetting -PolicyStore ActiveStore -Name '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}'

You should get the following:

Name                  : {40E0AC32-46A5-438A-A0B2-2B479E8F2E90}
Enabled               : True
DefaultInboundAction  : Block
DefaultOutboundAction : Block
LoopbackEnabled       : True
AllowHostPolicyMerge  : True

Allow the outbound traffic for WSL2:

Set-NetFirewallHyperVVMSetting -Name '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}' -DefaultOutboundAction Allow

Your Linux distros should now be able to connect to the LAN/Internet.
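To check a saved copy of that cmdlet output without reading it by eye, a small parser is enough. A Python sketch, assuming the "Name : Value" list layout shown above; both function names are invented for this example and SAMPLE is simply the output quoted in this comment:

```python
def parse_setting_listing(text: str) -> dict[str, str]:
    """Parse PowerShell 'Name : Value' list output into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            settings[key.strip()] = value.strip()
    return settings


def outbound_blocked(text: str) -> bool:
    """True when the listing reports DefaultOutboundAction as Block."""
    return parse_setting_listing(text).get("DefaultOutboundAction") == "Block"


SAMPLE = """\
Name                  : {40E0AC32-46A5-438A-A0B2-2B479E8F2E90}
Enabled               : True
DefaultInboundAction  : Block
DefaultOutboundAction : Block
LoopbackEnabled       : True
AllowHostPolicyMerge  : True
"""

if __name__ == "__main__":
    print("outbound blocked:", outbound_blocked(SAMPLE))
```

If this reports that outbound traffic is blocked, the Set-NetFirewallHyperVVMSetting command above is the change to try; if it already reports Allow, the cause is probably elsewhere.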

Longer version

If Windows firewall is configured to block all the outbound (or inbound) connections for which there is no rule, your WSL2 distros outbound/inbound connections will be blocked as well. The reason is that, by default, both inbound and outbound traffic to/from WSL2 distros is blocked. Configuring WSL2 to work in mirrored networking mode (which is the one I need) will not change this.

After having tried all the possible solutions, downgrades, upgrades, ... suggested on this bug issue, including the removal and reinstallation of the entire WSL (and depending) subsystems, I was always back to square one: none of my WSL2 distros would connect to the LAN or the Internet. I noticed however that ICMP traffic was permitted everywhere (or everywhere your core/border firewall, if you happen to have one, permits).

I realized I had to investigate the Windows firewall. Only, I could not find any trace of blocked connections in the Windows firewall log. Moreover, even using the nifty Windows Firewall Control (WFC) utility by Binisoft (now Malwarebytes) WSL2 distros attempts to communicate did not raise any notification. With WFC, I normally use the Medium Filtering, which corresponds in Windows Defender Firewall parlance to block inbound or outbound connections for which there is no rule. I changed temporarily WFC configuration to Low Filter, which enables all the outbound connections even if there is no specific rule for them, and immediately my Linux distros started to communicate.

A quick Internet search with keywords WSL2 and firewall pointed me to this Microsoft page, Configure Hyper-V firewall, a feature I was not aware of. I quickly realized that with my Windows firewall configuration, which is not the default, the default settings of the new Hyper-V firewall cause the issue. The TL;DR steps shown above are what fixed my WSL2 connection problems.

A couple of related notes:

WSL --version
WSL version: 2.1.5.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3737

I downgraded WSL to 2.1.5.0 following this long thread, but I have the feeling it should work even with the latest WSL version.

EDIT

Confirmed: upgrading to the latest WSL 2.2.4.0 the fix still works. There should be no need to remove the latest Windows patches, downgrade or reinstall WSL whatsoever. All considered, I also don't think KB5039212 has anything to do with this issue.

EDIT 2

ICMP traffic is permitted because there are specific rules for that. See the output of

Get-NetFirewallHyperVRule -VMCreatorId '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}'

dcasota commented 4 months ago

@Wrong-Code The regression complexity is the main culprit for the many wsl issues. Obviously this was and is not the case yet.

Windows 11 Pro with KB5039212, wsl 2.1.5, hyper-V firewall settings { DefaultInboundAction=Allow / DefaultOutboundAction=Allow / LoopbackEnabled=true / AllowHostPolicyMerge=true } does not work with networkingMode=mirrored in my homelab with Photon OS 5 and ipv4only with no vlan/vpn/bond.

In March, I started with some similar Hyper-V firewall research findings on hardening possibilities in https://github.com/dcasota/photonos-scripts/wiki/Photon-OS-on-WSL2#hardening.

A good solution pattern should include

  1. antivirus interoperability
  2. windows 11 pro/.. : security updates, feature updates such as before 23H2 / with 23H2 and without KB5039212 / with KB5039212
  3. hyper-V release version capabilities and firewall settings
  4. ipv4 only / ipv6only / both
  5. wsl release e.g. 2.1.5 / 2.2.4
  6. networkingMode=nat / networkingMode=mirrored
  7. various adapters wired / wlan / usb-c ethernet adapters
  8. extended ethernet functions: vlan, vpn, bonds
  9. distro release version capabilities: systemd, nvidia

Hence, I would say there is a solution regression gap somewhere between 4-8.

It's a pity that there is no open-source joint-venture with VMware By Broadcom. Planning together existing and future virtual hardware capabilities (device firmware, kernel, virtual devices, drivers, default settings) for x86_64 and arm64 would be helpful for terrestrial edge and datacenter solutions. Network configuration management, network setup, networking event brokering: this is NOT solved in cbl-mariner and therefore not in wsl. Unfortunately, without more features inside the network stack, answers in situations like "WSL no internet connection / DNS issues" seem to end with a learning curve. Users hate this. It does not help.

The software delivery method in Windows through .mum packages seems to work. Facing the 50++ .mum package entries with the naming schema <package>~<guid>~<architecture>~~<fileversionraw> in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\Packages is quite challenging to maintain in business as usual bug situations, probably for Windows developers, but in any case for customers.

b-a-merritt commented 4 months ago

The issue only appeared on some types of WiFi networks. Mobile hotspots and public networks worked well, but the network at my workplace did not. Setting networkingMode=mirrored fixed all my issues.

jackrdye commented 4 months ago

> The issue only appeared on some types of WiFi networks. Mobile hotspots and public networks worked well, but the network at my workplace did not. Setting networkingMode=mirrored fixed all my issues.

It is bizarre, on my wifi network wsl2 internet doesn't work. I switch to mobile hotspot without a restart or any change to wsl2 and it instantly works.

chanpreetdhanjal commented 4 months ago

Hi. Can you please collect networking logs by following the instructions below? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

cyberjj999 commented 4 months ago

> Hi. Can you please collect networking logs by following the instructions below? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

Hi @chanpreetdhanjal . The logs are already included above. P.S. trying the solutions others propose here have not worked for my setup...

AnonymousWP commented 4 months ago

Why is Microsoft not fixing this? It's clearly a bug/issue, but no word from them.

chanpreetdhanjal commented 4 months ago

>> Hi. Can you please collect networking logs by following the instructions below? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

> Hi @chanpreetdhanjal . The logs are already included above. P.S. trying the solutions others propose here have not worked for my setup...

We need wsl networking logs. Please follow the instructions following the link I shared above and share WSL Networking logs. Thanks!

kohlerdominik commented 3 months ago

>>> Hi. Can you please collect networking logs by following the instructions below? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

>> Hi @chanpreetdhanjal . The logs are already included above. P.S. trying the solutions others propose here have not worked for my setup...

> We need wsl networking logs. Please follow the instructions following the link I shared above and share WSL Networking logs. Thanks!

@chanpreetdhanjal See my first comment:

> I have the exact same issue, started today as well.
>
> WslLogs-2024-06-14_17-43-54.zip

NontasBak commented 3 months ago

Issue still exists unfortunately, internet works 50% of the time. ping google.com outputs ping: google.com: Temporary failure in name resolution. Sometimes it works correctly, sometimes it doesn't. I have to wait a few minutes for it to work again.

CatalinFetoiu commented 3 months ago

Hi, please collect WslNetworkingLogs for this issue (using https://github.com/microsoft/WSL/blob/master/diagnostics/collect-networking-logs.ps1)

does enabling DNS tunneling fix the problem? It is supported on 22621 Windows builds (please make sure to have the latest updates)

thanks

1MLightyears commented 3 months ago

> Hi, please collect WslNetworkingLogs for this issue (using https://github.com/microsoft/WSL/blob/master/diagnostics/collect-networking-logs.ps1)
>
> does enabling DNS tunneling fix the problem? It is supported on 22621 Windows builds (please make sure to have the latest updates)
>
> thanks

Hi, the same issue happens to me as well. Logs attached. WslNetworkingLogs-2024-08-09_15-45-28.zip But, the collect-networking-logs.ps1 automatically exits before I could reproduce and press any key. Therefore I'm not sure what it has collected.

Some additional information from dmesg about which system service failed:

[FAILED] Failed to start Dispatcher daemon for systemd-networkd.
See 'systemctl status networkd-dispatcher.service' for details.

result of journalctl:

Aug 09 15:39:22 LIGHTYEARS-CENTER networkd-dispatcher[1708]: Traceback (most recent call last):
Aug 09 15:39:22 LIGHTYEARS-CENTER networkd-dispatcher[1708]:   File "/usr/bin/networkd-dispatcher", line 30, in <module>
Aug 09 15:39:22 LIGHTYEARS-CENTER networkd-dispatcher[1708]:     import dbus
Aug 09 15:39:22 LIGHTYEARS-CENTER networkd-dispatcher[1708]: ModuleNotFoundError: No module named 'dbus'
Aug 09 15:39:22 LIGHTYEARS-CENTER systemd[1]: networkd-dispatcher.service: Main process exited, code=exited, status=1/FAILURE

(Though, I have installed dbus and manual runs of python -c "import dbus" and python3 -c "import dbus" succeeded, no idea why this happens)

Also I was having this:

Aug 09 15:01:07 LIGHTYEARS-CENTER systemd[1]: systemd-resolved.service: Start request repeated too quickly.
Aug 09 15:01:07 LIGHTYEARS-CENTER systemd[1]: systemd-resolved.service: Failed with result 'exit-code'.

Didn't find any further information. I have tried the .wslconfig, /etc/resolv.conf, wsl.conf, hns refresh, etc. and they just didn't work for my case.

CatalinFetoiu commented 3 months ago

@1MLightyears thanks for following up - sorry about the issues using the networking logs script, I don't recall seeing this recently. Does the issue with the script exiting early happen consistently?

1MLightyears commented 3 months ago

> @1MLightyears thanks for following up - sorry about the issues using the networking logs script, I don't recall seeing this recently. Does the issue with the script exiting early happen consistently?

Thanks @CatalinFetoiu , I'm afraid that the answer is YES, it always automatically completes its job, packs the logs and exits, even with my hands off from keyboard and mouse. I tried twice again, once with wsl --shutdown executed before the powershell runs; the result remains the same.

Here is the last log zip I collected today: WslNetworkingLogs-2024-08-10_17-38-20.zip


I can answer any network setting problems directly, if not gathered by the logging. My Win11 and WSL(Ubuntu 22.04) are not behind any kinds of proxy, VPN, etc.

CatalinFetoiu commented 3 months ago

@1MLightyears thanks, I'll look at the logs to see if I can get a hint why the script exited early

to confirm, are the issues you are seeing DNS related? do you see the "Temporary failure in name resolution" errors others mentioned in the issue? if yes, let's confirm that DNS tunneling was successfully enabled - what is the content of your /etc/resolv.conf file?

I recommend also enabling networkingMode=mirrored in your wslconfig file

1MLightyears commented 3 months ago

> @1MLightyears thanks, I'll look at the logs to see if I can get a hint why the script exited early
>
> to confirm, are the issues you are seeing DNS related? do you see the "Temporary failure in name resolution" errors others mentioned in the issue? if yes, let's confirm that DNS tunneling was successfully enabled - what is the content of your /etc/resolv.conf file?
>
> I recommend also enabling networkingMode=mirrored in your wslconfig file

Yes, I see the output Temporary failure in name resolution when using ping. I was quite sure that it should be a DNS problem as the domain name resolution service failed to start.

The content in /etc/resolv.conf is:

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 8.8.8.8

Besides, mirrored mode was tried before and it failed (the Temporary failure in name resolution error remains).

Wrong-Code commented 3 months ago

@1MLightyears, I suppose you have already tried what I've suggested in my previous post, especially this command (but please learn about it in context, if you haven't done it already):

Set-NetFirewallHyperVVMSetting -Name '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}' -DefaultOutboundAction Allow

I am using the official Windows Store's Debian 12 distro, with the latest patched WSL 2, on Windows 11 23H2 and with the Hyper-V role installed as well. Without the above command, I could only ping any address, but I could not reach any host (LAN or Internet) via TCP or UDP. With the rule in place, my issue was resolved. I'm using WSL mirrored networking mode, BTW.

Can you ping your DNS host (assuming there is no firewall in your infrastructure blocking ICMP traffic)? Can you connect to other TCP/UDP service ports on other hosts, or your problem is just with DNS (UDP 53)?

1MLightyears commented 3 months ago

> @1MLightyears, I suppose you have already tried what I've suggested in my previous post, especially this command (but please learn about it in context, if you haven't done it already):
>
> Set-NetFirewallHyperVVMSetting -Name '{40E0AC32-46A5-438A-A0B2-2B479E8F2E90}' -DefaultOutboundAction Allow
>
> I am using the official Windows Store's Debian 12 distro, with the latest patched WSL 2, on Windows 11 23H2 and with the Hyper-V role installed as well. Without the above command, I could only ping any address, but I could not reach any host (LAN or Internet) via TCP or UDP. With the rule in place, my issue was resolved. I'm using WSL mirrored networking mode, BTW.
>
> Can you ping your DNS host (assuming there is no firewall in your infrastructure blocking ICMP traffic)? Can you connect to other TCP/UDP service ports on other hosts, or your problem is just with DNS (UDP 53)?

Thank you @Wrong-Code , I'm afraid that I have tried that command. Here is what I got:

PS E:\> NetFirewallHyperVVMSetting

Name                  : {40E0AC32-46A5-438A-A0B2-2B479E8F2E90}
Enabled               : NotConfigured
DefaultInboundAction  : Allow
DefaultOutboundAction : Allow
LoopbackEnabled       : NotConfigured
AllowHostPolicyMerge  : NotConfigured

Inbound and Outbound are all Allow.


I can ping any IP address (ping 8.8.8.8 gives valid replies), but I cannot ping any domain name (ping www.google.com gives ping: www.google.com: Temporary failure in name resolution).
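Since ICMP can be allowed while TCP and UDP are blocked (see the Hyper-V firewall comment earlier in the thread), it can help to test raw TCP reachability and name resolution separately. A rough Python sketch to run inside the distro; the probe target 8.8.8.8:53, the hostname, and the wording of the verdicts are arbitrary choices for illustration:

```python
import socket


def can_reach_ip(ip: str = "8.8.8.8", port: int = 53, timeout: float = 3.0) -> bool:
    """TCP connect to a literal IP address; no name resolution involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname: str = "www.google.com") -> bool:
    """Ask the system resolver, i.e. whatever /etc/resolv.conf points at."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def diagnose(ip_ok: bool, dns_ok: bool) -> str:
    """Turn the two probe results into a short verdict."""
    if ip_ok and dns_ok:
        return "network and DNS both work"
    if ip_ok:
        return "DNS-only failure"
    if dns_ok:
        return "DNS works but the TCP probe failed"
    return "no connectivity at all"


if __name__ == "__main__":
    print(diagnose(can_reach_ip(), can_resolve()))
```

A DNS-only failure points at resolv.conf, DNS tunneling, or the resolver service, while no connectivity at all points at NAT/mirrored networking or a firewall.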

torgeros commented 3 months ago

This should be related to https://github.com/microsoft/WSL/issues/11036, because the default resolv.conf is supposed to be inside /mnt/wsl (with a symlink to /etc/resolv.conf)

1MLightyears commented 3 months ago

> This should be related to #11036, because the default resolv.conf is supposed to be inside /mnt/wsl (with a symlink to /etc/resolv.conf)

Thank you @torgeros, but that turned out not to be the issue; Temporary failure in name resolution remains. At first I found that the content in /mnt/wsl/resolv.conf had not changed. I made the symbolic link, and after a reboot the content looks correct:

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 2001:4860:4860::8888
search modem

but ping still fails.

PS: I have added generateResolvConf = false in /etc/wsl.conf before but it seems not to have worked.


I will post my C:\Users\<my username>\.wslconfig here as well:

[wsl2]
firewall=false
debugConsole=false
networkingMode=mirrored
dnsTunneling=false

torgeros commented 3 months ago

Interesting, okay. Because for me the /mnt/wsl folder is completely empty. And then, obviously, resolv.conf is not there.

torgeros commented 3 months ago

Then maybe my issue #11928 is not exactly the same as this one...

CatalinFetoiu commented 3 months ago

@1MLightyears. thanks for following up

I see you are using 8.8.8.8 as DNS server. are you setting generateResolvConf to false in your /etc/wsl.conf file?

I recommend keeping generateResolvConf at the default (true) and enabling DNS tunneling. You can confirm DNS tunneling was successfully enabled if you see "nameserver 10.255.255.254" in your /etc/resolv.conf file.
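That check can be scripted. A minimal Python sketch that lists the nameserver entries in resolv.conf text and looks for the address quoted above; the function names are invented for this example, and the address is taken from this comment rather than from any documentation:

```python
DNS_TUNNELING_ADDRESS = "10.255.255.254"  # value quoted in the comment above


def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses of all 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';' and therefore never match.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def dns_tunneling_active(resolv_conf_text: str) -> bool:
    """True if the tunneling address appears as a nameserver."""
    return DNS_TUNNELING_ADDRESS in nameservers(resolv_conf_text)


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        text = handle.read()
    print("nameservers:", nameservers(text))
    print("DNS tunneling active:", dns_tunneling_active(text))
```

A resolv.conf that lists only 8.8.8.8 would report False here, which fits either a disabled tunneling setting or a statically written file.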

1MLightyears commented 2 months ago

@CatalinFetoiu Sorry for late response, been busy with my work :(


I have changed my /etc/wsl.conf and /mnt/wsl/resolv.conf; the content of /etc/resolv.conf is now automatically filled in and is as follows:

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 10.255.255.254
search modem

Though, the result remains the same (ping: www.google.com: Temporary failure in name resolution). What else can I do?


PS: my .wslconfig had been modified to:

[wsl2]
firewall=false
debugConsole=false

PPS: I'm not sure whether my systemd-resolved service is the key to the problem: it is not active (output from journalctl -xeu systemd-resolved.service):

: systemd-resolved.service: Failed to execute /lib/systemd/systemd-resolved: Permission denied
: systemd-resolved.service: Failed at step EXEC spawning /lib/systemd/systemd-resolved: Permission denied

Though this file is actually 755.

CatalinFetoiu commented 2 months ago

@1MLightyears thanks. can you please collect a trace using the following commands? (If you no longer encounter the issue with the script exiting early you might use the script instead)

download https://github.com/microsoft/WSL/blob/master/diagnostics/wsl_networking.wprp
wpr.exe -start .\wsl_networking.wprp -filemode
in WSL, run ping google.com
wpr.exe -stop .\logs.etl

collect and share logs.etl

1MLightyears commented 2 months ago

@CatalinFetoiu Hi! I'd like to provide some good news. I noticed your update commit on diagnostics/collect-networking-logs.ps1 yesterday and tried it again. It worked! Below is the log captured: WslLogs-2024-08-31_14-12-40.zip and I think you might prefer this.

CatalinFetoiu commented 2 months ago

@1MLightyears thanks. it looks like you attached a WslLogs zip, do you have a WslNetworkingLogs zip generated by the collect-networking-logs.ps1?

If you still encounter issues with collect-networking-logs.ps1, please try the wpr commands I shared