microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

WslRegisterDistribution failed with error: 0xffffffff #4364

Closed alexey-gusarov closed 8 months ago

alexey-gusarov commented 5 years ago

Please fill out the below information:

PS C:\WINDOWS\system32> wsl --list
Windows Subsystem for Linux Distributions:
Ubuntu (Default)

PS C:\WINDOWS\system32> wsl --list --verbose
NAME STATE VERSION

PS C:\WINDOWS\system32> Get-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

FeatureName : Microsoft-Windows-Subsystem-Linux
DisplayName : Windows Subsystem for Linux
Description : Provides services and environments for running native user-mode Linux shells and tools on Windows.
RestartRequired : Possible
State : Enabled
CustomProperties :
ServerComponent\Description : Provides services and environments for running native user-mode Linux shells and tools on Windows.
ServerComponent\DisplayName : Windows Subsystem for Linux
ServerComponent\Id : 1033
ServerComponent\Type : Feature
ServerComponent\UniqueName : Microsoft-Windows-Subsystem-Linux
ServerComponent\Deploys\Update\Name : Microsoft-Windows-Subsystem-Linux

PS C:\WINDOWS\system32> wsl --unregister ubuntu
Unregistering...

PS C:\WINDOWS\system32> ubuntu
Installing, this may take a few minutes...
WslRegisterDistribution failed with error: 0xffffffff
Error: 0xffffffff (null)
Press any key to continue...

See our contributing instructions for assistance.

wsl.zip

wclr commented 4 years ago

To start Docker, I need to stop my DNS proxy on port 53. Once Docker has started, I can start the DNS proxy again and even restart Docker without a problem. That's weird: a service on port 53 should not affect things this way.

kmahyyg commented 4 years ago

I'm using Hyper-V for my personal VM and also for WSL. However, because of the bridged network, the Shared Access (ICS, Internet Connection Sharing) service, running via svchost.exe -k netsvcs -p -s SharedAccess, binds to 53/udp, and I cannot unbind it because my personal VM needs bridged networking.

So I'm confused: why are you SO EXCITED TO BIND PORT 53/UDP? Just reusing the local connection's DNS should be okay, since WSL uses NAT networking.

In the end, this blocked me from using WSL 2: I switched to the Hyper-V based WSL, only to be blocked by Hyper-V. You really need a new software QA department.

Update: Switching back to WSL 1 works.

woss commented 4 years ago

I get this error every time I have Acrylic DNS Proxy running. I need custom domains on Windows AND WSL, so I went with it: https://mayakron.altervista.org/support/acrylic/Home.htm
Now I'm not sure how to accomplish the same thing, since WSL2 only works once I turn it off.

doomy commented 4 years ago

I can confirm having the same issue. There is a collision of port 53 usage between Acrylic and Windows ICS service (connection sharing) which seems to provide connectivity to WSL2.

Setting Acrylic to a different port seems to render it unusable for regular Windows use. Windows only sees it on port 53.

I'm baffled, as I'm sure I had no problem with WSL2 and Acrylic coexisting in the past, on a previous machine. And as my current workflow pretty much requires both to work simultaneously, a solution which could achieve that would be a lifesaver!

rickywu commented 4 years ago

I run CoreDNS, which uses port 53 and causes WSL 2 to fail on startup. I use this bat script to start WSL first and CoreDNS after that, and both of them work well, so I think WSL only needs port 53 during startup.

wsl -e exit
d:\Path\CoreDNS.exe
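
The start-order workaround in the script above can be sketched in Python. This is only an illustration under assumptions: the function name `start_wsl_then_dns` and the injectable `run` and `spawn` parameters are hypothetical, and the CoreDNS path is the placeholder from the comment above. It reproduces the ordering only and does nothing about the underlying port 53 conflict.

```python
import subprocess
from typing import Callable, List, Sequence


def start_wsl_then_dns(
    dns_command: Sequence[str],
    run: Callable[[List[str]], object] = subprocess.run,
    spawn: Callable[[List[str]], object] = subprocess.Popen,
) -> None:
    """Start WSL 2 first, then launch the DNS server that wants port 53."""
    # Same command as the batch script above. Its exit code is deliberately
    # ignored: this sketch only reproduces the ordering of the two steps.
    run(["wsl", "-e", "exit"])
    # Launch the DNS server without waiting for it to exit.
    spawn(list(dns_command))
```

Calling `start_wsl_then_dns([r"d:\Path\CoreDNS.exe"])` mirrors the two-line script. The `run` and `spawn` parameters exist only so the ordering can be checked without actually starting WSL.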

fishsugar commented 4 years ago

In my case, something seems to be monitoring the ICS (SharedAccess) service: when the ICS process is killed, it starts again. So I changed the service account to a new account, killed the service process successfully, changed the service account back, tried installing Ubuntu again, and it worked.

mxmp210 commented 4 years ago

Up-voting this, as it's painful to kill services manually and restart them. ICS indeed keeps binding to the DNS port and doesn't allow other programs to run on it. If anybody has a fix for using a custom DNS server on the same machine, please share! This issue also appears when trying to convert a WSL1 distribution to a WSL2 VM, because the conversion creates a vEthernet adapter and, in order to share connectivity, the kernel launches this bridge service.

ekoeryanto commented 4 years ago

I fixed the issue by uninstalling the Memu Android emulator.

GoFightNguyen commented 4 years ago

I have also confirmed that another service, Umbrella Roaming Client, listening on port 53 caused the problems for me too. After changing the service startup of that service to Automatic (Delayed Start) as suggested in this comment, my issue was resolved

GoFightNguyen commented 4 years ago

I also experienced this issue without anything listening on port 53. This time the problem was the LxssManager service not running.

ghost commented 4 years ago

Same problem for me as a result of a company mandated Cisco AnyConnect Secure Mobility Client, with 'Roaming Security' that encrypts DNS queries. I've had to convert my Ubuntu instance back to WSL version 1 in order to use both Cisco AnyConnect and WSL.

MrObvious commented 4 years ago

Hello. Any ETAs on this?

omegaht commented 4 years ago

> Same problem for me as a result of a company mandated Cisco AnyConnect Secure Mobility Client, with 'Roaming Security' that encrypts DNS queries. I've had to convert my Ubuntu instance back to WSL version 1 in order to use both Cisco AnyConnect and WSL.

Same issue here.

christianfosli commented 4 years ago

The work-around to quit processes using port 53 just before launching WSL worked for me, but this also caused DNS to break inside WSL, so it's kind of a painful work-around.

markwu commented 4 years ago

I can confirm this bug: after removing Acrylic DNS Proxy, WSL is back again.

I need Acrylic DNS Proxy to route all *.test domains to 127.0.0.1, which makes my life easier when doing web development. Now I have to add DNS entries to the hosts file one by one for this purpose.
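
Because the Windows hosts file has no wildcard syntax, each name has to be listed separately. A minimal sketch of generating those lines, assuming hypothetical site names (`shop.test`, `api.test`) and a helper name, `hosts_entries`, that is not part of any tool mentioned in this thread:

```python
from typing import Iterable, List


def hosts_entries(names: Iterable[str], address: str = "127.0.0.1") -> List[str]:
    """Return one hosts-file line per name, de-duplicated and sorted."""
    # Normalise case and whitespace so the same name is not listed twice.
    unique = sorted({name.strip().lower() for name in names if name.strip()})
    return [f"{address} {name}" for name in unique]


# Hypothetical development sites; replace with your own *.test names.
for line in hosts_entries(["shop.test", "api.test", "Shop.test"]):
    print(line)
```

The printed lines would still have to be appended to C:\Windows\System32\drivers\etc\hosts from an elevated editor, so this only reduces the typing, not the maintenance.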

pjbirch commented 4 years ago

> Same problem for me as a result of a company mandated Cisco AnyConnect Secure Mobility Client, with 'Roaming Security' that encrypts DNS queries. I've had to convert my Ubuntu instance back to WSL version 1 in order to use both Cisco AnyConnect and WSL.

Same problem here too.

halfdeadcat commented 4 years ago

It took me way too long to find that WSL2 is incompatible with Cisco Umbrella's (or any other) DNS proxy starting first. Why is it taking so long to fix this? Add up all the lost productivity of the people just in this thread.

barkermn01 commented 4 years ago

I'm not running anything on port 53 (I have checked with netstat). I'm running Windows 10 Pro 1909 and I get this error. I have tried uninstalling and reinstalling Hyper-V and WSL several times, and I always get it.

garydgregory commented 4 years ago

Ran into this blocker issue with Cisco's C:\Program Files (x86)\OpenDNS\Umbrella Roaming Client\dnscrypt-proxy.exe

kmahyyg commented 4 years ago

A year after this issue was opened, M$FT shows no indication of trying to solve this problem.

barkermn01 commented 4 years ago

So I upgraded Windows 10 to version 2004, set the default to WSL 1, installed Ubuntu, tried to convert it to version 2, and got the same error. I don't know what this is, but WSL2 is bugged as hell with networking for some reason. I know because I can change the error: if I create a network adapter named WSL in Hyper-V, the error changes, but the moment I remove it, it goes back to 0xffffffff.

pjbirch commented 4 years ago

Recently upgraded to Cisco Anyconnect SMC v4.9.01095 and this seems to have fixed the issue for me. I no longer need to disable the Cisco Umbrella service, and I can use WSL2 without any issues.

devbeard commented 4 years ago

I have had issues at work with WSL2 because of AnyConnect/Umbrella. Not sure what has changed, but no longer. I can freely convert between wsl1 and 2, have networking with wsl2. Everything works (finally!).

What could have changed? Well, we had updates rolled out to our Cisco suite last week. Also, there were some KBs recently that fixed "issues with wsl" without further description. We also changed our VPN profile to allow access to some local resources like specific printers. In conclusion, too many variables have changed to conclude, but if anyone wants me to check something, just ask.

ghost commented 4 years ago

> Recently upgraded to Cisco Anyconnect SMC v4.9.01095 and this seems to have fixed the issue for me. I no longer need to disable the Cisco Umbrella service, and I can use WSL2 without any issues.

I just tried upgrading to 4.9.01095 but I'm still facing the same issue. If I run "get-process -id (Get-NetTCPConnection -localport 53).OwningProcess", port 53 is still bound to dnscrypt-proxy. Trying to figure out what's different...

CBronkhorst commented 4 years ago

I disabled the Hyper-V network adapter and could then install the Ubuntu-20.04 app. WSL 2 was the default version, as I hadn't switched to WSL 1.

KineticTheory commented 4 years ago

Here's another data point. If I log in as "Administrator" on my laptop, I can set the WSL version to 2 via powershell (wsl --set-version Ubuntu-20.04 2). But if I log in as a regular user, I get the -1 error:

PS C:\Users\bob> wsl --set-version Ubuntu-20.04 2
Conversion in progress, this may take a few minutes...
For information on key differences with WSL 2 please visit https://aka.ms/wsl2
Error: 0xffffffff

This looks more like a permissions issue, but I'm not sure how to diagnose further.

barkermn01 commented 4 years ago

> Here's another data point. If I log in as "Administrator" on my laptop, I can set the WSL version to 2 via powershell (wsl --set-version Ubuntu-20.04 2). But if I log in as a regular user, I get the -1 error:
>
> PS C:\Users\bob> wsl --set-version Ubuntu-20.04 2
> Conversion in progress, this may take a few minutes...
> For information on key differences with WSL 2 please visit https://aka.ms/wsl2
> Error: 0xffffffff
>
> This looks more like a permissions issue, but I'm not sure how to diagnose further.

That's strange, because I have tried running the command through PowerShell in Windows Terminal running as admin, and I get the same problem.

montao commented 4 years ago

Got this error just now; it might be Docker related. (screenshot: WSL2 error)

montao commented 4 years ago

Solved after starting elevated powershell and netcfg -d, restart and it's good again.

zgjimgjonbalaj commented 4 years ago

> Solved after starting elevated powershell and netcfg -d, restart and it's good again.

That did nothing for me. As long as I have Acrylic DNS running from valet-windows, WSL 2 will keep producing a bootup error. If I stop these services, I can run WSL with no issues.

This bug/feature has been a show-stopper for me since I prefer to run MySQL, PHP & various other tools on Win and manage using scoop. This is mainly for performance reasons and so that I don't have to run a distro to get MySQL, PHP & Nginx running. I know that on boot all my services are running and I can get to code faster.

I use WSL for the various tools and libraries that don't work so well on Win or simply haven't been ported over and/or the ported versions are watered down. This is now forcing me to stop the very services I am complementing with WSL, thus rendering WSL almost useless.

With that said, I have tried to migrate some of these services to WSL and avoid using valet-windows altogether, even looking for alternative options, with no luck. Personally, I think time would be much better spent resolving these issues than getting a file manager GUI running on WSL.

zgjimgjonbalaj commented 4 years ago

> I have also confirmed that another service, Umbrella Roaming Client, listening on port 53 caused the problems for me too. After changing the service startup of that service to Automatic (Delayed Start) as suggested in this comment, my issue was resolved

That has worked for me for Acrylic DNS also

EDIT: See comment below, this does not work.

rohanrhu commented 4 years ago

Is there any solution for this?

rohanrhu commented 4 years ago

> I have also confirmed that another service, Umbrella Roaming Client, listening on port 53 caused the problems for me too. After changing the service startup of that service to Automatic (Delayed Start) as suggested in this comment, my issue was resolved

It worked when I stopped my DNS server (port 53).

pjbirch commented 3 years ago

> > Recently upgraded to Cisco Anyconnect SMC v4.9.01095 and this seems to have fixed the issue for me. I no longer need to disable the Cisco Umbrella service, and I can use WSL2 without any issues.
>
> I just tried upgrading to 4.9.01095 but I'm still facing the same issue. If I run "get-process -id (Get-NetTCPConnection -localport 53).OwningProcess", port 53 is still bound to dnscrypt-proxy. Trying to figure out what's different...

Previously, I had to disable the Cisco Umbrella Service, in order to run WSL and convert the VMs to WSL V2. Now it all seems to be working.

Admittedly, I think I had already converted my WSL to V2 (while the old Umbrella service was disabled) before I upgraded to the newer Cisco version; I haven't tried to convert a WSL VM to V2 since the update.
But I am now able to run WSL while Umbrella is working, which I wasn't able to before.

(screenshot: CiscoUmbrellaWSL)

markwu commented 3 years ago

> > I have also confirmed that another service, Umbrella Roaming Client, listening on port 53 caused the problems for me too. After changing the service startup of that service to Automatic (Delayed Start) as suggested in this comment, my issue was resolved
>
> That has worked for me for Acrylic DNS also

@zgjimgjonbalaj Does Acrylic DNS also work after Automatic (Delayed Start)?

zgjimgjonbalaj commented 3 years ago

@markwu - I meant to update my earlier comment, but unfortunately no. Delayed start is not a magic solution: it simply delays the start of the service during Windows startup, giving you the illusion of WSL 2 working (if you happen to launch WSL 2 before Acrylic starts). As soon as that service has started, WSL2 will fail again with the same code.

I have not tested other DNS proxy servers, but I managed to get WSL 2 to launch while CoreDNS is also running. CoreDNS has no automatic integration like Valet and Acrylic have, but this suggested to me that there may be an incompatibility between WSL 2 and how Acrylic handles its wildcard DNS routing, since a DNS server like CoreDNS seems to work just fine.

This is basically going to turn into Acrylic/DNS proxies vs WSL 2. I'm willing to bet we won't get this fixed and will have to look for alternative solutions.

markwu commented 3 years ago

@zgjimgjonbalaj Thanks for your detailed reply.

pjbirch commented 3 years ago

I may have been premature with my last post about Cisco 4.9.01095 fixing my problems. It did seem to fix my WSL starting and upgrading to V2 issues, however, I lose DNS resolution in WSL when connected to the VPN.

ghost commented 3 years ago

> I may have been premature with my last post about Cisco 4.9.01095 fixing my problems. It did seem to fix my WSL starting and upgrading to V2 issues, however, I lose DNS resolution in WSL when connected to the VPN.

Yeah it's been hit and miss for me too. Numerous DNS and network issues. I'm using an Ubuntu virtual machine on my laptop for the time being until some of these issues are resolved. I've lost too much time trying to get this working.

garydgregory commented 3 years ago

Cisco AnyConnect 4.9.01095 did not help for me.

mchubby commented 3 years ago

I don't know what changed, but in 2004 Pro, build 19041.572, the WSL vSwitch was created without issue right after a PC restart. UDP 53 is still in use by SharedAccess, and both WSL and Hyper-V work alongside each other.

Recently applied hotfixes are:

Error events still get logged in H-N-S, but everything seems to work nonetheless. Quite stumped about it.

Pinche-Dev commented 3 years ago

THIS WORKED FOR ME: run netstat -a -b to find out which process is listening on each port on Windows, then use Task Manager to KILL the process listening on port 53 (usually Acrylic DNS...), then start WSL.
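
The lookup described above can be sketched in Python by parsing netstat output. Note the swap: this uses `netstat -ano`, which prints the owning PID in the last column, instead of `-a -b`, which prints executable names but requires an elevated prompt. The function name `port_listeners` and the sample text (including its PIDs) are made up for illustration, and the parser assumes English-language output.

```python
from typing import List, Tuple


def port_listeners(netstat_output: str, port: int = 53) -> List[Tuple[str, str, int]]:
    """Return (protocol, local address, PID) for sockets bound to `port`.

    Expects the text printed by `netstat -ano` on Windows.
    """
    found = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol; skip headers and blank lines.
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        protocol, local = fields[0], fields[1]
        # TCP rows carry a state column; keep only listening sockets.
        if protocol == "TCP" and fields[3] != "LISTENING":
            continue
        # The port follows the last colon, also for IPv6 such as [::]:53.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        if fields[-1].isdigit():
            found.add((protocol, local, int(fields[-1])))
    return sorted(found)


# Illustrative sample only; the addresses and PIDs are invented.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       1020
  TCP    127.0.0.1:53           0.0.0.0:0              LISTENING       4312
  UDP    0.0.0.0:53             *:*                                    4312
  UDP    [::]:53                *:*                                    4312
  UDP    0.0.0.0:5353           *:*                                    2200
"""

for row in port_listeners(SAMPLE):
    print(row)
```

On a live system the input could come from `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`, and the PID can then be looked up in Task Manager's Details tab. Checking UDP as well as TCP matters here, since several comments report the conflict on 53/udp.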

jawabuu commented 3 years ago

Here's how you can use Acrylic DNS with Docker for Win and WSL2

  1. Change the Acrylic DNS configuration to stop listening on port 53. WSL2 will not run if port 53 is not free!! I have used port 54 in my setup. (screenshot: acrylic-configuration)
  2. In Acrylic DNS, add a wildcard domain of your choice that routes to 127.0.0.1. (screenshot: acrylic-hosts)

I am using docker.local in my setup, so [anything].docker.local will always resolve to 127.0.0.1.

  3. Go to Network and Sharing Center > Change Adapter Settings. Right-click on your network and select Properties. Make sure your IPv4 DNS settings match this: (screenshot: ipv4-settings)

Make sure your IPv6 DNS settings match this: (screenshot: ipv6-settings)

  4. Start PowerShell.
  5. Verify both Acrylic DNS and Docker Desktop are running and bound to the correct ports:

Get-NetUDPEndpoint | Where {$_.LocalPort -eq "53"} | select LocalAddress,LocalPort,@{Name="Process";Expression={(Get-Process -Id $_.OwningProcess).ProcessName}}

Get-NetUDPEndpoint | Where {$_.LocalPort -eq "54"} | select LocalAddress,LocalPort,@{Name="Process";Expression={(Get-Process -Id $_.OwningProcess).ProcessName}}

(screenshot: powershell-show-ports)

  6. Verify DNS resolution works fine. (screenshot: verify)

The trick here is running Acrylic (or any other DNS proxy server) over IPV6 which WSL2 does not bind to. Alternatively, if you are able to make the Windows DNSClient query a custom port other than 53 for DNS queries you should be able to use a local DNS proxy alongside WSL2.

Maybe in registry entry [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\DNS\Parameters] "SendOnNonDnsPort"=dword:000014e9

Bonus: Create trusted certificates for your local domain

go get -u github.com/FiloSottile/mkcert
mkcert -install
mkcert "*.docker.local" docker.local localhost 127.0.0.1 ::1

(screenshot: create-cert)
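
The "verify DNS resolution" step in the guide above can also be checked from a script through the system resolver. A minimal sketch, assuming a hypothetical helper name `resolved_addresses` and a `resolver` parameter that exists only so the function can be exercised without a live DNS setup:

```python
import socket
from typing import Callable, Set


def resolved_addresses(
    hostname: str,
    resolver: Callable[..., list] = socket.getaddrinfo,
) -> Set[str]:
    """Return the set of IP addresses the resolver gives for `hostname`."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Resolution failed: report no addresses instead of raising.
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); the address
    # is the first element of sockaddr for both IPv4 and IPv6.
    return {entry[4][0] for entry in results}
```

With the wildcard from the guide in place, `"127.0.0.1" in resolved_addresses("anything.docker.local")` would be expected to hold; an empty set suggests the proxy is not being consulted.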

amirdaraee commented 3 years ago

My problem was with the Technitium DNS Server, after stopping the service the issue was gone.

SmallPackage commented 3 years ago

> THIS WORKED FOR ME: run netstat -a -b to find out which process is listening on each port on Windows, then use Task Manager to KILL the process listening on port 53 (usually Acrylic DNS...), then start WSL.

I had error 0xffffffff, and your advice really helped: I stopped the Acrylic DNS service, and the problem was solved.

tlonny commented 3 years ago

> Here's how you can use Acrylic DNS with Docker for Win and WSL2 [...]

You bloody legend

jawabuu commented 3 years ago

@tlonny Glad it helped.

VPraharsha03 commented 3 years ago

@jawabuu that kills the DNS resolution, DNS leak tests fail.

jawabuu commented 3 years ago

Hey @VPraharsha03 Could you share your nslookup results as in the image I posted?

chx commented 3 years ago

I have the same error. Nothing on port 53:

get-process -id (Get-NetTCPConnection -localport 53).OwningProcess
Get-NetTCPConnection : No MSFT_NetTCPConnection objects found with property 'LocalPort' equal to '53'.  Verify the value of the property and retry.

get-netadapter|where-object {$_.interfacedescription -like "*hyper-v*"} comes back empty.

The event log only has successful messages from Hyper-V:

The operation 'Create' succeeded on nic 318A01AF-D321-4B67-918B-3C22697D8154 (Friendly Name: WSL).
NIC 318A01AF-D321-4B67-918B-3C22697D8154 (Friendly Name: WSL) successfully connected to port 318A01AF-D321-4B67-918B-3C22697D8154 (Friendly Name: WSL) on switch 318A01AF-D321-4B67-918B-3C22697D8154(Friendly Name: WSL).
.... snip
The operation 'Delete' succeeded on nic 318A01AF-D321-4B67-918B-3C22697D8154 (Friendly Name: WSL).

Tried a netstat -d and rebooted. I stopped the AnyConnect and OpenVPN services. I also followed https://social.technet.microsoft.com/Forums/en-US/ee5b1d6b-09e2-49f3-a52c-820aafc316f9/hyperv-doesnt-work-after-upgrade-to-windows-10-1809?forum=win10itprovirt

WSL1 is fine.
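
The Get-NetTCPConnection check earlier in this comment covers TCP only, while other comments in this thread report the conflict on 53/udp. A minimal sketch of probing both protocols by attempting a bind, assuming a hypothetical helper name `can_bind`. Treat the result as a hint: a failed bind can also mean missing privileges (for example on Linux, where low ports need root), and a successful bind on one address does not rule out a listener on another address.

```python
import socket


def can_bind(port: int, kind: int = socket.SOCK_DGRAM, host: str = "127.0.0.1") -> bool:
    """Return True if a fresh socket of `kind` can bind (host, port) right now."""
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Address in use, or the OS refused the bind for another reason.
            return False
    return True


# Probe port 53 for both protocols; the probe socket is closed again at once.
for name, kind in (("udp", socket.SOCK_DGRAM), ("tcp", socket.SOCK_STREAM)):
    print(f"53/{name} bindable on 127.0.0.1: {can_bind(53, kind)}")
```

If the UDP probe fails while the TCP probe succeeds, a UDP-only listener such as the SharedAccess service described earlier in the thread is one possible explanation.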