rancher-sandbox / rancher-desktop

Container Management and Kubernetes on the Desktop
https://rancherdesktop.io
Apache License 2.0

Kubernetes fails to start #7009

Closed ticktockhouse closed 4 months ago

ticktockhouse commented 4 months ago

Actual Behavior

Kubernetes fails to start on Windows 11

Steps to Reproduce

Result

Kubernetes fails to start with the error:

2024-06-06T14:39:51.806Z: Registered distributions: Ubuntu,rancher-desktop
2024-06-06T14:39:51.806Z: Creating initial data distribution...
2024-06-06T14:40:15.701Z: Did not find a valid mount, mounting /mnt/wsl/rancher-desktop/run/data
2024-06-06T14:40:19.674Z: Installing C:\Program Files\Rancher Desktop\resources\resources\linux\internal\trivy as /mnt/c/Program Files/Rancher Desktop/resources/resources/linux/internal/trivy into /usr/local/bin/trivy ...
2024-06-06T14:40:21.215Z: Installing C:\Program Files\Rancher Desktop\resources\resources\linux\internal\rancher-desktop-guestagent as /mnt/c/Program Files/Rancher Desktop/resources/resources/linux/internal/rancher-desktop-guestagent into /usr/local/bin//rancher-desktop-guestagent ...
2024-06-06T14:40:28.980Z: WSL: executing: busybox readlink -f /etc/docker/daemon.json: Error: wsl.exe exited with code 1

2024-06-06T14:40:37.686Z: WSL: executing: cat /root/.docker/config.json: Error: wsl.exe exited with code 1

Expected Behavior

Kubernetes should start with no issues

Additional Information

No response

Rancher Desktop Version

1.13.1

Rancher Desktop K8s Version

1.29.5

Which container engine are you using?

containerd (nerdctl)

What operating system are you using?

Windows

Operating System / Build Version

Windows 11 Version 23H2 (OS Build 22631.3593)

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

None

Windows User Only

AWS VPN (not running)

mook-as commented 4 months ago

Hi! Unfortunately, the logs you have posted don't indicate what the issue might be; the readlink and cat errors are benign (they just indicate that no existing configuration was found, which makes sense after a factory reset).

I think you can close the error dialog at that point and still access the Show Logs button; if that's not possible, please manually examine C:\Users\<name>\AppData\Local\rancher-desktop\logs for them. Please review for private information and then attach them to the bug, thanks!
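For anyone collecting these files by hand, a minimal Python sketch along the following lines could list what is in that directory before attaching anything. It assumes the `LOCALAPPDATA` environment variable resolves to `C:\Users\<name>\AppData\Local`, as it normally does on Windows; the function name `list_logs` is made up for illustration and is not part of Rancher Desktop.

```python
import os
from pathlib import Path


def list_logs(log_dir: Path) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for every *.log file in log_dir, sorted by name."""
    return sorted((p.name, p.stat().st_size) for p in log_dir.glob("*.log"))


if __name__ == "__main__":
    # Assumption: LOCALAPPDATA is set, as it normally is on Windows.
    base = os.environ.get("LOCALAPPDATA")
    if base is None:
        print("LOCALAPPDATA is not set; pass the log directory explicitly.")
    else:
        for name, size in list_logs(Path(base) / "rancher-desktop" / "logs"):
            print(f"{size:>10}  {name}")
```

The listing only shows names and sizes; the files still need to be reviewed for private information before they are uploaded.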

ticktockhouse commented 4 months ago

Please find log files attached

The error that shows up in the UI appears to come from wsl.log

wsl-helper.Ubuntu.log k3s.log host-resolver-peer.log vtunnel-peer.log wsl-init.log wsl.log host-resolver-host.log vtunnel-host.log wsl-exec.log

mook-as commented 4 months ago

Hi! Unfortunately, the logs you posted did not include the relevant information — please attach background.log in particular.

Could you please enable debug logging, restart Rancher Desktop, and then attach all of the logs? Thanks!

ticktockhouse commented 4 months ago

Please find the new files attached. It seems that after a factory reset, debug logging can only be enabled after the fact (i.e. it's not possible to enable debug logging and then start the initialisation process), so I hope everything that is needed has been captured:

wsl.log background.log wsl-helper.Ubuntu.log rancher-desktop-guestagent.log wsl-exec.log vtunnel-host.log host-resolver-host.log dashboardServer.log wsl-init.log shortcuts.log update.log k8s.log

mook-as commented 4 months ago

Thanks for the new logs! background.log says:

2024-06-10T08:21:28.215Z: Kubernetes was unable to start: Error: connect ETIMEDOUT 172.31.23.23:6443
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16) {
  errno: -4039,
  code: 'ETIMEDOUT',
  syscall: 'connect',
  address: '172.31.23.23',
  port: 6443
}
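The error above is a plain TCP connect timing out, so it can be checked independently of Rancher Desktop. The following is a minimal Python sketch of such a probe, not anything Rancher Desktop itself runs; `can_connect` is a hypothetical helper name, and the five-second default timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False
```

Calling `can_connect("172.31.23.23", 6443)` from the Windows host would be one way to see whether the address from this log is reachable at all; the address is specific to this report and will differ on other machines.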

And we can see related logs in k8s.log:


2024-06-10T08:13:36.872Z: Waiting for K3s server to be ready on port 6443...
2024-06-10T08:13:58.899Z: Error: connect ETIMEDOUT 172.31.23.23:6443
…
2024-06-10T08:14:22.243Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:14:46.285Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:15:10.337Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:15:34.389Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:15:58.443Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:16:22.494Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:16:46.536Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:17:10.584Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:17:34.639Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:17:58.672Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:18:22.704Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:18:46.767Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:19:10.822Z: Error fetching services: Error: connect ETIMEDOUT 172.31.23.23:6443
2024-06-10T08:19:13.835Z: Waited more than 300 secs for kubernetes to fully start up. Giving up.
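When a log is long, it can help to count the timeouts per target rather than read them line by line. Here is a rough Python sketch that does this for lines shaped like the ones above; the regular expression and the name `count_timeouts` are assumptions for illustration, based only on the format visible in this excerpt.

```python
import re
from collections import Counter

# Matches e.g. "Error: connect ETIMEDOUT 172.31.23.23:6443"
TIMEOUT_RE = re.compile(r"connect ETIMEDOUT (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def count_timeouts(log_text: str) -> Counter:
    """Count ETIMEDOUT connection errors per (address, port) in log_text."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

A single (address, port) pair dominating the result, as in this log, points at one unreachable endpoint rather than a general network failure.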

Please see if turning on tunnelled networking works for you: either start Rancher Desktop from the command line, with rdctl start --experimental.virtual-machine.networking-tunnel --application.debug (these settings persist afterwards), or upgrade to Rancher Desktop 1.14 where it's the default.

ticktockhouse commented 4 months ago

I enabled "enable networking tunnel" under Preferences -> WSL -> Network, which seemed to have the same effect. I certainly don't have the startup errors any more, just not quite sure what to do next :)

mook-as commented 4 months ago

Yes, that flips the same setting under the hood.

When you say you don't have the startup errors any more, does that mean things are now working correctly? Or are they still silently broken? (I'm assuming here that you confirmed things by restarting Rancher Desktop too…)

ticktockhouse commented 4 months ago

Yup, I used nerdctl to start a rancher/rancher container. Is this a sufficient sanity check? Is there an easy way to confirm that kubernetes is running?

mook-as commented 4 months ago

The easiest is just kubectl get pods -A to see if anything comes back (there would normally be some internal ones at least), or you can try kubectl cluster-info to see an overview.
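The same check can be scripted. The sketch below is one possible way to do it in Python, assuming `kubectl` is on the PATH and already pointed at the Rancher Desktop cluster; it uses kubectl's standard `-o json` output instead of the table, and the function names `summarize_pods` and `cluster_pods` are hypothetical.

```python
import json
import subprocess


def summarize_pods(pods_json: str) -> list[tuple[str, str, str]]:
    """Return (namespace, name, phase) for each pod in `kubectl get pods -A -o json` output."""
    items = json.loads(pods_json).get("items", [])
    return [
        (
            pod["metadata"]["namespace"],
            pod["metadata"]["name"],
            pod.get("status", {}).get("phase", "Unknown"),
        )
        for pod in items
    ]


def cluster_pods(timeout: float = 30.0) -> list[tuple[str, str, str]]:
    """Run kubectl and summarize the pods; raises if kubectl is missing, fails, or times out."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return summarize_pods(result.stdout)
```

If `cluster_pods()` returns a non-empty list, the API server is answering, which is the same signal as seeing the internal pods come back from `kubectl get pods -A`.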

ticktockhouse commented 4 months ago

Thanks for all your assistance, it's now working \0/