rancher-sandbox / rancher-desktop

Container Management and Kubernetes on the Desktop
https://rancherdesktop.io
Apache License 2.0

Rancher Desktop dashboard for local cluster stops working when `KUBECONFIG` environment variable is set #4149

Open gi-dorio opened 1 year ago

gi-dorio commented 1 year ago

Actual Behavior

I've been using Rancher Desktop both for local development and to run kubectl against a remote cluster from the CLI. To do so, I added my remote cluster's kubeconfig.yaml to the .kube folder and set a global environment variable called KUBECONFIG that points at both the existing config and my custom kubeconfig. Doing this breaks the Rancher Desktop dashboard for the local cluster.

Steps to Reproduce

  1. Add another kubeconfig to the .kube folder so that the folder looks like this:
     $HOME
       |-.kube
         |-cache
         |-config
         |-my-custom-kubeconfig.yaml
  2. Set a global environment variable like this: KUBECONFIG=$HOME\.kube\config;$HOME\.kube\my-custom-kubeconfig.yaml
  3. Start Rancher Desktop
  4. Open the dashboard

Result

When I open the dashboard for the local cluster, it gets stuck loading indefinitely. The dashboard logs are as follows:

2023-03-10T07:49:56.896Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.898Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.899Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.899Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:49:56.900Z: [HPM] Proxy created: /  -> https://127.0.0.1:9443
2023-03-10T07:51:49.604Z: Proxy Error: Error: connect ECONNREFUSED 127.0.0.1:9443
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1161:16) {
  errno: -4078,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 9443
}
2023-03-10T07:51:49.604Z: [HPM] Error occurred while proxying request 127.0.0.1:6120/api/v1/namespaces/cattle-ui-plugin-system/services/http:ui-plugin-operator:80/proxy/index.json to https://127.0.0.1:9443/ [ECONNREFUSED] (https://nodejs.org/api/errors.html#errors_common_system_errors)

Expected Behavior

The dashboard opens up and shows me the local cluster

Additional Information

I am behind a corporate proxy, but I don't think this should interfere, and I get the same error on another connection. I can still switch contexts and reach both the remote cluster and the local one via the CLI; the only thing that doesn't work is the dashboard.

If, instead of adding the kubeconfig to the folder as a separate file, I merge its contents into the config that's already there, obtaining something along the lines of

apiVersion: v1
kind: Config
clusters:
  - name: <my-remote-cluster>
    cluster:
      server: https://<remote-cluster-host>/k8s/clusters/<cluster-id>
      insecure-skip-tls-verify: false
  - name: rancher-desktop
    cluster:
      server: https://172.24.19.232:6443
      certificate-authority-data: <certificate-authority-data> 
      insecure-skip-tls-verify: false
users:
  - name: <my-remote-cluster>
    user:
      exec:
        apiVersion: client.authentication.k8s.io/v1beta1
        args:
          - token
          - --server=<remote-cluster-host>
          - --user=my-remote-cluster
          - --cluster=<cluster-id>
        command: rancher
        env: null
        interactiveMode: IfAvailable
        provideClusterInfo: false
  - name: rancher-desktop
    user:
      client-certificate-data: <client-certificate-data>
      client-key-data: <client-key-data> 
contexts:
  - name: <my-remote-cluster>
    context:
      cluster: <my-remote-cluster>
      name: <my-remote-cluster>
      user: <my-remote-cluster>
  - name: rancher-desktop
    context:
      cluster: rancher-desktop
      name: rancher-desktop
      user: rancher-desktop
preferences: {}
current-context: rancher-desktop

the dashboard works as expected and I can also operate on the remote cluster

Rancher Desktop Version

1.7.0

Rancher Desktop K8s Version

1.22.6

Which container engine are you using?

moby (docker cli)

What operating system are you using?

Windows

Operating System / Build Version

Windows 11 Enterprise, 10.0.22621, build 22621

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

None

Windows User Only

I am behind a corporate proxy when in the office and I use a VPN when I'm home. The problem persists even when I'm not in one of the two cases mentioned above (with a standard home internet connection).

adamkpickering commented 1 year ago

I can reproduce this on Windows 10, without any VPN, proxy, etc. configured.

My steps to reproduce:

  1. Copy the kubeconfig created by RD to test-config.yaml in the same directory (~/.kube). Replace all instances of rancher-desktop in that file with test-rancher-desktop.
  2. In my user environment variables, set KUBECONFIG to %USERPROFILE%\.kube\config;%USERPROFILE%\.kube\test-config.yaml.
  3. Restart anything you want KUBECONFIG to be updated in.
  4. Test kubectl with kubectl config view. It should load properly.
  5. Start Rancher Desktop. Once it has fully started up, open dashboard. Note that it doesn't load.
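Step 1 above can also be scripted. The following is a minimal sketch in Go, assuming the kubeconfig lives at ~/.kube/config and that a plain text replacement of the name is sufficient; the helper name renameEntries is hypothetical and not part of Rancher Desktop.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// renameEntries returns the kubeconfig text with every occurrence of
// oldName replaced by newName. It is a plain string replacement and
// does not parse the YAML.
func renameEntries(kubeconfig, oldName, newName string) string {
	return strings.ReplaceAll(kubeconfig, oldName, newName)
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot find home directory:", err)
		return
	}
	src := filepath.Join(home, ".kube", "config")
	dst := filepath.Join(home, ".kube", "test-config.yaml")

	data, err := os.ReadFile(src)
	if err != nil {
		fmt.Println("cannot read kubeconfig:", err)
		return
	}
	out := renameEntries(string(data), "rancher-desktop", "test-rancher-desktop")
	// 0o600 keeps the copy private, since it contains client credentials.
	if err := os.WriteFile(dst, []byte(out), 0o600); err != nil {
		fmt.Println("cannot write copy:", err)
		return
	}
	fmt.Println("wrote", dst)
}
```

Running it twice against its own output would turn the name into test-test-rancher-desktop, so it should always read from the original config.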

@GiuseppeCSI you replaced $HOME in your KUBECONFIG with %USERPROFILE%, or something like that, right? I couldn't reproduce the problem (and kubectl didn't work) when I was using $HOME.

adamkpickering commented 1 year ago

@rak-phillip does this bring any potential causes to mind?

jandubois commented 1 year ago

> does this bring any potential causes to mind?

I suspect that the dashboard does not support one or more of these things:

The whole KUBECONFIG handling depends on the kube client library being used.

rak-phillip commented 1 year ago

Following the repro steps produces this when starting Steve:

.\steve.exe
time="2023-03-10T14:26:02-08:00" level=fatal msg="error loading config file \"C:\\Users\\phillip\\.kube\\config;C:\\Users\\phillip\\.kube\\my-custom-kube-config.yaml\": open C:\\Users\\phillip\\.kube\\config;C:\\Users\\phillip\\.kube\\my-custom-kube-config.yaml: The filename, directory name, or volume label syntax is incorrect."

gi-dorio commented 1 year ago

@adamkpickering I can confirm that I was using $HOME, but I guess it depends on which shell you are using. I use PowerShell, where environment variables are written as $SOMETHING, while in cmd they are written as %SOMETHING%. Don't quote me on this though; I don't have much experience with Windows shells and only switched to them recently.

@jandubois So, out of curiosity, how does the dashboard work in this regard? The normal kubectl CLI has no problem accepting a KUBECONFIG value that contains an environment variable and the ; separator, so what does the dashboard do differently? Also, isn't it odd that the logs report a proxy error?

rak-phillip commented 1 year ago

Two components allow the dashboard feature to run in Rancher Desktop: Steve (the backend API) and a custom server that serves the Dashboard UI and proxies requests. This implementation is similar to how everything behaves in a production deployment of Rancher Manager, but customized for Rancher Desktop.

The proxy errors might be misleading, but they reveal that the connection to Steve is refused. In this case, Steve never started because it doesn't handle the case of parsing multiple configs on Windows (possibly other environments as well, needs validation). Hopefully, this explains why we might see discrepancies between Dashboard and kubectl.

taz77 commented 9 months ago

This is related to the issue linked below; the root problem is how Rancher handles your kubeconfigs.

https://github.com/rancher-sandbox/rancher-desktop/issues/3216