derailed / k9s

🐶 Kubernetes CLI To Manage Your Clusters In Style!
https://k9scli.io
Apache License 2.0

"Plugins load failed!" when no current context #2651

Open dewe opened 5 months ago

dewe commented 5 months ago

Describe the bug When starting k9s without a current context set, I get an error saying Plugins load failed!. With a current context set, or with the --context flag passed at startup, it works as expected. I have no plugins.

To Reproduce

$ kubectl config unset current-context
Property "current-context" unset.

$ k9s
...
[Screenshot: 2024-04-04 at 09 15 44]

Historical Documents

k9s logs:

9:25AM INF 🐶 K9s starting up...
9:25AM ERR Fail to locate metrics-server error="Get \"http://localhost:8080/api\": dial tcp [::1]:8080: connect: connection refused"
9:25AM ERR config refine failed error="unable to activate context \"\": getcontext - invalid context specified: \"\""
9:25AM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
9:25AM INF ✅ Kubernetes connectivity
9:25AM WRN Save failed. no active config detected
9:25AM ERR Fail to load global/context configuration error="Get \"http://localhost:8080/api\": dial tcp [::1]:8080: connect: connection refused\nunable to activate context \"\": getcontext - invalid context specified: \"\"\ncannot connect to context: \nk8s connection failed for context: "
9:25AM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
9:25AM ERR Load cluster resources - No API server connection
9:25AM ERR failed to list contexts error="no connection"
9:25AM WRN Unable to dial discovery API error="no connection to dial"
9:25AM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
9:25AM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
9:25AM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
9:25AM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
9:25AM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
9:25AM ERR Load cluster resources - No API server connection
9:25AM WRN Unable to dial discovery API error="no connection to dial"
9:25AM WRN Plugins load failed: getcontext - invalid context specified: ""
9:25AM WRN Plugins load failed: getcontext - invalid context specified: ""
9:25AM WRN Plugins load failed: getcontext - invalid context specified: ""
9:26AM ERR failed to list namespaces error="user not authorized to list all namespaces"
9:26AM WRN Save failed. no active config detected
9:26AM ERR nuking k9s shell pod error="getcontext - invalid context specified: \"\""

Expected behavior I end up in the k9s context menu with no error.

Versions (please complete the following information):

emilkor1 commented 5 months ago

Happens to me as well, both with and without plugins. It seems to block me from choosing a context for ~5 seconds with the following error: "😡 no connection to cached dial". Does not happen with k9s --context ....

wazazaby commented 5 months ago

Encountering the exact same problem as Emil, on K9s v0.32.4 and K8s v1.29.1.

michaelfich commented 4 months ago

I've also been encountering this issue. If I set a context explicitly prior to opening k9s, it's fine, but I cannot switch to another context while in k9s without it breaking.

jtnz commented 4 months ago

Another way to reproduce this is:

KUBECONFIG='' k9s

Not only does this issue prevent you from selecting a context, it is also the cause of another long-standing issue I've had: k9s will exit if you don't select a context fast enough.

We don't want to set a context (current-context: "") as we have many different k8s clusters and don't want any tooling connecting to one by default. We use the following as our ~/.kube/config file, as generated by:

$ KUBECONFIG='' kubectl config view
apiVersion: v1
clusters: null
contexts: null
current-context: ""
kind: Config
preferences: {}
users: null

We then have a standalone config file per cluster, joined together in $KUBECONFIG, e.g.

$ echo $KUBECONFIG
/home/foo/.kube/config:/home/foo/.kube/cluster-1:/home/foo/.kube/cluster-2:/home/foo/.kube/cluster-3

This all works 100% fine with kubectl --context context-1.
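As others noted above, launching k9s with the context passed explicitly also avoids the error (using context-1 from the layout above):

$ k9s --context context-1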

Also tested by wiping out ~/.local/share/k9s, ~/.local/state/k9s, and ~/.config/k9s.

Here's the full logs (KUBECONFIG='' k9s -l debug):

```
4:14PM INF 🐶 K9s starting up...
4:14PM ERR Fail to locate metrics-server error="Get \"http://localhost:8080/api\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR config refine failed error="unable to activate context \"\": getcontext - invalid context specified: \"\""
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM INF ✅ Kubernetes connectivity
4:14PM WRN Save failed. no active config detected
4:14PM ERR Fail to load global/context configuration error="Get \"http://localhost:8080/api\": dial tcp [::1]:8080: connect: connection refused\nunable to activate context \"\": getcontext - invalid context specified: \"\"\ncannot connect to context: \nk8s connection failed for context: "
4:14PM DBG [Skin] Loading global skin ("skin")
4:14PM DBG Loading skin file: "/home/foo/.config/k9s/skins/skin.yaml"
4:14PM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
4:14PM DBG Factory START with ns `"default"
4:14PM ERR Load cluster resources - No API server connection
4:14PM ERR failed to list contexts error="no connection"
4:14PM DBG Fetching latest k9s rev...
4:14PM DBG K9s latest rev: "v0.32.4"
4:14PM DBG [Skin] Loading global skin ("skin")
4:14PM DBG Loading skin file: "/home/foo/.config/k9s/skins/skin.yaml"
4:14PM WRN Unable to dial discovery API error="no connection to dial"
4:14PM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
4:14PM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
4:14PM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
4:14PM ERR Unable to assert active namespace. Using default error="getcontext - invalid context specified: \"\""
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR Load cluster resources - No API server connection
4:14PM WRN Unable to dial discovery API error="no connection to dial"
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM WRN Plugins load failed: getcontext - invalid context specified: ""
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR ClusterUpdater failed error="conn check failed (1/5)"
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR ClusterUpdater failed error="conn check failed (2/5)"
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR ClusterUpdater failed error="conn check failed (3/5)"
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR ClusterUpdater failed error="conn check failed (4/5)"
4:14PM ERR can't connect to cluster error="Get \"http://localhost:8080/version?timeout=15s\": dial tcp [::1]:8080: connect: connection refused"
4:14PM ERR Conn check failed (5/5). Bailing out!
4:14PM WRN Save failed. no active config detected
4:14PM ERR nuking k9s shell pod error="getcontext - invalid context specified: \"\""
```
mahdizojaji commented 3 months ago

I have this issue too.

bakgaard commented 3 months ago

I got around my issue by specifying, in my kube-config, the namespace I had access to for the context. I was not allowed to see all namespaces, but limiting it to a specific one solved it.
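For reference, that workaround amounts to setting a namespace on the context entry in ~/.kube/config; the cluster, user, context, and namespace names below are illustrative:

contexts:
- context:
    cluster: my-cluster
    user: my-user
    namespace: my-namespace  # limit to a namespace you are allowed to list
  name: my-context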

mahdizojaji commented 3 months ago

> I got around my issue by specifying, in my kube-config, the namespace I had access to for the context. I was not allowed to see all namespaces, but limiting it to a specific one solved it.

When we have multiple clusters, we can't limit to a specific namespace.

tws-rdelatorre commented 3 months ago

The issue seems to be at the point where k9s tries to load plugins based on the context while said context is empty. I gave a PR a shot that simply skips loading plugins from the context-based directory when the context is empty.
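A minimal sketch of that guard, assuming a hypothetical directory layout and function names (this is not k9s's actual internals):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// contextPluginDir mirrors the failing lookup: resolving a per-context
// plugin directory from a context name fails when the name is empty.
func contextPluginDir(root, context string) (string, error) {
	if context == "" {
		return "", fmt.Errorf("getcontext - invalid context specified: %q", context)
	}
	return filepath.Join(root, "clusters", context, "plugins"), nil
}

// loadContextPlugins applies the proposed guard: with no active context
// there is nothing context-specific to load, so return cleanly instead
// of bubbling up "Plugins load failed!".
func loadContextPlugins(root, context string) ([]string, error) {
	if context == "" {
		return nil, nil
	}
	dir, err := contextPluginDir(root, context)
	if err != nil {
		return nil, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var plugins []string
	for _, e := range entries {
		if !e.IsDir() {
			plugins = append(plugins, filepath.Join(dir, e.Name()))
		}
	}
	return plugins, nil
}

func main() {
	root := filepath.Join(os.Getenv("HOME"), ".local", "share", "k9s")
	plugins, err := loadContextPlugins(root, "")
	fmt.Println(plugins, err) // [] <nil>: an empty context no longer errors
}
```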

J-eremy commented 3 months ago

I'm also having this exact same issue.

pcn commented 3 months ago

One additional situation that triggers this: when the gke-gcloud-auth-plugin isn't available, k9s can't authenticate to the cluster, but the errors aren't passed up to the user. If e.g. you are using asdf, you may have to run asdf reshim to make the installed binaries available.

jkroepke commented 2 months ago

We are running k9s inside a container. In our case, there is no kubeconfig, only the in-cluster config.

It worked months ago (up to 0.29), but now it seems broken.

ep4sh commented 2 months ago

Facing the same on Arch, Kubernetes v1.28.2, k9s v0.32.4.

yogeshelke commented 2 months ago

Observing this issue for one of my contexts; all other contexts work well. K9s Rev: v0.32.5, K8s Rev: v1.29.4-eks-036c24b.

BenjaminAlderson commented 2 months ago

FWIW, I was having a similar issue this morning after putting my Mac to sleep last week with multiple k9s instances running: the local namespace config.yaml had some trash appended to the end of the file. Once I fixed the config YAML it was fine. I discovered this by viewing the namespaces in the context; when I tried to "use" the namespace I was after, it reported a missing ":" on line 20, which was the end of the file, where I found the trash (part of the word "localhost", but without the "loc" at the start). So, perhaps unrelated, but there may be an issue with multiple k9s instances viewing the same namespace saving the config on shutdown when the machine goes idle/sleeps.

majorku5anagi commented 2 months ago

This issue appeared for me when I decided to rename my initially given default context names to something more meaningful. After a k9s restart, the issue appeared. What I found interesting is that when I opened my ~/.kube/config file, the context names were correctly renamed throughout the file, just as I had done in k9s, but current-context still had the old context name defined (the name prior to my renaming).

⚠️ This means that this field is not updated by k9s after the renaming, nor after choosing some other context.

When I manually set current-context to one of the new names, it worked well, without issues, but now it's basically hardcoded, so I have to use the ctx command to jump to another context after initially being loaded into current-context. It feels like k9s should have control over the current-context value in ~/.kube/config and set it to the last chosen context.
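For reference, the manual fix is a single kubectl command (context name illustrative):

$ kubectl config use-context my-renamed-context
Switched to context "my-renamed-context".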

Additionally, as previously said, it would be sweet to also support having no context defined at all, so that you have to choose one right at startup without any errors (perhaps via a flag in the k9s config.yaml file).

PraneethGopinathan commented 1 month ago

I fixed this by exporting the config file to the native Kubernetes path: microk8s config > ~/.kube/config. After doing this, just open k9s and it works fine (this worked for me since k9s by default looks for contexts in the default k8s path).

BigDaddyJay90 commented 3 days ago

> When I manually set current-context to one of the new names, it worked well, without issues, but now it's basically hardcoded, so I have to use the ctx command to jump to another context after initially being loaded into current-context. It feels like k9s should have control over the current-context value in ~/.kube/config and set it to the last chosen context.
>
> Additionally, as previously said, it would be sweet to also support having no context defined at all, so that you have to choose one right at startup without any errors (perhaps via a flag in the k9s config.yaml file).

This seems to be a problem only in k9s version 0.32.5. I had the same issue and fixed it with a rollback to k9s version 0.31.5. After that it worked just fine without setting a current-context in my config file.