vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

velero CLI fails to authenticate with complex KUBECONFIG #2097

Closed rombert closed 3 years ago

rombert commented 4 years ago

What steps did you take and what happened:

I am using the velero CLI with a KUBECONFIG env variable set to multiple files, e.g.

$HOME/.config/kube/cluster001-kubeconfig.yaml:$HOME/.config/kube/foo.yaml:$HOME/.config/kube/homelab.yaml:

The CLI is unable to authenticate by default:

$ velero version
Client:
    Version: v1.2.0
    Git commit: 5d008491bbf681658d3e372da1a9d3a21ca4c03c
<error getting server version: Unauthorized>

On the other hand, when setting the env variable to a single file (the one for the cluster where velero is deployed), velero works:

$ KUBECONFIG=~/.config/kube/cluster001-kubeconfig.yaml velero version
Client:
    Version: v1.2.0
    Git commit: 5d008491bbf681658d3e372da1a9d3a21ca4c03c
Server:
    Version: v1.1.0

What did you expect to happen:

The CLI tool should support the same type of KUBECONFIG settings that kubectl does.

The output of the following commands will help us better understand what's going on:

(not applicable)

Environment:

not sure

Linode Kubernetes Engine

markrity commented 4 years ago

@rombert I have just tested this with a configuration similar to yours, and for me it works fine. Do you still have this issue?

rombert commented 4 years ago

@markrity - I'll double-check and let you know

rombert commented 4 years ago

@markrity - this still fails for me. In the meantime I upgraded kubectl to 1.17.0, but that did not help.

I tried debugging a bit myself and have some extra information; maybe it rings some bells:

The following scenarios work:

The following scenarios do not work:

I ran the velero -v 9 version command for both working and failing scenarios, and the HTTP calls seem to be the same for both. Notably, the server IP is the correct one.

The C file references a development cluster running on a private network on my machine, e.g.

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <REDACTED>
    server: https://10.24.0.70:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    namespace: datadog
    user: kubernetes-admin
  name: homelab
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: <REDACTED>
    client-key-data: <REDACTED>
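
One detail in the file above: current-context is kubernetes-admin@kubernetes, while the only context it defines is named homelab. kubectl's documented merge rule for a multi-file KUBECONFIG is that the first file to set a value wins, so it may help to see which file in the list sets current-context first. Below is a minimal sketch, assuming plain line-based matching rather than a real YAML parser; first_current_context is a hypothetical helper name, not part of kubectl or velero.

```shell
#!/bin/sh
# Sketch: report which file in a colon-separated kubeconfig list is the
# first to set current-context. Hypothetical helper, line-based matching only.
first_current_context() (
    # The function body runs in a subshell, so the IFS change stays local.
    IFS=:
    # $1: colon-separated list of paths; falls back to $KUBECONFIG.
    for file in ${1:-$KUBECONFIG}; do
        # Skip empty entries (e.g. from a trailing colon) and missing files.
        [ -n "$file" ] && [ -f "$file" ] || continue
        # Take the first top-level current-context line, if any.
        ctx=$(sed -n 's/^current-context:[[:space:]]*//p' "$file" | head -n 1)
        if [ -n "$ctx" ]; then
            printf '%s sets current-context to %s\n' "$file" "$ctx"
            return 0
        fi
    done
    echo "no file sets current-context" >&2
    return 1
)
```

Calling first_current_context with no argument inspects the current $KUBECONFIG. If the reported context name is not defined in any of the listed files, that mismatch is worth ruling out before looking at velero itself.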

A couple of questions:

Thanks!

markrity commented 4 years ago

@rombert The only scenario that is missing, from my point of view, is KUBECONFIG=C, because C looks like the common factor in the failures.

I will take a deeper look soon, thanks for providing more information!

rombert commented 4 years ago

@markrity - I did not try that since the cluster is configured with file A, so using C would not lead to a meaningful result.

nrb commented 3 years ago

@rombert @markrity Thank you both for looking into this!

Velero's been updated with new Kubernetes APIs since January - is this still happening with v1.4.x and v1.5.x? Admittedly I'm not running against multiple clusters, but if it's still happening, I'll get it allocated into the backlog.

rombert commented 3 years ago

@nrb - I just checked and the problem does not seem to be there anymore

$ velero version
Client:
    Version: v1.4.2
    Git commit: 56a08a4d695d893f0863f697c2f926e27d70c0c5
Server:
    Version: v1.3.1

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-15T00:00:00Z", GoVersion:"go1.15.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.12", GitCommit:"5ec472285121eb6c451e515bc0a7201413872fa3", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:12Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

nrb commented 3 years ago

@rombert Interesting, thanks! I'm going to close this issue out for now. Of course, feel free to comment again if you run into further problems.

STYNET commented 2 years ago

I use Velero v1.8.1 and I have the same problem.

When I use multiple contexts:

$ echo $KUBECONFIG 
~/.kube/config:~/.kube/config:/home/xxxxx/Documents/XXX/k8s-ovh/kubeconfig.yml

I get a connection error:

$ velero debug
.2022/06/17 14:38:19 Collecting velero resources in namespace: velero
An error occurred: exec failed: Traceback (most recent call last):
  velero-debug-collector:23:13: in <toplevel>
  <builtin>: in kube_get
Error: could not initialize search client: invalid configuration: [context was not found for specified context: kubernetes-admin@XXX, cluster has no server defined]

But the context error message disappears if I reconfigure my KUBECONFIG with only one context:

$ export KUBECONFIG='/home/xxxxx/Documents/XXX/k8s-ovh/kubeconfig.yml'

$ velero debug 
2022/06/17 14:54:32 Collecting velero resources in namespace: velero
2022/06/17 14:54:34 Collecting velero deployment logs in namespace: velero
2022/06/17 14:54:35 Generated debug information bundle: ~/bundle-2022-06-17-14-54-32.tar.gz

Is this a bug, or am I using it incorrectly?
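
One thing that may be worth checking first: the echo output above shows entries starting with a literal ~. Shells expand ~ only when it is unquoted, so a value set inside quotes keeps the ~ as an ordinary character. Whether velero or its debug collector expands such paths is not confirmed anywhere in this thread, so treat that as an assumption to verify. Below is a minimal sketch that lists each KUBECONFIG entry and flags literal-tilde or missing paths; check_kubeconfig_entries is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: flag KUBECONFIG entries that may not resolve to a real file.
# Hypothetical helper; not part of velero or kubectl.
check_kubeconfig_entries() (
    # The function body runs in a subshell, so the IFS change stays local.
    IFS=:
    status=0
    # $1: colon-separated list of paths; falls back to $KUBECONFIG.
    for entry in ${1:-$KUBECONFIG}; do
        case $entry in
            '')   continue ;;   # empty entry, e.g. from a doubled colon
            '~'*) echo "literal-tilde: $entry"; status=1 ;;
            *)    if [ -f "$entry" ]; then
                      echo "ok: $entry"
                  else
                      echo "missing: $entry"; status=1
                  fi ;;
        esac
    done
    # Non-zero when at least one entry looks suspicious.
    return "$status"
)
```

Running check_kubeconfig_entries with no argument checks the current $KUBECONFIG. If any entry is flagged, replacing ~ with $HOME or an absolute path and re-running velero debug would show whether the error depends on it.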