vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

Velero does not authenticate to multiple clusters when switching contexts #1637

Closed · switchboardOp closed 5 years ago

switchboardOp commented 5 years ago

What steps did you take and what happened: I installed Velero in an on-prem cluster, with backup storage located in Azure.

I have multiple kube contexts in multiple files that are unified with the KUBECONFIG environment variable.

Velero does not authenticate to both clusters when switching contexts, but does work when the kubeconfig file is specified.

In the dev context I can list pods and issue velero commands:

$ [☸ dev:velero] k get pods 
NAME                      READY   STATUS    RESTARTS   AGE
velero-5dd4bbdd9c-dtrsz   1/1     Running   0          22h
$ [☸ dev:velero] velero version 
Client:
    Version: v1.0.0
    Git commit: -
Server:
    Version: v1.0.0

After switching to the prod context I can list pods but not issue velero commands:

$ [☸ dev:velero] kubectx prod
Switched to context "prod".
$ [☸ prod:velero] k get pods 
NAME                      READY   STATUS    RESTARTS   AGE
velero-5dd4bbdd9c-59jxr   1/1     Running   0          22h
$ [☸ prod:velero] velero version   
Client:
    Version: v1.0.0
    Git commit: -
<error getting server version: the server has asked for the client to provide credentials (post serverstatusrequests.velero.io)>

When explicitly specifying a kubeconfig file, I can issue velero commands in the prod context:

$ [☸ prod:velero] KUBECONFIG=~/.kube/prod-kube-config velero version 
Client:
    Version: v1.0.0
    Git commit: -
Server:
    Version: v1.0.0

What did you expect to happen: I expected Velero to pick up the current context from kubectl and authenticate against the API server.


Anything else you would like to add: I'm using kubectx to switch contexts, but the results are the same with the native kubectl config commands. Both clusters are essentially identical in configuration, and both the dev and prod contexts are set up to use the Rancher auth proxy.

nrb commented 5 years ago

I made the following change on my local machine to see what context was being passed into Velero's client.

--- i/pkg/client/client.go
+++ w/pkg/client/client.go
@@ -34,6 +34,8 @@ func Config(kubeconfig, kubecontext, baseName string) (*rest.Config, error) {
        loadingRules.ExplicitPath = kubeconfig
        configOverrides := &clientcmd.ConfigOverrides{CurrentContext: kubecontext}
        kubeConfig := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(loadingRules, configOverrides)
+       rawConfig, _ := kubeConfig.RawConfig()
+       fmt.Printf("kubecontext: %s\n", rawConfig.CurrentContext)
        clientConfig, err := kubeConfig.ClientConfig()
        if err != nil {
                return nil, errors.WithStack(err)

I'm currently operating with a single kubeconfig file.

On my normal context, I see this:

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% velero version
kubecontext: admin@kubernetes
Client:
        Version: master
        Git commit: 63964fc6f9df327919d27facccf31448bb36f07f-dirty
Server:
        Version: master

Next, I try to make a new, empty context (one with no cluster or user attached) and use it.

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% k config set-context context2
Context "context2" created.

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% k config use-context context2
Switched to context "context2".

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% velero version
kubecontext: context2
An error occurred: invalid configuration: no configuration has been provided

The same with kubectx:

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% kubectx admin@kubernetes
Switched to context "admin@kubernetes".

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% velero version
kubecontext: admin@kubernetes
Client:
        Version: master
        Git commit: 63964fc6f9df327919d27facccf31448bb36f07f-dirty
Server:
        Version: master

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% kubectx context2
Switched to context "context2".

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% velero version
kubecontext: context2
An error occurred: invalid configuration: no configuration has been provided

Interestingly, the context doesn't get resolved as I'd expect when I use Velero's --kubecontext parameter. I'd expect RawConfig to reflect the override value at the point where I print it, but what's there is not the value explicitly passed in.

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% kubectx admin@kubernetes
Switched to context "admin@kubernetes".

x1c in /home/nrb/go/src/github.com/heptio/velero (git) master U
% velero version --kubecontext context2
kubecontext: admin@kubernetes
An error occurred: invalid configuration: no configuration has been provided
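
One plausible explanation, consistent with the behavior above, is that RawConfig() returns the merged kubeconfig as loaded from disk, while overrides such as CurrentContext are only applied when ClientConfig() resolves the context. A minimal sketch under that assumption ("context2" mirrors the transcript above):

package main

import (
	"fmt"

	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// "context2" is the empty context from the transcript: it has no
	// cluster or user attached.
	overrides := &clientcmd.ConfigOverrides{CurrentContext: "context2"}
	kubeConfig := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
		clientcmd.NewDefaultClientConfigLoadingRules(), overrides)

	// RawConfig returns the merged kubeconfig as written on disk, so it
	// still shows the files' current-context, not the override.
	raw, err := kubeConfig.RawConfig()
	if err != nil {
		panic(err)
	}
	fmt.Println("raw current-context:", raw.CurrentContext)

	// The override is honored here, when the context is resolved; since
	// "context2" is empty, this is also where the error surfaces.
	if _, err := kubeConfig.ClientConfig(); err != nil {
		fmt.Println("resolve error:", err)
	}
}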

So far I'm unable to reproduce this; I likely need to dig deeper into the client-go code to figure out what's going on.

skriss commented 5 years ago

I tested this out and was unable to reproduce. I have a $KUBECONFIG with a value of ~/.kube/aws-config:~/.kube/gke-config, one context in each of those files, and I'm able to switch contexts using kubectx and properly run both kubectl and velero commands against both clusters/contexts.
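
For anyone comparing setups, here is a minimal sketch of how client-go (and hence the velero client) expands a colon-separated $KUBECONFIG; the file paths are illustrative:

package main

import (
	"fmt"

	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// When $KUBECONFIG is set, the default loading rules split it on the
	// OS path-list separator (':' on Linux/macOS) and record the files
	// in order.
	rules := clientcmd.NewDefaultClientConfigLoadingRules()
	fmt.Println("files:", rules.Precedence) // e.g. [~/.kube/aws-config ~/.kube/gke-config]

	// Load merges the files into a single config; earlier files in the
	// list win when the same key appears in more than one file.
	merged, err := rules.Load()
	if err != nil {
		panic(err)
	}
	for name := range merged.Contexts {
		fmt.Println("context:", name)
	}
	fmt.Println("current-context:", merged.CurrentContext)
}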

@switchboardOp wondering if there's something else going on? What's different in your setup vs. what I described?

switchboardOp commented 5 years ago

It looks like I was actually running into a known issue with how Rancher constructs its kubeconfig files for multiple clusters. It was strange, though, because at the time I only had the issue when using velero. Since implementing a workaround I don't have the issue anymore. I don't think there's anything wrong with how velero handles kubeconfig.

Sorry for the confusion.

https://github.com/rancher/rancher/issues/19342