GoogleCloudPlatform / cloud-builders

Builder images and examples commonly used for Google Cloud Build
https://cloud.google.com/cloud-build/
Apache License 2.0

[BUG] kubectl builder finds wrong context when executed in parallel #990

Closed joekim-0 closed 4 months ago

joekim-0 commented 7 months ago

I have a cloudbuild.yaml file that deploys services onto multiple clusters. However, deployments have been getting sent to the wrong cluster, and while debugging I ran into some unexpected behavior. In my testing, if two build steps execute kubectl config get-contexts at the same time while pointing at different clusters, one of them reports the wrong context.

Affected builder image

gcr.io/cloud-builders/kubectl

Expected Behavior

The correct kubectl context is reported, and further actions executed by the builder take place in the indicated context

Actual Behavior

When both kubectl config get-contexts commands are run at roughly the same time against different clusters, both steps indicate the same context.

Steps to Reproduce the Problem

  1. Create a cloudbuild.yaml file as follows:

    steps:
    - id: 'Get context #1'
      name: 'gcr.io/cloud-builders/kubectl'
      waitFor: [ '-' ]
      args: [ 'config', 'get-contexts' ]
      env:
        - CLOUDSDK_COMPUTE_ZONE=<zone 1>
        - CLOUDSDK_CONTAINER_CLUSTER=<cluster 1>

    - id: 'Get context #2'
      name: 'gcr.io/cloud-builders/kubectl'
      waitFor: [ '-' ]
      args: [ 'config', 'get-contexts' ]
      env:
        - CLOUDSDK_COMPUTE_ZONE=<zone 2>
        - CLOUDSDK_CONTAINER_CLUSTER=<cluster 2>
  2. Execute the build

  3. Observe the Cloud Build logs. One of step 0 or step 1 will report the wrong context, with logs similar to:

    Already have image (with digest): gcr.io/cloud-builders/kubectl
    Running: gcloud container clusters get-credentials --project=<project> --zone=<zone 1> <cluster 1>
    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for <cluster 1>.
    Running: kubectl config get-contexts
    CURRENT   NAME          CLUSTER       AUTHINFO      NAMESPACE
    *         <cluster 2>   <cluster 2>   <cluster 2>

Which context actually wins appears random to me, but in several runs of this sample build I have never observed both contexts reported correctly.
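The symptom above is consistent with a last-writer-wins race on a single shared file. As a minimal sketch outside Cloud Build (the temp file here is only a stand-in for the shared kubeconfig, and the backgrounded subshells stand in for the two parallel build steps):

```shell
# Sketch of the suspected race, assuming both steps write one shared kubeconfig.
# Each backgrounded subshell plays the role of one build step's credential fetch.
shared_kubeconfig=$(mktemp)
(echo 'current-context: cluster-1' > "$shared_kubeconfig") &   # step 0
(echo 'current-context: cluster-2' > "$shared_kubeconfig") &   # step 1
wait
# Both steps then read the SAME file, so both see whichever write landed last.
winner=$(cat "$shared_kubeconfig")
echo "$winner"
rm -f "$shared_kubeconfig"
```

Run repeatedly, the printed context flips between the two clusters, but both "steps" always agree on it, matching the behavior in the logs above.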

haroonc commented 4 months ago

This builder uses gcloud container clusters get-credentials to fetch cluster credentials, which are written to $HOME/.kube/config by default. Since all Cloud Build steps share the same volume for the HOME directory, you can't target different clusters within the same build without overriding the kubeconfig location. For your use case, set the KUBECONFIG environment variable on each step that uses the kubectl builder to isolate each cluster's auth config, e.g.:

steps:
  - id: 'Get context #1'
    name: 'gcr.io/cloud-builders/kubectl'
    waitFor: [ '-' ]
    args: [ 'config', 'get-contexts' ]
    env:
      - CLOUDSDK_COMPUTE_ZONE=<zone 1>
      - CLOUDSDK_CONTAINER_CLUSTER=<cluster 1>
      - KUBECONFIG=/tmp/cluster_1

  - id: 'Get context #2'
    name: 'gcr.io/cloud-builders/kubectl'
    waitFor: [ '-' ]
    args: [ 'config', 'get-contexts' ]
    env:
      - CLOUDSDK_COMPUTE_ZONE=<zone 2>
      - CLOUDSDK_CONTAINER_CLUSTER=<cluster 2>
      - KUBECONFIG=/tmp/cluster_2
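With per-step kubeconfig files in place, the actual deploy steps would then reuse the same KUBECONFIG path as the step that fetched that cluster's credentials. A sketch (the manifest name `deployment.yaml` here is just a placeholder):

```yaml
  - id: 'Deploy to cluster #1'
    name: 'gcr.io/cloud-builders/kubectl'
    args: [ 'apply', '-f', 'deployment.yaml' ]   # hypothetical manifest
    env:
      - CLOUDSDK_COMPUTE_ZONE=<zone 1>
      - CLOUDSDK_CONTAINER_CLUSTER=<cluster 1>
      - KUBECONFIG=/tmp/cluster_1   # same file as the step that got the credentials
```

Because each KUBECONFIG file holds exactly one cluster's context, parallel steps no longer race on a shared $HOME/.kube/config.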