bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

`cargo make test`: TestSys not creating vsphere credential K8s secret when it should #2658

Closed etungsten closed 1 year ago

etungsten commented 1 year ago

What I expected to happen: After exporting all the GOVC_* variables as described in https://github.com/bottlerocket-os/bottlerocket/blob/develop/TESTING.md#vmware-k8s, I expected cargo make test for the vmware-k8s variant to succeed without my having to manually create the vSphere credentials secret in the TestSys cluster before running the tests.

What actually happened:

cargo make test creates the vsphere-k8s-cluster-resource-agent, but the agent is missing the vSphere credentials, so it fails:

Checking logs:

bash-4.2$ kubectl --kubeconfig "${TESTSYS_KUBECONFIG}" logs -f x86-64-vmware-kef8e9be5-316d-4077-b68f-c42174a14397-creatibqnqk -n testsys
[2022-12-13T21:51:04Z INFO  resource_agent::agent] Initializing Agent
[2022-12-13T21:51:04Z ERROR resource_agent::agent] Resource action failed: Provider error: An error occurred but no resources were left behind, Unable to fetch vSphere credentials
[2022-12-13T21:51:04Z INFO  resource_agent::agent] 'keep_running' is true.

Checking the resource object spec, notice that the Secrets field is empty:

bash-4.2$ kubectl --kubeconfig "${TESTSYS_KUBECONFIG}" describe resources/x86-64-vmware-k8s-124 -n testsys
...
Spec:
  Agent:
    Capabilities:
    Configuration:
...
      Labels:
        testsys/arch:                  x86_64
        testsys/cluster:               x86-64-vmware-k8s-124
        testsys/controlPlaneEndpoint:  198.18.16.134
        testsys/type:                  cluster
        testsys/variant:               vmware-k8s-1.24
...
      Name:                            x86-64-vmware-k8s-124
      Ova Name:                        bottlerocket-vmware-k8s-1.24-x86_64-v1.12.0.ova
      Privileged:                      true
      Secrets:
...
      Vcenter Datacenter:       SDDC-Datacenter
      Vcenter Datastore:        WorkloadDatastore
      Vcenter Host URL:         ...
      Vcenter Network:          ...
      Vcenter Resource Pool:    ...
      Vcenter Workload Folder:  ...
      Version:                  v1.24
    Image:                      public.ecr.aws/bottlerocket-test-system/vsphere-k8s-cluster-resource-agent:v0.0.3
    Keep Running:               true
    Name:                       agent
    Privileged:                 true
    Pull Secret:                <nil>
    Secrets:
    Timeout:  <nil>
  Conflicts With:
  Depends On:
  Destruction Policy:  onTestSuccess

I also tried not exporting the GOVC_* variables and instead specifying only the vmware datacenter config in Infra.toml, and the same thing happens.

How to reproduce the problem:

Either export the GOVC_* variables or specify vmware.datacenter in Infra.toml, then run:

    cargo make \
      -e TESTSYS_TEST=migration \
      -e BUILDSYS_VARIANT="vmware-k8s-1.24" \
      -e BUILDSYS_ARCH="x86_64" \
      -e PUBLISH_INFRA_CONFIG_PATH="Infra.migration.toml" \
      -e TESTSYS_TEST_CONFIG_PATH="Test.toml" \
      test \
      --mgmt-cluster-kubeconfig="mgmt-cluster-eks-a-cluster.kubeconfig"

etungsten commented 1 year ago

I think the problem is that we only pull the vmware.datacenter settings with pubsys-config::DatacenterBuilder: https://github.com/bottlerocket-os/bottlerocket/blob/f3dfa53daca4b17a0de4e2ac7bd94c4195ab5e60/tools/testsys/src/run.rs#L243-L256

We should also be using pubsys-config::DatacenterCredsBuilder to fetch the vSphere credentials: https://github.com/bottlerocket-os/bottlerocket/blob/f3dfa53daca4b17a0de4e2ac7bd94c4195ab5e60/tools/pubsys-config/src/vmware.rs#L151-L158
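
To make the credential lookup concrete, here is a minimal sketch of resolving a username and password from the GOVC_USERNAME and GOVC_PASSWORD environment variables, falling back to values loaded from a file. It is not the pubsys-config API: VsphereCreds and resolve_creds are hypothetical names, and the precedence shown (environment over file) is an assumption made for illustration.

```rust
/// Hypothetical stand-in for the vSphere credentials TestSys needs.
/// This is NOT the type exposed by pubsys-config.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VsphereCreds {
    pub username: String,
    pub password: String,
}

/// Resolve credentials, preferring environment values over file values
/// (assumed precedence). `env` is any lookup from variable name to value,
/// so the logic can be exercised without touching the real environment.
pub fn resolve_creds<F>(
    env: F,
    from_file: Option<VsphereCreds>,
) -> Result<VsphereCreds, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Each field falls back independently to the file-provided value.
    let username = env("GOVC_USERNAME")
        .or_else(|| from_file.as_ref().map(|c| c.username.clone()));
    let password = env("GOVC_PASSWORD")
        .or_else(|| from_file.as_ref().map(|c| c.password.clone()));

    match (username, password) {
        (Some(username), Some(password)) => Ok(VsphereCreds { username, password }),
        (None, _) => Err("missing vSphere username (GOVC_USERNAME)".to_string()),
        (_, None) => Err("missing vSphere password (GOVC_PASSWORD)".to_string()),
    }
}
```

With the real process environment, the call would look like resolve_creds(|k| std::env::var(k).ok(), None). Returning an error when either value is missing would surface the problem before the resource agent starts, rather than in the agent's logs.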

In addition, when we create the vsphere-k8s-cluster custom resource, we pass the secrets like so: https://github.com/bottlerocket-os/bottlerocket/blob/f3dfa53daca4b17a0de4e2ac7bd94c4195ab5e60/tools/testsys/src/vmware_k8s.rs#L139, which pulls that value from Test.toml. I think that if the secrets field is not set in Test.toml, TestSys should create the secret itself and pass its name there.
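
The proposed fallback can be sketched as follows, assuming a small abstraction over secret creation. Everything here is hypothetical and for illustration only: the SecretStore trait, the in-memory store, the ensure_creds_secret function, the default secret name, and the "username"/"password" keys are not taken from TestSys.

```rust
use std::collections::BTreeMap;

/// Hypothetical abstraction over "create a K8s secret in the TestSys cluster".
pub trait SecretStore {
    fn create_secret(
        &mut self,
        name: &str,
        data: BTreeMap<String, String>,
    ) -> Result<(), String>;
}

/// In-memory store, used here so the logic can run without a cluster.
#[derive(Default)]
pub struct InMemoryStore {
    pub secrets: BTreeMap<String, BTreeMap<String, String>>,
}

impl SecretStore for InMemoryStore {
    fn create_secret(
        &mut self,
        name: &str,
        data: BTreeMap<String, String>,
    ) -> Result<(), String> {
        self.secrets.insert(name.to_string(), data);
        Ok(())
    }
}

/// Hypothetical default name for an automatically created secret.
pub const DEFAULT_SECRET_NAME: &str = "vsphere-creds";

/// Return the secret name to put in the resource spec.
/// If Test.toml already names a secret, use it untouched; otherwise
/// create one from the given credentials and return its name.
pub fn ensure_creds_secret<S: SecretStore>(
    store: &mut S,
    configured: Option<&str>,
    username: &str,
    password: &str,
) -> Result<String, String> {
    if let Some(name) = configured {
        // The user supplied a secret name; assume it already exists.
        return Ok(name.to_string());
    }
    let mut data = BTreeMap::new();
    data.insert("username".to_string(), username.to_string());
    data.insert("password".to_string(), password.to_string());
    store.create_secret(DEFAULT_SECRET_NAME, data)?;
    Ok(DEFAULT_SECRET_NAME.to_string())
}
```

Keeping the user-supplied name as the first branch preserves the current behavior for anyone who already sets the secrets field in Test.toml; only the empty case changes.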

ecpullen commented 1 year ago

Currently, a user is expected to create the vSphere credentials secret beforehand and reference it in Test.toml. I can have testsys create the secret automatically.