nutanix-cloud-native / cluster-api-provider-nutanix

Kubernetes-native declarative infrastructure provider for Nutanix AHV
https://opendocs.nutanix.com/capx/latest/getting_started/
Apache License 2.0

CAPX clusterclass support #344

Closed deepakm-ntnx closed 10 months ago

deepakm-ntnx commented 11 months ago

What this PR does / why we need it: This PR adds ClusterClass support to CAPX. More in-depth details on the need for ClusterClass can be found at https://cluster-api.sigs.k8s.io/tasks/experimental-features/cluster-class/

The go/v3 kubebuilder scaffolding has been kept as is to reduce code churn.

Done

Dev Steps:

make docker-build
make deploy
make test-cc-cluster-create
make list-cc-cluster-resources
make test-cc-cluster-install-cni
make test-cc-cluster-delete
make test-e2e LABEL_FILTERS=clusterclass
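
The dev steps above can be chained so that a failing target stops the sequence. This is a minimal sketch, assuming the targets are meant to run in the listed order. `run_dev_steps` and the `MAKE_CMD` override are hypothetical names introduced here for illustration; they are not part of the repository.

```shell
# run_dev_steps is a hypothetical helper, not part of the repository.
# MAKE_CMD is also an assumption: it defaults to make and can be set to
# echo for a dry run that only prints the targets.
run_dev_steps() {
  make_cmd="${MAKE_CMD:-make}"
  # Targets copied from the Dev Steps list, in the same order.
  for step in docker-build deploy test-cc-cluster-create \
      list-cc-cluster-resources test-cc-cluster-install-cni \
      test-cc-cluster-delete; do
    echo "==> ${step}"
    # Stop at the first failing target.
    "${make_cmd}" "${step}" || return 1
  done
  echo "==> test-e2e LABEL_FILTERS=clusterclass"
  "${make_cmd}" test-e2e LABEL_FILTERS=clusterclass
}
```

Calling `MAKE_CMD=echo run_dev_steps` prints the sequence without building anything; calling `run_dev_steps` from the repository root runs the real targets.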

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #

How Has This Been Tested?: Currently it is tested manually using the dev steps above. The local e2e run passed:

make test-e2e-calico LABEL_FILTERS=clusterclass
...
When testing ClusterClass changes [ClusterClass] Should successfully rollout the managed topology upon changes to the ClusterClass [clusterclass, slow, network]
/Users/deepak.muley/go/pkg/mod/sigs.k8s.io/cluster-api/test@v1.3.5/e2e/clusterclass_changes.go:132
  STEP: Creating a namespace for hosting the "clusterclass-changes" test spec @ 01/05/24 12:04:31.426
  INFO: Creating namespace clusterclass-changes-qey1b9
  INFO: Creating event watcher for namespace "clusterclass-changes-qey1b9"
  STEP: Creating a workload cluster @ 01/05/24 12:04:31.451
  INFO: Creating the workload cluster with name "clusterclass-changes-u2v3sb" using the "topology" template (Kubernetes v1.28.4, 1 control-plane machines, 1 worker machines)
  INFO: Getting the cluster template yaml
  INFO: clusterctl config cluster clusterclass-changes-u2v3sb --infrastructure (default) --kubernetes-version v1.28.4 --control-plane-machine-count 1 --worker-machine-count 1 --flavor topology
  INFO: Applying the cluster template yaml to the cluster
configmap/clusterclass-changes-u2v3sb-pc-trusted-ca-bundle created
configmap/nutanix-ccm created
secret/clusterclass-changes-u2v3sb created
secret/nutanix-ccm-secret created
clusterresourceset.addons.cluster.x-k8s.io/nutanix-ccm-crs created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/clusterclass-changes-u2v3sb-kcfg-0 created
clusterclass.cluster.x-k8s.io/e2e created
kubeadmcontrolplanetemplate.controlplane.cluster.x-k8s.io/e2e-kcpt created
nutanixclustertemplate.infrastructure.cluster.x-k8s.io/e2e-nct created
nutanixmachinetemplate.infrastructure.cluster.x-k8s.io/e2e-cp-nmt created
nutanixmachinetemplate.infrastructure.cluster.x-k8s.io/e2e-md-nmt created
configmap/cni-clusterclass-changes-u2v3sb-crs-cni created
clusterresourceset.addons.cluster.x-k8s.io/clusterclass-changes-u2v3sb-crs-cni created
cluster.cluster.x-k8s.io/clusterclass-changes-u2v3sb created

  INFO: Waiting for the cluster infrastructure to be provisioned
  STEP: Waiting for cluster to enter the provisioned phase @ 01/05/24 12:04:34.228
  INFO: Waiting for control plane to be initialized
  INFO: Waiting for the first control plane machine managed by clusterclass-changes-qey1b9/clusterclass-changes-u2v3sb-rgb5c to be provisioned
  STEP: Waiting for one control plane node to exist @ 01/05/24 12:04:44.287
  INFO: Waiting for control plane to be ready
  INFO: Waiting for control plane clusterclass-changes-qey1b9/clusterclass-changes-u2v3sb-rgb5c to be ready (implies underlying nodes to be ready as well)
  STEP: Waiting for the control plane to be ready @ 01/05/24 12:05:54.374
  STEP: Checking all the control plane machines are in the expected failure domains @ 01/05/24 12:05:54.379
  INFO: Waiting for the machine deployments to be provisioned
  STEP: Waiting for the workload nodes to exist @ 01/05/24 12:05:54.401
  STEP: Checking all the machines controlled by clusterclass-changes-u2v3sb-md-0-k6z6w are in the "" failure domain @ 01/05/24 12:06:14.436
  INFO: Waiting for the machine pools to be provisioned
  STEP: Modifying the control plane configuration in ClusterClass and wait for changes to be applied to the control plane object @ 01/05/24 12:06:14.478
  INFO: Modifying the ControlPlaneTemplate of ClusterClass clusterclass-changes-qey1b9/e2e
  INFO: Waiting for ControlPlane rollout to complete.
  STEP: Modifying the MachineDeployment configuration in ClusterClass and wait for changes to be applied to the MachineDeployment objects @ 01/05/24 12:06:24.575
  INFO: Modifying the BootstrapConfigTemplate of MachineDeploymentClass "e2e-worker" of ClusterClass clusterclass-changes-qey1b9/e2e
  INFO: Waiting for MachineDeployment rollout for MachineDeploymentClass "e2e-worker" to complete.
  INFO: Waiting for MachineDeployment rollout for MachineDeploymentTopology "md-0" (class "e2e-worker") to complete.
  STEP: Rebasing the Cluster to a ClusterClass with a modified label for MachineDeployments and wait for changes to be applied to the MachineDeployment objects @ 01/05/24 12:06:34.667
  INFO: Waiting for MachineDeployment rollout to complete.
  INFO: Waiting for MachineDeployment rollout for MachineDeploymentTopology "md-0" (class "e2e-worker") to complete.
  STEP: Deleting a MachineDeploymentTopology in the Cluster Topology and wait for associated MachineDeployment to be deleted @ 01/05/24 12:06:44.977
  INFO: Removing MachineDeploymentTopology from the Cluster Topology.
  INFO: Waiting for MachineDeployment to be deleted.
  STEP: PASSED! @ 01/05/24 12:06:55.066
  STEP: Dumping logs from the "clusterclass-changes-u2v3sb" workload cluster @ 01/05/24 12:06:55.066
Failed to get logs for Machine clusterclass-changes-u2v3sb-rgb5c-5sb62, Cluster clusterclass-changes-qey1b9/clusterclass-changes-u2v3sb: error creating container exec: Error response from daemon: No such container: clusterclass-changes-u2v3sb-rgb5c-5sb62
  STEP: Dumping all the Cluster API resources in the "clusterclass-changes-qey1b9" namespace @ 01/05/24 12:06:55.142
  STEP: Deleting cluster clusterclass-changes-qey1b9/clusterclass-changes-u2v3sb @ 01/05/24 12:06:55.386
  STEP: Deleting cluster clusterclass-changes-u2v3sb @ 01/05/24 12:06:55.429
  INFO: Waiting for the Cluster clusterclass-changes-qey1b9/clusterclass-changes-u2v3sb to be deleted
  STEP: Waiting for cluster clusterclass-changes-u2v3sb to be deleted @ 01/05/24 12:06:55.479
  STEP: Deleting namespace used for hosting the "clusterclass-changes" test spec @ 01/05/24 12:07:15.635
  INFO: Deleting namespace clusterclass-changes-qey1b9
• [164.231 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[SynchronizedAfterSuite] 
/Users/deepak.muley/go/src/github.com/deepakm-ntnx/cluster-api-provider-nutanix/test/e2e/e2e_suite_test.go:113
  STEP: Dumping logs from the bootstrap cluster @ 01/05/24 12:07:15.659
Failed to get logs for the bootstrap cluster node test-6vt5e4-control-plane: exit status 1
  STEP: Tearing down the management cluster @ 01/05/24 12:07:16.138
[SynchronizedAfterSuite] PASSED [1.757 seconds]
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.003 seconds]
------------------------------

Ran 1 of 34 Specs in 313.076 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 33 Skipped

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

codecov[bot] commented 11 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (ba2b194) 15.14% compared to head (31a01eb) 15.21%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #344      +/-   ##
==========================================
+ Coverage   15.14%   15.21%   +0.07%
==========================================
  Files          17       18       +1
  Lines        1208     1209       +1
==========================================
+ Hits          183      184       +1
  Misses       1025     1025
```
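
The percentages in the report follow from the Hits and Lines figures: coverage is hits divided by lines. The sketch below recomputes them with integer arithmetic. `coverage_pct` is a hypothetical helper, and truncating to two decimals is an assumption inferred from the numbers (rounding 183/1208 to the nearest hundredth would give 15.15%, not the reported 15.14%).

```shell
# coverage_pct is a hypothetical helper: hits * 100 / lines, truncated
# (not rounded) to two decimals. Truncation is an assumption inferred
# from the figures in the report.
coverage_pct() {
  hits="$1"
  lines="$2"
  scaled=$(( hits * 10000 / lines ))   # percentage scaled by 100
  printf '%d.%02d\n' $(( scaled / 100 )) $(( scaled % 100 ))
}

coverage_pct 183 1208   # base: prints 15.14
coverage_pct 184 1209   # head: prints 15.21
```

The difference between the two truncated values is the +0.07% shown in the report.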


deepakm-ntnx commented 11 months ago

Please note the TODO list in the description. While adding e2e tests, we discovered that the YAML in the templates folder needs to be reorganized with respect to bases; otherwise it causes issues when generating the e2e templates. I am working on this:

Base:
- Cluster without topology
- KubeadmControlPlane
- KubeadmConfigTemplate
- NutanixCluster
- NutanixMachineTemplate
- MachineDeployment
- ConfigMap
- Secret

Overlay: Cluster class
- Template patches
    - Add ClusterClass
    - Add KubeadmControlPlaneTemplate
    - Update Cluster with topology
    - Kustomize
        - Include everything from base
        - Drop NutanixCluster
        - Drop MachineDeployment
        - Drop KubeadmControlPlane

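
One way the overlay outlined above could be expressed is a `kustomization.yaml` that pulls in the base, adds the ClusterClass objects, and deletes the three kinds with `$patch: delete`. This is only a sketch under assumptions: `write_cc_overlay`, the `../base` path, and the file names `clusterclass.yaml`, `kcpt.yaml`, and `cluster-with-topology.yaml` are hypothetical and not taken from the repository, and the layout has not been checked against the PR's actual templates.

```shell
# write_cc_overlay is a hypothetical helper. The directory layout and file
# names below are assumptions for illustration, not the repository's own.
write_cc_overlay() {
  dir="${1:-.}"
  mkdir -p "${dir}" || return 1
  # The quoted 'EOF' keeps the shell from expanding "$patch".
  cat > "${dir}/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base               # include everything from base
  - clusterclass.yaml     # add ClusterClass
  - kcpt.yaml             # add KubeadmControlPlaneTemplate
patches:
  - path: cluster-with-topology.yaml   # update Cluster with topology
  - target:
      kind: NutanixCluster
    patch: |
      $patch: delete
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: NutanixCluster
      metadata:
        name: unused
  - target:
      kind: MachineDeployment
    patch: |
      $patch: delete
      apiVersion: cluster.x-k8s.io/v1beta1
      kind: MachineDeployment
      metadata:
        name: unused
  - target:
      kind: KubeadmControlPlane
    patch: |
      $patch: delete
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlane
      metadata:
        name: unused
EOF
}
```

The three kinds are dropped because, with a managed topology, those objects are generated from the ClusterClass templates rather than declared directly in the cluster template.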
jimmidyson commented 11 months ago

Oops sorry @deepakm-ntnx I didn't see the TODO about the clusterclass variables and patches and commented 🫣

dlipovetsky commented 11 months ago

For anyone exercising this in a shared PC: Your cluster name must be unique. Here, it's determined by the TEST_CLUSTER_NAME make variable, e.g., make test-cc-cluster-create TEST_CLUSTER_NAME=<unique name>.
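
A unique name can be generated rather than typed by hand. The sketch below is one possible scheme, not something the repository provides: `unique_cluster_name` is a hypothetical helper, and combining a prefix, the sanitized user name, the epoch time, and the shell PID is an assumption that makes collisions between users and between runs unlikely.

```shell
# unique_cluster_name is a hypothetical helper; the naming scheme is an
# assumption. Output uses only lowercase letters, digits and "-".
unique_cluster_name() {
  prefix="${1:-cc}"
  # Lowercase the user name and replace anything outside [a-z0-9] with "-".
  user="$(id -un | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9\n' '-')"
  # Epoch seconds plus the shell PID make a clash with another run unlikely.
  printf '%s-%s-%s-%s\n' "${prefix}" "${user}" "$(date +%s)" "$$"
}
```

It can then be passed to the make variable mentioned above, for example `make test-cc-cluster-create TEST_CLUSTER_NAME="$(unique_cluster_name)"`. Reuse the same value for the later targets so they act on the same cluster.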

thunderboltsid commented 11 months ago

/retest

thunderboltsid commented 11 months ago

/ok-to-test

deepakm-ntnx commented 11 months ago

/test e2e-ncn-1-calico-k8s-v1.26.1

deepakm-ntnx commented 11 months ago

/retest

deepakm-ntnx commented 11 months ago

/test e2e-ncn-1-calico-k8s-v.1.27-api-upgrade

deepakm-ntnx commented 11 months ago

/retest

deepakm-ntnx commented 10 months ago

/test e2e-ncn-1-calico-k8s-v.1.27-api-upgrade

thunderboltsid commented 10 months ago

/test e2e-k8s-upgrade
/test e2e-capx-conformance
/test e2e-capx-controller-upgrade

thunderboltsid commented 10 months ago

/test e2e-k8s-upgrade
/test e2e-capx-controller-upgrade

thunderboltsid commented 10 months ago

/test e2e-k8s-upgrade
/test e2e-capx-controller-upgrade

thunderboltsid commented 10 months ago

/lgtm
/approve

thunderboltsid commented 10 months ago

/unhold

thunderboltsid commented 10 months ago

/retest

nutanix-cn-prow-bot[bot] commented 10 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adiantum, deepakm-ntnx, dkoshkin, thunderboltsid

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/nutanix-cloud-native/cluster-api-provider-nutanix/blob/main/OWNERS)~~ [adiantum,deepakm-ntnx,thunderboltsid]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment