kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

Simplify external patch definition in ClusterClass #8210

Open killianmuldoon opened 1 year ago

killianmuldoon commented 1 year ago

When a user wants to include an external patch in a ClusterClass, they have to specify the following in the ClusterClass .spec:

  patches:
  - name: test-patch
    external:
      generateExtension: generate-patches.k8s-upgrade-with-runtimesdk
      validateExtension: validate-topology.k8s-upgrade-with-runtimesdk
      discoverVariablesExtension: discover-variables.k8s-upgrade-with-runtimesdk

This issue is about considering the following improvements to the API:

  1. Change generateExtension and validateExtension to generatePatchExtension and validatePatchExtension respectively to make them consistent with discoverVariablesExtension

  2. Allow users to specify an external patch using a single field. This would result in something like:

    patches:
    - name: test-patch
      external:
        extension: k8s-upgrade-with-runtimesdk

    Where the three runtime hooks - generateExtension, validateExtension and discoverVariablesExtension - can be inferred from the extension through the Runtime SDK discovery process.
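For context on how that inference could work: Runtime SDK discovery already records every handler an extension serves in the ExtensionConfig status, so the three handler names could be looked up there. A rough sketch of the discovered handlers for the extension above (service details and field values are illustrative):

apiVersion: runtime.cluster.x-k8s.io/v1alpha1
kind: ExtensionConfig
metadata:
  name: k8s-upgrade-with-runtimesdk
spec:
  clientConfig:
    service:
      name: k8s-upgrade-with-runtimesdk-webhook-service # illustrative
      namespace: extension-system
status:
  handlers: # populated by discovery
  - name: generate-patches.k8s-upgrade-with-runtimesdk
    requestHook:
      apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
      hook: GeneratePatches
  - name: validate-topology.k8s-upgrade-with-runtimesdk
    requestHook:
      apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
      hook: ValidateTopology
  - name: discover-variables.k8s-upgrade-with-runtimesdk
    requestHook:
      apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
      hook: DiscoverVariables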

/area topology
/area runtime-sdk
/kind api-change

sbueringer commented 1 year ago

/triage accepted

adam-jian-zhang commented 1 year ago

@killianmuldoon, in which release will this feature be implemented, 1.4 or a later one? After this change, will only the short version be supported, or both? This will impact design decisions for the next version of TKG.

I am considering putting multiple hooks in the same ExtensionConfig, with a different endpoint for each supported Kubernetes/TKR version, something like:

apiVersion: runtime.cluster.x-k8s.io/v1alpha1
kind: ExtensionConfig
metadata:
  name: tkg-runtimesdk-2.3
spec:
  clientConfig:
    caBundle: <...>
    service:
      name: tkg-extension-2.3-webhook-service
      namespace: extension-system
      port: 443
  settings:
    defaultAllHandlersToBlocking: "true"
# For CC-<2.3, 1.25>
patches:
- name: tkg-patch
  external:
    generateExtension: generate-patches-1.25.tkg-runtimesdk-2.3 # version number in the hook definition
    validateExtension: validate-topology-1.25.tkg-runtimesdk-2.3
    discoverVariablesExtension: discover-variables-1.25.tkg-runtimesdk-2.3
# For CC-<2.3, 1.26>
patches:
- name: tkg-patch
  external:
    generateExtension: generate-patches-1.26.tkg-runtimesdk-2.3 # version number in the hook definition
    validateExtension: validate-topology-1.26.tkg-runtimesdk-2.3
    discoverVariablesExtension: discover-variables-1.26.tkg-runtimesdk-2.3

What do you think? Or should I organize the different Kubernetes/TKR versions into different services?

killianmuldoon commented 1 year ago

@killianmuldoon, in which release will this feature be implemented, 1.4 or a later one? After this change, will only the short version be supported, or both? This will impact design decisions for the next version of TKG.

This won't be part of v1.4. If this were implemented, I think both the short and the long versions would be valid; there's probably a use case for keeping the longer config.

I am considering putting multiple hooks in the same ExtensionConfig, with a different endpoint for each supported Kubernetes/TKR version

An alternative could be to have one ExtensionConfig for each version - but each config can still be backed by the same extension component. I don't think there's specific guidance on which way to do this yet.

Side note: generate-patches-1.26 is an invalid name as the endpoint can't have a . in it
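To make that per-version alternative concrete, a minimal sketch (the ExtensionConfig and Service names are illustrative assumptions; both configs point at the same backing Service, so one extension deployment serves both):

apiVersion: runtime.cluster.x-k8s.io/v1alpha1
kind: ExtensionConfig
metadata:
  name: tkg-runtimesdk-2-3-k8s-1-25 # one ExtensionConfig per Kubernetes version
spec:
  clientConfig:
    service:
      name: tkg-extension-webhook-service # same backing Service for both configs
      namespace: extension-system
      port: 443
  settings:
    kubernetesVersion: "1.25" # illustrative: lets the shared extension tell the configs apart
---
apiVersion: runtime.cluster.x-k8s.io/v1alpha1
kind: ExtensionConfig
metadata:
  name: tkg-runtimesdk-2-3-k8s-1-26
spec:
  clientConfig:
    service:
      name: tkg-extension-webhook-service
      namespace: extension-system
      port: 443
  settings:
    kubernetesVersion: "1.26"

Each ClusterClass would then reference handlers such as generate-patches.tkg-runtimesdk-2-3-k8s-1-25, keeping the version in the ExtensionConfig name rather than in the handler name (which, per the side note above, cannot contain a '.').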

sbueringer commented 1 year ago

You could also just have one single Runtime Extension with one single ExtensionConfig. You can then pass in things like versions / ClusterClass "type" via patches[].external.settings. The Runtime Extension can then internally have different code paths to handle various versions / types / ... .

This might be a lot simpler than deploying a lot of Runtime Extensions.
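A minimal sketch of that single-extension approach, assuming a hypothetical tkg-runtimesdk ExtensionConfig and treating the settings keys as illustrative (the extension defines and interprets them internally):

patches:
- name: tkg-patch
  external:
    generateExtension: generate-patches.tkg-runtimesdk
    validateExtension: validate-topology.tkg-runtimesdk
    discoverVariablesExtension: discover-variables.tkg-runtimesdk
    settings: # forwarded to the extension with every hook request
      kubernetesVersion: "1.26" # illustrative key; the extension branches on it internally
      classType: "tkg"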

(cc @fabriziopandini)

fabriziopandini commented 1 year ago

I would advise against tying a ClusterClass and the corresponding Runtime Extension to a specific Kubernetes version; this seems too limiting (see e.g. the ClusterClass we are using for E2E tests).

I also agree with @sbueringer's comment, because reducing the number of RuntimeExtensions reduces the operational complexity, and the matrix of variants you have to manage can be easily addressed by a combination of settings and some high-level select statements in the RuntimeExtension implementation.

Another important advantage of this model is that you can test migration from one Kubernetes/CC version to another, check for regressions, rollouts, etc.; this would instead be tricky when working with many separate RuntimeExtensions.

fabriziopandini commented 6 months ago

/priority backlog