kubernetes / community


Template examples for MachineHealthCheck missing #6957

Closed cablunar closed 1 year ago

cablunar commented 1 year ago

Describe the issue

The cluster-api-provider-aws repository should include self-healing KubeadmControlPlane template examples in the templates folder.

I propose adding the following manifest to cluster-template-machinepool.yaml. It lets a KubeadmControlPlane heal itself if an instance is taken down or otherwise becomes unhealthy. This is crucial information, at least for production scenarios, where a cluster's control plane could be destroyed through the AWS console.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-kcp-unhealthy-5m"
spec:
  clusterName: "${CLUSTER_NAME}"
  maxUnhealthy: 100%
  selector:
    matchLabels:
      cluster.x-k8s.io/control-plane: ""
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```
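
To make the intent of the manifest easier to reason about, here is a rough Python sketch of how I understand the two `unhealthyConditions` entries and `maxUnhealthy` to interact. It is a simplified model, not the Cluster API controller code: the class and function names (`NodeCondition`, `is_unhealthy`, `machines_to_remediate`) are made up for illustration, and the rounding of the percentage is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UnhealthyCondition:
    """One entry of spec.unhealthyConditions in the manifest above."""
    type: str
    status: str
    timeout_s: int


@dataclass
class NodeCondition:
    """Hypothetical snapshot of a node condition and how long it has held."""
    type: str
    status: str
    seconds_in_status: int


# The two rules from the proposed manifest: Ready is Unknown or "False" for 300s.
RULES = [
    UnhealthyCondition("Ready", "Unknown", 300),
    UnhealthyCondition("Ready", "False", 300),
]


def is_unhealthy(conditions: List[NodeCondition],
                 rules: List[UnhealthyCondition] = RULES) -> bool:
    """A machine counts as unhealthy if any rule has matched for longer than its timeout."""
    return any(
        c.type == r.type
        and c.status == r.status
        and c.seconds_in_status > r.timeout_s
        for c in conditions
        for r in rules
    )


def machines_to_remediate(machines: Dict[str, List[NodeCondition]],
                          max_unhealthy_percent: int = 100) -> List[str]:
    """Return the machines to remediate, or nothing if too many are unhealthy.

    Assumption: the percentage is scaled against the number of selected
    machines and rounded down.
    """
    unhealthy = [name for name, conds in machines.items() if is_unhealthy(conds)]
    allowed = len(machines) * max_unhealthy_percent // 100
    return unhealthy if len(unhealthy) <= allowed else []


control_plane = {
    "cp-0": [NodeCondition("Ready", "Unknown", 400)],  # past the 300s timeout
    "cp-1": [NodeCondition("Ready", "True", 9000)],    # healthy
    "cp-2": [NodeCondition("Ready", "False", 120)],    # not yet past the timeout
}
print(machines_to_remediate(control_plane))
```

With `maxUnhealthy: 100%`, remediation is never blocked by the number of unhealthy machines, which is why the template uses it for the case where several control plane instances are removed at once.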
cablunar commented 1 year ago

/sig k8s-sig-cluster-lifecycle

k8s-ci-robot commented 1 year ago

@cablunar: The label(s) sig/k8s-sig-cluster-lifecycle cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes/community/issues/6957#issuecomment-1303536996):

>/sig k8s-sig-cluster-lifecycle

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

cablunar commented 1 year ago

/sig cluster-lifecycle