kubernetes-sigs / kubebuilder-declarative-pattern

A toolkit for building declarative operators with kubebuilder
Apache License 2.0
252 stars 84 forks source link

Better `status.conditions` #308

Closed yuwenma closed 1 year ago

yuwenma commented 1 year ago

This PR is based on https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern/pull/272 which needs to go in first.

Description

This PR adds a status.conditions according to the API convention "typical status properties" and the kstatus abnormal-true conditions

The conditions logic is described as below:

Examples

Example for case "All manifests are reconciled"

kind: MyCR
...
status:
  conditions:
  - lastTransitionTime: "2023-02-17T03:56:20Z"
    message: all manifests are reconciled
    reason: Normal
    status: "False"
    type: Success
  healthy: true
  phase: Current

Example for case "some manifests are InProgress or Terminating"

kind: MyCR
status:
  conditions:
  - lastTransitionTime: "2023-02-17T03:51:28Z"
    message: |
         apps/v1, Kind=Deployment/argocd/argocd-repo-server:Deployment does not have minimum availability.
         apps/v1, Kind=Deployment/argocd/argocd-server:Deployment does not have minimum availability.
    reason: ContainAbnormalTrueConditions
    status: "True"
    type: Reconciling
  healthy: false
  phase: InProgress

Example for case "known error (AppliedFailed)"

status:
  conditions:
  - lastTransitionTime: "2023-02-17T05:07:05Z"
    message: |
      apps/v1, Kind=StatefulSet/argocd/argocd-application-controller:Ready: 0/1
    reason: ContainAbnormalTrueConditions
    status: "True"
    type: Reconciling
  healthy: false
  phase: Applying

Alternatives

Because conditions themselves should not be state machines, we don't really need to capture all manifests' conditions in one. But we can have each manifest's present abnormal-true condition to represent a declarativeObject condition, as shown below.

kind: MyCR
status:
  conditions:
  # deployment A
    - lastTransitionTime: "2023-02-17T05:23:40Z"
       message: Deployment does not have minimum availability.
       reason: MinimumReplicasUnavailable
       status: "False"
       type: Available
   # deployment B
    - lastTransitionTime: "2023-02-17T05:13:38Z"
       message: Deployment does not have minimum availability.
       reason: MinimumReplicasUnavailable
       status: "False"
       type: Available
  healthy: false
  phase: InProgress

The downside is that the declarativeObject itself's condition then can no longer be computed or augmented by kstatus correctly (it can no longer map the legacy type/status to k8s resources since the resource now is MyCR).

yuwenma commented 1 year ago

cc @justinsb

yuwenma commented 1 year ago

Updates:

Per offline discussion with @justinsb and kstatus author @mortent, this PR is simplified to reflect the current status as a single condition with type "ready" and status true/false. The status true means all the manifests are reconciled, and false means the manifests are in progress (or in bad situation but k-d-p cannot determine whether it can be self-healed and reconciled later ).

Example

Example for case "some manifests are InProgress or Terminating"

kind: MyCR
...
status:
  conditions:
  - lastTransitionTime: "2023-02-23T09:07:38Z"
    message: |
      apps/v1, Kind=StatefulSet/mycr/mycr-application-controller:Ready: 0/1
    reason: AbnormalTrueConditions
    status: "False"
    type: Ready
  healthy: false
  phase: InProgress

Example for case "All manifests are reconciled"

kind: MyCR
...
status:
  conditions:
  - lastTransitionTime: "2023-02-23T09:17:57Z"
    message: all manifests are reconciled.
    reason: Normal
    status: "True"
    type: Ready
  healthy: false
  phase: Current
justinsb commented 1 year ago

Some nits, looks really good. I think / hope if you rebase the tests will be much less flaky

justinsb commented 1 year ago

Two small things that we should fix, but this looks good and I am happy for you to fix those things in a separate PR if that is easier. So

/approve /lgtm

/hold

You should cancel the hold yourself if you agree and want to address the issues in a follow-on PR, but if you want to push a fix into this PR that works too.

yuwenma commented 1 year ago

/hold cancel

justinsb commented 1 year ago

/approve /lgtm

k8s-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, yuwenma

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern/blob/master/OWNERS)~~ [justinsb] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment