kubernetes-sigs / kubebuilder-declarative-pattern

A toolkit for building declarative operators with kubebuilder
Apache License 2.0
251 stars 84 forks source link

Track info.Err in the status field #389

Open haiyanmeng opened 1 month ago

haiyanmeng commented 1 month ago

This allows the reconcilation errors such as errors encountered in the object transformers to be tracked in the field status.errors.

Currently, if errors happens during the execution of an object tranformer, the status field is set to:

status:
  healthy: false
  phase: InternalError

With this CL, the reconciliation errors are tracked in the field status.errors:

status:
  errors:
  - 'error building deployment objects: a test error'
  healthy: false
  phase: InternalError
k8s-ci-robot commented 1 month ago

Hi @haiyanmeng. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
yuwenma commented 1 month ago

two questions:

  1. I think we normally use the conditions for detailed error information.
  2. If tracking an object error, the errors should reflect the current status rather than a error history. Thus, I think we shall have a way to clean up stale errors.
haiyanmeng commented 1 month ago
  1. I think we normally use the conditions for detailed error information.

How does the status.errors field being used today? Is it always empty?

  1. If tracking an object error, the errors should reflect the current status rather than a error history. Thus, I think we shall have a way to clean up stale errors.

Great point. Fixed it.

tomasaschan commented 1 month ago

I agree this would be better to add to conditions rather than errors.

Not quite sure how errors is used today, though, but I feel like whatever it is used for should probably, eventually, be ported over to be exposed in conditions instead. Of course, that should not block you from exposing this info on the resource status, but we might want to be a little strict about adding more things to errors if we already know we might want to migrate away from it...

tomasaschan commented 1 month ago

Fwiw, this relates to #190.

haiyanmeng commented 1 month ago

Not quite sure how errors is used today, though, but I feel like whatever it is used for should probably, eventually, be ported over to be exposed in conditions instead.

Do we plan to deprecate the fields healthy and phase too?

type CommonStatus struct {
    Healthy bool     `json:"healthy"`
    Errors  []string `json:"errors,omitempty"`
    Phase   string   `json:"phase,omitempty"`
}
tomasaschan commented 1 month ago

I don't know if the project has actually made any decisions in this direction, but I think we should 😸

I base this mostly on the Kubernetes API Conventions guide, where the following passage can be found:

Some resources in the v1 API contain fields called phase, and associated message, reason, and other status fields. The pattern of using phase is deprecated. Newer API types should use conditions instead.

But I do think this change makes the behavior better than it currently is, so I'm a little on the fence on whether to just say "that's a project for another day" and go with this as it is...

k8s-ci-robot commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haiyanmeng, justinsb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern/blob/master/OWNERS)~~ [justinsb] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
justinsb commented 1 month ago

Sorry, missed a lot of the context here when I approved this in VSCode!

I agree that we want to move to status.conditions. I think that will (likely?) have to be opt-in at the controller level; likely new controllers will stop using the "old" status fields altogether and existing controllers will support both (?). I think if we do that it can be something we can implement relatively quickly, at least for the new case.

I think adding this to the "old" status fields (healthy/phase/errors) makes sense in the interim / for controllers that choose not to support the "new" status fields (conditions)

justinsb commented 1 month ago

/test pull-declarative-test

(In the hope that it's just a flake .. we should still fix flakes, but we don't need to fix it in this PR)

haiyanmeng commented 1 month ago

/test pull-declarative-test

k8s-ci-robot commented 1 month ago

@haiyanmeng: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to [this](https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern/pull/389#issuecomment-2155701068): >/test pull-declarative-test Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
haiyanmeng commented 1 month ago

Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

@justinsb , how can I become a trusted user? This would allow me to trigger tests.

tomasaschan commented 3 weeks ago

/ok-to-test

k8s-ci-robot commented 3 weeks ago

New changes are detected. LGTM label has been removed.

k8s-ci-robot commented 3 weeks ago

@haiyanmeng: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-declarative-test 1715db6ae9fa38b46cdfa9069c0b8edb48c9e642 link true /test pull-declarative-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).