medik8s / node-healthcheck-operator

K8s Node Health Check Operator
Apache License 2.0

Do not set timeout annotation on already succeeded remediation CR #318

Closed clobrano closed 4 months ago

clobrano commented 5 months ago

Why we need this PR

When a remediator concludes its remediation attempt(s) without failures but the Node is still unhealthy, it still has to set the Succeeded status to True.

In this case, however, the remediation eventually times out and NHC sets the related annotation, which changes the remediator's Succeeded status to False.

Changes made

Before setting the timeout annotation on the remediation CR, NHC now checks that the Succeeded status is not already True.
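The guard can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `Condition`, `RemediationCR`, `markTimedOut`, and the annotation key are hypothetical stand-ins chosen for illustration, and the real operator works with Kubernetes API types rather than these simplified structs.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition
// (hypothetical type, for illustration only).
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// RemediationCR is a hypothetical, stripped-down remediation CR.
type RemediationCR struct {
	Annotations map[string]string
	Conditions  []Condition
}

const (
	succeededType = "Succeeded"
	// timeoutAnnotation is an assumed key, not necessarily the one NHC uses.
	timeoutAnnotation = "example.medik8s.io/timed-out"
)

// isSucceeded reports whether the CR carries a Succeeded condition set to True.
func isSucceeded(cr *RemediationCR) bool {
	for _, c := range cr.Conditions {
		if c.Type == succeededType {
			return c.Status == "True"
		}
	}
	return false
}

// markTimedOut sets the timeout annotation unless the remediation already
// succeeded. It returns true only when the annotation was written.
func markTimedOut(cr *RemediationCR, timestamp string) bool {
	if isSucceeded(cr) {
		// The remediator finished without failures: leave the CR untouched
		// so its Succeeded status is not flipped to False.
		return false
	}
	if cr.Annotations == nil {
		cr.Annotations = map[string]string{}
	}
	cr.Annotations[timeoutAnnotation] = timestamp
	return true
}

func main() {
	succeeded := &RemediationCR{
		Conditions: []Condition{{Type: succeededType, Status: "True"}},
	}
	pending := &RemediationCR{}

	fmt.Println(markTimedOut(succeeded, "2024-01-01T00:00:00Z")) // false
	fmt.Println(markTimedOut(pending, "2024-01-01T00:00:00Z"))   // true
}
```

Returning a boolean lets the caller tell a skipped CR from an annotated one, which is also what a unit test for this behavior would assert on.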

Which issue(s) this PR fixes

https://issues.redhat.com/browse/ECOPROJECT-1881

Test plan

Added a new unit test to verify that if a remediator sets the Succeeded status to True without fixing the node, NHC does not set the timeout annotation.

openshift-ci[bot] commented 5 months ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci[bot] commented 5 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: clobrano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/medik8s/node-healthcheck-operator/blob/main/OWNERS)~~ [clobrano]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

clobrano commented 5 months ago

/test 4.15-openshift-e2e

clobrano commented 5 months ago

/test 4.15-openshift-e2e

clobrano commented 4 months ago

/test 4.15-openshift-e2e

mshitrit commented 4 months ago

/lgtm

clobrano commented 4 months ago

merging manually as tide has some problems at the moment