Closed tardieu closed 1 month ago
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: astefanutti
The `codeflare-operator` log is littered with update conflict errors such as:
2024-07-24T13:06:33Z ERROR Reconciler error {"controller": "AppWrapper", "controllerGroup": "workload.codeflare.dev", "controllerKind": "AppWrapper", "AppWrapper": {"name":"kevin1-team-hw","namespace":"kevin1-team"}, "namespace": "kevin1-team", "name": "kevin1-team-hw", "reconcileID": "b6e57167-a357-4c67-85d1-f455e2b57ab6", "error": "Operation cannot be fulfilled on appwrappers.workload.codeflare.dev \"kevin1-team-hw\": the object has been modified; please apply your changes to the latest version and try again"}
These update conflicts result from trying to update stale Kubernetes object revisions in etcd when multiple reconcilers (or users) are concurrently working on cached copies of these objects. These conflicts are harmless. They are handled by retrying the reconciliation loop, refreshing the cached object, and updating or patching the more recent revision. This process is entirely handled by the controller runtime, but it involves returning the conflict error to the controller runtime to trigger these retries. Unfortunately, as a result the controller runtime unconditionally logs these harmless conflicts as errors, which confuses users.
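To make the conflict-and-retry cycle concrete, here is a minimal, self-contained sketch of optimistic concurrency. It does not use the Kubernetes client libraries: `Store`, `Object`, `ErrConflict`, `Reconcile`, and `ReconcileWithRetry` are hypothetical names that stand in for the API server, a resource, the conflict error, and the reconcile loop with its requeue.

```go
package main

import (
	"errors"
	"sync"
)

// ErrConflict stands in for the Kubernetes update conflict error (hypothetical name).
var ErrConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// Object is a toy resource carrying a revision counter, loosely modeled on resourceVersion.
type Object struct {
	ResourceVersion int
	Phase           string
}

// Store is a toy stand-in for the API server backed by etcd.
type Store struct {
	mu  sync.Mutex
	obj Object
}

// Get returns a copy of the latest stored revision.
func (s *Store) Get() Object {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.obj
}

// Update succeeds only if the caller's copy is based on the latest revision.
func (s *Store) Update(o Object) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if o.ResourceVersion != s.obj.ResourceVersion {
		return ErrConflict
	}
	o.ResourceVersion++
	s.obj = o
	return nil
}

// Reconcile mutates the given (possibly stale) copy and tries to write it back.
func Reconcile(s *Store, cached Object, phase string) error {
	cached.Phase = phase
	return s.Update(cached)
}

// ReconcileWithRetry mimics the requeue: on conflict it refreshes the copy and
// tries again. It returns the number of attempts made and the last error.
func ReconcileWithRetry(s *Store, cached Object, phase string, maxAttempts int) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = Reconcile(s, cached, phase); !errors.Is(err, ErrConflict) {
			return attempt, err
		}
		cached = s.Get() // refresh the stale copy before retrying
	}
	return maxAttempts, err
}
```

In this sketch, a writer holding a stale copy gets `ErrConflict` on its first attempt and succeeds on the second, after refreshing. The conflict is part of normal operation rather than a failure.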
This PR therefore wraps the controller runtime logger with a filter that downgrades these log messages from `ERROR` to `DEBUG` messages, more accurately matching the gravity of the event.
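The filtering idea can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `Sink` is a simplified, hypothetical two-method interface rather than the real logging sink, `errConflict` is a stand-in for the Kubernetes conflict error (with the real libraries, `apierrors.IsConflict` from `k8s.io/apimachinery` would likely be the check), and the debug verbosity of 1 is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for a Kubernetes update conflict (hypothetical sentinel).
var errConflict = errors.New("the object has been modified")

// Sink is a simplified, hypothetical subset of a structured logging sink.
type Sink interface {
	Info(level int, msg string, keysAndValues ...any)
	Error(err error, msg string, keysAndValues ...any)
}

// conflictFilter wraps a Sink and downgrades conflict errors to debug messages.
type conflictFilter struct {
	next       Sink
	debugLevel int // verbosity used for downgraded messages (assumed to be 1)
}

// Info passes informational messages through unchanged.
func (f conflictFilter) Info(level int, msg string, kv ...any) {
	f.next.Info(level, msg, kv...)
}

// Error re-emits conflicts at debug verbosity and forwards everything else.
func (f conflictFilter) Error(err error, msg string, kv ...any) {
	if errors.Is(err, errConflict) {
		// Copy kv so the caller's slice is never modified, then keep the
		// error text as an ordinary key/value pair.
		fields := append(append([]any{}, kv...), "error", err.Error())
		f.next.Info(f.debugLevel, msg, fields...)
		return
	}
	f.next.Error(err, msg, kv...)
}

// recorder is a toy sink that records one formatted line per message.
type recorder struct{ lines []string }

func (r *recorder) Info(level int, msg string, kv ...any) {
	r.lines = append(r.lines, fmt.Sprintf("INFO(v=%d) %s", level, msg))
}

func (r *recorder) Error(err error, msg string, kv ...any) {
	r.lines = append(r.lines, fmt.Sprintf("ERROR %s: %v", msg, err))
}

// demo logs one wrapped conflict and one unrelated error through the filter
// and returns the lines the underlying sink received.
func demo() []string {
	r := &recorder{}
	f := conflictFilter{next: r, debugLevel: 1}
	f.Error(fmt.Errorf("update failed: %w", errConflict), "Reconciler error")
	f.Error(errors.New("boom"), "Reconciler error")
	return r.lines
}
```

Only errors that match the conflict predicate are downgraded. Any other reconciler error still reaches the underlying sink as an error, so genuine failures remain visible.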