rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

Operator's log is flooded with ROOK_WATCH_FOR_NODE_FAILURE= messages #13704

Closed: h323 closed this issue 8 months ago

h323 commented 8 months ago

Is this a bug report or feature request?

Bug Report

Deviation from expected behavior:

Operator's log is flooded with messages like this when the logging level is set to "Info":

2024-02-06 09:39:55.305431 I | op-k8sutil: ROOK_WATCH_FOR_NODE_FAILURE="true" (default)

I have Rook installed on a pretty large cluster, and I see hundreds of these messages every hour.

Expected behavior:

ROOK_WATCH_FOR_NODE_FAILURE= messages are not printed to the log when the logging level is set to "Info".

How to reproduce it (minimal and precise):

The message is printed from the k8sutil.GetOperatorSetting function, which is called from the handleNodeFailure function (added in #12286). That function in turn is called from the onK8sNode function, which is triggered whenever any of the controllers receives a node add/update event, so every node event produces another log line (see the sketch below).
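
For context, here is a minimal, hypothetical Go sketch of that call chain. The function names mirror the ones above, but the bodies are assumptions for illustration only, not Rook's actual implementation; the point is that an Info-level log inside the settings lookup produces one line per node event.

```go
// Hypothetical sketch of the call chain described above; not Rook's real code.
package main

import (
	"fmt"
	"os"
)

// getOperatorSetting stands in for k8sutil.GetOperatorSetting: it resolves a
// setting (here from the environment, falling back to a default) and logs the
// resolved value at Info level on every call.
func getOperatorSetting(name, defaultValue string) string {
	if v, ok := os.LookupEnv(name); ok {
		fmt.Printf("I | op-k8sutil: %s=%q\n", name, v)
		return v
	}
	fmt.Printf("I | op-k8sutil: %s=%q (default)\n", name, defaultValue)
	return defaultValue
}

// handleNodeFailure stands in for the handler added in #12286: it re-reads the
// setting each time it runs, so every call emits one log line.
func handleNodeFailure(nodeName string) {
	if getOperatorSetting("ROOK_WATCH_FOR_NODE_FAILURE", "true") != "true" {
		return
	}
	// ... check whether the node is unhealthy and react ...
}

// onK8sNode stands in for the controllers' node add/update callback.
func onK8sNode(nodeName string) {
	handleNodeFailure(nodeName)
}

func main() {
	// On a large cluster, node add/update events arrive constantly, so the
	// Info-level line above repeats hundreds of times per hour.
	for i := 0; i < 3; i++ {
		onK8sNode(fmt.Sprintf("node-%d", i))
	}
}
```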

Environment:

Madhu-1 commented 8 months ago

@h323 This is fixed and backported to 1.12.3 (https://github.com/rook/rook/pull/12679). Can you please check with the Rook 1.12.3 release? cc @subhamkrai

subhamkrai commented 8 months ago

@h323 I see you are using 1.12.3; please upgrade to a newer version to get the fix. Thanks

h323 commented 8 months ago

Upgrading to 1.12.3 fixed the issue, thanks!

dimm0 commented 4 months ago

I see it in v1.12.11. Any way to fix?

travisn commented 4 months ago

@dimm0 This must be coming from here, where the rook-ceph-operator-config ConfigMap is not found. Do you have that ConfigMap? It was added here a number of releases ago. If it is not found, Rook falls back to the operator env vars. It's best to move your operator env vars into this ConfigMap; that should also resolve this logging issue.
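
For example, a minimal sketch of that ConfigMap (the rook-ceph namespace and the two settings shown are assumptions; copy whichever env vars you actually set on the operator Deployment into data):

```yaml
# Hypothetical example: create this ConfigMap in the operator's namespace.
# The namespace and the settings below are placeholders; use your own values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-operator-config
  namespace: rook-ceph
data:
  ROOK_LOG_LEVEL: "INFO"
  ROOK_WATCH_FOR_NODE_FAILURE: "true"
```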