Summary
We need a reliable way to alert on errors in the logs, excluding errors from resource version conflicts, which are transient.
The only approach I can think of is a Splunk alert that matches "error"-level log entries but filters out the resource version conflict errors. (The log I'm looking at now contains this text: "the object has been modified; please apply your changes to the latest version and try again". I'm not sure whether this is always the exact text.)
According to @calvinx408, Splunk has the concept of a "federated alert". We can ask the Splunk team (@avaz in Slack) to create the alert across all of Splunk for our Numaflow AddOn, index "iks".
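To make the intended filter concrete, here is a minimal Python sketch of the matching logic the alert would need. It is not a Splunk configuration. It assumes the logs are JSON lines with `level`, `msg`, and `error` fields (these field names are hypothetical and need to be checked against our actual log format), and it matches only the leading part of the conflict message, since I'm not sure the full text is always identical.

```python
import json

# Assumption: this leading fragment of the conflict message quoted above is stable
# even if the rest of the text varies.
CONFLICT_MARKER = "the object has been modified"


def is_alertable(line: str) -> bool:
    """Return True for error-level entries that are not resource version conflicts.

    The field names "level", "msg", and "error" are hypothetical.
    """
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # lines that are not JSON are ignored in this sketch
    if not isinstance(entry, dict):
        return False
    if str(entry.get("level", "")).lower() != "error":
        return False
    # The conflict text may appear in either the message or the wrapped error.
    text = f"{entry.get('msg', '')} {entry.get('error', '')}"
    return CONFLICT_MARKER not in text
```

Matching on a short substring rather than the full sentence is deliberate: it tolerates small wording differences, at the cost of possibly hiding an unrelated error that happens to contain the same phrase.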
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.