dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

Configuration hook for generating Prometheus alerts in MSTransferor #12103

Open amaltaro opened 1 week ago

amaltaro commented 1 week ago

Impact of the new feature
MSTransferor

Is your feature request related to a problem? Please describe.
MSTransferor can generate alerts for misconfigured workflows, and these are delivered to our email inboxes. Here is one example [1]. We should be able to configure whether these alerts/notifications are created.

Describe the solution you'd like
In addition to checking with the Monitoring team about who is supposed to receive such alerts, we need to implement the following features:

Describe alternatives you've considered
Keep dealing with these unwanted alerts.

Additional context
[1]

Title example: [FIRING:1] ms-transferor: PU misconfiguration error. Workflow: user_TaskChain_Prod_SiteListsTest_v5_240919_144211_3106 (ms-transferor high wmcore)

Labels
alertname = ms-transferor: PU misconfiguration error. Workflow: user_TaskChain_Prod_SiteListsTest_v5_240919_144211_3106
service = ms-transferor
severity = high
tag = wmcore
Annotations
description = Workflow: user_TaskChain_Prod_SiteListsTest_v5_240919_144211_3106 could not proceed due to some PU misconfiguration,so it will be skipped.
hostname = ms-transferor-7dfc9d6ffc-zvqf8
summary = [MSTransferor] Workflow cannot proceed due to some PU misconfiguration.
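
To make the request concrete, here is a minimal sketch of what such a configuration hook could look like. It is not taken from the WMCore code base: the configuration key `enableAlerts`, the function `send_pu_misconfig_alert` and the `send_alert` callable are all hypothetical names chosen for illustration, and the real service may structure this differently. The alert fields mirror the example in [1].

```python
import logging


def send_pu_misconfig_alert(ms_config, workflow_name, send_alert, logger=None):
    """
    Send the PU misconfiguration alert only if alerts are enabled.

    ms_config: dict-like service configuration (hypothetical key "enableAlerts")
    workflow_name: name of the workflow that was skipped
    send_alert: hypothetical callable that delivers the alert, standing in
        for whatever alert client the service actually uses
    Returns True if the alert was sent, False if it was suppressed.
    """
    logger = logger or logging.getLogger(__name__)

    # Assumption: default to True so that behaviour is unchanged unless
    # the operator explicitly disables alerts in the configuration.
    if not ms_config.get("enableAlerts", True):
        logger.info("Alerts disabled; not notifying about workflow %s", workflow_name)
        return False

    # Field values copied from the alert example in [1]
    alert_name = f"ms-transferor: PU misconfiguration error. Workflow: {workflow_name}"
    summary = "[MSTransferor] Workflow cannot proceed due to some PU misconfiguration."
    description = (f"Workflow: {workflow_name} could not proceed due to some "
                   "PU misconfiguration,so it will be skipped.")
    send_alert(alert_name, severity="high", summary=summary,
               description=description, service="ms-transferor", tag="wmcore")
    return True
```

With `{"enableAlerts": False}` in the service configuration, the workflow would still be skipped and logged, but no alert would reach the inboxes. Defaulting the flag to enabled is a design choice made here so that existing deployments keep their current behaviour.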