Closed DharmitD closed 1 year ago
This PR is blocked until https://github.com/red-hat-data-services/odh-deployer/pull/314 and https://gitlab.cee.redhat.com/service/managed-tenants-sops/-/merge_requests/82 are merged. Once they're merged, this PR will be rebased and marked ready for review.
Remember to update the Prometheus init container (wait-for-deployment) so that it waits until the Data Science Pipelines Operator is active; otherwise the alert will fire.
Done, updated the prometheus init container.
The same is needed for blackbox-exporter's init container (I just saw it). Also, if you call curl with the -sS parameter in the init container, the logs are cleaner.
Done, updated to have these changes, and rebased to main.
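For readers following the thread, the init-container wait discussed above can be sketched roughly as below. This is a minimal illustration, not the script from this PR: the function name wait_for_url, the retry count, the delay, and the example URL are all assumptions. It only shows why -sS keeps the logs clean: -s suppresses curl's progress meter, while -S still prints an error message when a request fails.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): poll a health URL until it
# answers successfully or the attempts run out.
#   $1 = URL to probe, $2 = max attempts (default 60), $3 = delay in seconds (default 5)
wait_for_url() {
    url="$1"
    attempts="${2:-60}"
    delay="${3:-5}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: no progress meter; -S: still show errors; -f: fail on HTTP errors (>= 400)
        if curl -sS -f -o /dev/null "$url"; then
            return 0
        fi
        echo "attempt $i/$attempts: $url not ready yet" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example call (the URL is a placeholder, not the operator's real endpoint):
# wait_for_url "http://example-operator.example-namespace.svc:8080/healthz" 60 5
```

The function returns non-zero when the endpoint never becomes ready, so an init container using it would fail rather than let the main container start early.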
Tested the changes: the rules fire info-level alerts.
Process for testing:
Retested with changes:
Waited for all 3 alerts to fire:
Process for testing:
After applying the rules, scale down the service. The alerts then started firing.
/lgtm
/approve
From @jgarciao :
"If you are confident with PR319 and merge it, it will be included in a pre RC build DevOps will create on Monday (so I'll be able to test that with all the changes)" ~ Jorge
/label qe-approved
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: harshad16, HumairAK
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Description
Adding alerting rules for the Data Science Pipelines Operator.
How Has This Been Tested?
Merge criteria:
[UPSTREAM] has been prepended to the commit message.