We want to create a fully automated test that validates the Prometheus alert rules, i.e., when a PR is opened, the test runs and checks whether the changes would cause any alerts to fire.
The alert rules are defined in the odh-deployer Prometheus ConfigMap, while components such as the CodeFlare Operator, MCAD, and others live in their own separate repos, which makes it harder to run these tests against all components.
Questions that come to mind:
Should this test be included in the e2e tests or run on its own?
Should the test be run on each component repo, including the odh-deployer repo?
How would the test check whether changes in one repo would cause alerts (defined in the odh-deployer repo) to fire?
Suggestions
Robot Framework: a generic automation framework for acceptance testing and RPA. Test automation for RHODS already exists, and ods-ci has some tests that verify the existing alerts.
PromTool: tooling for the Prometheus monitoring system. It can check the rules for syntax errors and run unit tests against them. If we ran those unit tests as part of the e2e tests, would that be enough to verify that the alerts work as expected?
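To make the PromTool suggestion more concrete, here is a minimal sketch of what a rule unit test could look like when driven from a script. The alert name, expression, labels, and file names are all hypothetical and are not taken from the odh-deployer ConfigMap; only the `promtool check rules` and `promtool test rules` subcommands and the unit test file layout come from Prometheus itself. The script skips the run if promtool is not on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical alert rule, standing in for a rule extracted from the ConfigMap.
RULES_YAML = """\
groups:
  - name: example
    rules:
      - alert: ExampleTargetDown
        expr: up{job="example"} == 0
        for: 2m
        labels:
          severity: critical
"""

# promtool unit test: feed a synthetic `up` series that stays at 0 and
# expect the alert to be firing at the 5 minute mark.
TESTS_YAML = """\
rule_files:
  - rules.yml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      - series: 'up{job="example"}'
        values: '0 0 0 0 0 0'
    alert_rule_test:
      - eval_time: 5m
        alertname: ExampleTargetDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: example
"""


def promtool_commands(rules_path, tests_path):
    """Return the syntax check and unit test commands, in the order to run them."""
    return [
        ["promtool", "check", "rules", str(rules_path)],
        ["promtool", "test", "rules", str(tests_path)],
    ]


def run_alert_rule_tests(workdir):
    """Write the example files into workdir and run promtool against them.

    Returns True or False for pass or fail, or None if promtool is not installed.
    """
    workdir = Path(workdir)
    rules_path = workdir / "rules.yml"
    tests_path = workdir / "rules_test.yml"
    rules_path.write_text(RULES_YAML)
    tests_path.write_text(TESTS_YAML)
    if shutil.which("promtool") is None:
        return None  # nothing to run the checks with, so skip
    return all(
        subprocess.run(cmd, cwd=workdir).returncode == 0
        for cmd in promtool_commands(rules_path, tests_path)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = run_alert_rule_tests(tmp)
        print("skipped" if result is None else ("passed" if result else "failed"))
```

Note that a test like this only exercises the rule expressions against synthetic series. It would catch a broken or mis-specified rule in a PR, but it would not by itself tell us whether a change in a component repo alters the metrics the rules depend on.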