This commit introduces more OPE status alerts in the 'rhods-notebooks' namespace within the 'nerc-ocp-prod' cluster. The alerts are designed to fire at 6am and are sent to the Slack channel alerts-prod-rhods-ope. They cover ephemeral storage, memory usage, PVC claims, storage requests, container counts, and pod owner counts.
Changes made:
Rules:
Added alerts that report, at 6am, the percentage of the limit used for ephemeral storage, memory, PVCs, and storage requests.
Added alerts that count containers and pod owners at 6am, providing a snapshot of resource utilization.
All of these rules fire on a time trigger rather than a threshold, providing daily insight into resource usage patterns before classes start.
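For reviewers unfamiliar with time-triggered rules, here is a minimal sketch of what one such rule could look like. It is not copied from this commit: the rule name, group name, metric choice, and the five-minute window are assumptions made for illustration. Note that PromQL's `hour()` and `minute()` work in UTC, so the literal `6` would need adjusting if "6am" means local time.

```yaml
# Hypothetical sketch of a time-gated PrometheusRule; names and metrics are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-6am-ope-example        # hypothetical resource name
  namespace: rhods-notebooks
spec:
  groups:
    - name: custom-6am-ope-example    # hypothetical group name
      rules:
        # The Custom6amOpe prefix lets the alert match ^Custom6amOpe.*
        - alert: Custom6amOpeMemoryLimitUsedExample
          expr: |
            (
              sum(container_memory_working_set_bytes{namespace="rhods-notebooks", container!=""})
              /
              sum(kube_pod_container_resource_limits{namespace="rhods-notebooks", resource="memory"})
              * 100
            )
            # Time gate: only return a result during 06:00-06:04 UTC.
            and on() (hour() == 6 and minute() < 5)
          labels:
            severity: info
          annotations:
            summary: "Memory limit used at 6am: {{ $value | humanize }}%"
```

The `and on()` clause is what turns an always-true expression into a once-a-day snapshot: outside the window the right-hand side is empty, so the alert has nothing to fire on.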
Configuration:
Routes the new alerts to the 'slack-notifications-prod-rhods-ope' receiver.
Matches alerts against ^Custom6amOpe.* so that all of the new rules listed under Rules above are caught.
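The routing described above can be sketched roughly as follows. This is an illustrative Alertmanager fragment, not the configuration from this commit: it assumes the match is on the `alertname` label and that the Slack webhook URL is supplied elsewhere (for example through the global `slack_api_url` setting).

```yaml
# Hypothetical sketch of the Alertmanager routing; layout and label choice are assumptions.
route:
  routes:
    - receiver: slack-notifications-prod-rhods-ope
      matchers:
        # Alertmanager regex matchers are fully anchored, so the leading ^ is optional.
        - alertname =~ "^Custom6amOpe.*"

receivers:
  - name: slack-notifications-prod-rhods-ope
    slack_configs:
      # Webhook URL intentionally omitted; assumed to be configured globally.
      - channel: "#alerts-prod-rhods-ope"
```

Matching on a shared name prefix means any later rule named Custom6amOpe... is routed to the same channel without touching this configuration again.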
Looks good. Are you addressing alerts for the timeouts we observed recently elsewhere, or do we not have sufficient info to create an alert on that yet?