Closed by ElijahLynn 3 years ago
OK, @indytechcook and I discussed this and agreed that we can use "CMS Engineers Non-Critical". I started making this change in Ansible. We will also need a new "Escalation Policy" in PagerDuty, but I can't configure it because no teams show up in the drop-down (I can create one, though).
DEV/STAGING already have a separate config block, which we can change in ansible/deployment/config/prometheus/rules/cms.rules. I don't yet see how to do this, though.
ALERT SiteReachableNonCritical
  IF script_success{script=~"cms-login-page-(dev|staging)"} == 0
  FOR 5m
  LABELS { project="cms", severity="page", scope="application", check="{{ $labels.script }}" }
  ANNOTATIONS {
    summary = "CMS login page not reachable from vets.gov utility",
    description = "The monitor probe to check {{ $labels.script }} failed from the vets.gov utility network. There may be an issue loading content from Drupal for website builds. See https://github.com/department-of-veterans-affairs/va.gov-team-sensitive/blob/master/OnCall/alerts.md#sitereachablecritical"
  }
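For anyone reading the rule above: Prometheus regex matchers (`=~`) are fully anchored, so this block only fires for the dev and staging login-page scripts. Here is a small Python sketch that mimics that behavior. The `rule_matches` helper and the `cms-login-page-prod` label value are assumptions for illustration only; Prometheus itself uses RE2 rather than Python's `re`, though the two agree on a pattern this simple.

```python
import re

# Pattern copied verbatim from the rule's `script=~"..."` matcher.
SCRIPT_PATTERN = r"cms-login-page-(dev|staging)"


def rule_matches(script_label: str) -> bool:
    """Mimic Prometheus's fully anchored regex match on the `script` label."""
    return re.fullmatch(SCRIPT_PATTERN, script_label) is not None


# Hypothetical label values; only the dev/staging pattern comes from the rule.
for label in ("cms-login-page-dev", "cms-login-page-staging", "cms-login-page-prod"):
    print(label, rule_matches(label))
```

If that reading is right, a production script label would never be picked up by this block, so whatever handles PROD must live in a different rule.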
To state the actual challenge more clearly:
Currently, our newly deployed deployment code activates a maintenance window for all environments, all of which use "CMS Engineers Critical". Only PROD should do that. DEV and STAGING should instead activate a maintenance window in "CMS Engineers Non-Critical".
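A rough sketch of the mapping we want, written in Python only to make the intent concrete. This is not the actual deployment code: the `maintenance_target` function and the lowercase environment names are assumptions, and only the two quoted PagerDuty names come from this thread.

```python
# PagerDuty names as quoted in this thread.
CRITICAL = "CMS Engineers Critical"
NON_CRITICAL = "CMS Engineers Non-Critical"


def maintenance_target(environment: str) -> str:
    """Return where a deploy should open its maintenance window.

    Hypothetical helper: PROD maps to the critical target, DEV and
    STAGING map to the non-critical one, anything else is rejected.
    """
    env = environment.strip().lower()
    if env == "prod":
        return CRITICAL
    if env in ("dev", "staging"):
        return NON_CRITICAL
    raise ValueError(f"unknown environment: {environment!r}")


for env in ("dev", "staging", "prod"):
    print(env, "->", maintenance_target(env))
```

Rejecting unknown environments, rather than defaulting to one of the two targets, keeps a typo from silently opening a maintenance window in the critical one.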