We should figure out some loose parameters for which anomaly monitor configurations are closest to "right" and standardize accordingly. If we leave teams to do this on their own, they'll probably do something simple, get annoyed by false positives, tune the threshold down until it's meaningless, and then miss actual traffic anomalies.
[ ] Capture some data over time about which anomaly monitors are generating false positives (and, by exclusion, which ones are not); in other words, which monitors are triggering on what is pretty clearly noise/ordinary variation (see the sketch after this list).
[ ] Based on the above, decide on a baseline anomaly monitor pattern to use, and give teams guidance on when they might want to deviate (e.g. pick a different threshold or a different time window).
[ ] Optionally, go back and proactively standardize the existing anomaly monitors.
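A rough idea of what the capture step might look like, as a minimal sketch: it assumes we export triaged trigger events to a CSV, and the file name (`triggers.csv`), column names (`monitor`, `label`), and labels (`noise`/`real`) are all hypothetical, not an existing export.

```python
# Minimal sketch for the data-capture step. Assumes triaged monitor
# trigger events land in a CSV with hypothetical columns:
#   monitor  -- monitor name
#   label    -- "noise" (ordinary variation) or "real" (actual anomaly)
import csv
from collections import Counter, defaultdict

def false_positive_rates(path: str) -> dict[str, float]:
    """Per-monitor fraction of triggers triaged as noise."""
    triggers = defaultdict(Counter)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            triggers[row["monitor"]][row["label"]] += 1
    return {
        monitor: counts["noise"] / max(1, sum(counts.values()))
        for monitor, counts in triggers.items()
    }

if __name__ == "__main__":
    # Noisiest monitors first: these are the candidates to re-tune.
    rates = false_positive_rates("triggers.csv")
    for monitor, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        print(f"{rate:6.1%}  {monitor}")
```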
For context, teams have already settled on a variety of approaches to traffic anomaly monitors:
- an `anomalies` function which takes into account weekly periodicity of traffic
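To make the periodicity point concrete, here is a toy sketch (not any team's actual monitor; it assumes hourly traffic counts with history aligned to a week boundary) of a check that compares each value against the same hour-of-week baseline rather than a flat threshold:

```python
# Toy illustration of why weekly periodicity matters: judge each point
# against the history of the same hour-of-week slot, not a single
# global threshold. Not any team's actual monitor.
from statistics import mean, stdev

def is_anomalous(history: list[float], value: float,
                 hour_of_week: int, threshold: float = 3.0) -> bool:
    """Flag `value` if it deviates from its hour-of-week baseline.

    `history` is hourly request counts, oldest first, assumed to start
    on a week boundary (168 hours per week). `threshold` is in standard
    deviations -- the knob the baseline guidance would standardize.
    """
    # All prior observations for this hour-of-week slot.
    baseline = history[hour_of_week::168]
    if len(baseline) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) > threshold * sigma
```

The tradeoff this avoids: a flat threshold loose enough to survive the weekly peak is too loose to catch an anomaly during a quiet hour, which is exactly the tune-it-down-until-it's-meaningless failure mode described above.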