Problem of aligning boundary strips with real traffic

Hi @floppy84, thank you for reporting this issue. The delay is a byproduct of using a z-score for anomaly detection. The mid-line that serves as the reference for drawing the baselines is computed with a moving average, which smooths the metric but also introduces a delay. In a way, this is by design: sudden changes fall outside the baselines and are detected as anomalies before the baselines have a chance to catch up (the algorithm is tuned for detecting short-term anomalies).
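To make the lag concrete, here is a minimal Python sketch of z-score bands around a trailing moving average. It is not the framework's actual recording rules: the function name `z_score_bands`, the sample-count window, and the threshold of 2 standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def z_score_bands(series, window, threshold=2.0):
    """Return (mid, lower, upper) for each point, using a trailing window.

    Hypothetical helper: the window is counted in samples, not in time,
    and the same window is used for the mean and the standard deviation.
    """
    bands = []
    for i in range(len(series)):
        recent = series[max(0, i - window + 1): i + 1]
        mid = mean(recent)        # moving average: the mid-line
        spread = pstdev(recent)   # standard deviation over the same window
        bands.append((mid, mid - threshold * spread, mid + threshold * spread))
    return bands


# Synthetic traffic with a step change at index 20.
traffic = [10.0] * 20 + [20.0] * 20
bands = z_score_bands(traffic, window=10)

for i in (19, 20, 25, 29):
    mid, lower, upper = bands[i]
    print(i, traffic[i], round(mid, 2), round(lower, 2), round(upper, 2))
```

In this toy series the step at index 20 lands above the upper band because the mid-line has barely moved, and the mid-line only reaches the new level once the whole window has filled with post-step samples.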

If this is not acceptable in your case, you could tune the time window of the moving average (1h by default). A larger window makes the mid-line react more slowly to trend changes (which makes the algorithm more sensitive), while a shorter window makes the mid-line track your metric more closely, at the cost of reduced sensitivity and a noisier mid-line.
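The window trade-off can be sketched by counting how many samples the mid-line needs to catch up after a step change. The helper names, the 5% tolerance, and the sample-count windows below are hypothetical and only meant to show the direction of the effect.

```python
def moving_average(series, window):
    """Trailing moving average over the last `window` samples."""
    averaged = []
    for i in range(len(series)):
        recent = series[max(0, i - window + 1): i + 1]
        averaged.append(sum(recent) / len(recent))
    return averaged


def samples_to_catch_up(series, window, step_at, tolerance=0.05):
    """Samples after `step_at` until the mid-line is within `tolerance` of the metric."""
    mid = moving_average(series, window)
    for i in range(step_at, len(series)):
        if abs(mid[i] - series[i]) <= tolerance * abs(series[i]):
            return i - step_at
    return None  # the mid-line never caught up within the series


# Synthetic traffic with a step change at index 60.
traffic = [10.0] * 60 + [20.0] * 60
lag_short = samples_to_catch_up(traffic, window=5, step_at=60)
lag_long = samples_to_catch_up(traffic, window=30, step_at=60)
print("short window lag:", lag_short, "long window lag:", lag_long)
```

The shorter window catches up in a handful of samples, while the longer one stays behind for most of its length. That is the period during which the new level would keep being reported as anomalous.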

Another option would be to replace the moving average with an exponential moving average, which gives more weight to recent data points, but it could be computationally expensive and we have not explored this path (you could probably approximate an exponential moving average using the built-in `holt_winters` function in Prometheus).
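For reference, a simple exponential moving average can be sketched as the recursion below. This is an illustration of the idea only, not something the framework implements: the function name and the `alpha` value are assumptions. Note that `holt_winters` performs double exponential smoothing, which also tracks a trend component, so it would not match this single-smoothing recursion exactly.

```python
def exponential_moving_average(series, alpha):
    """s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with the first sample.

    A larger alpha gives more weight to recent points, so the mid-line
    reacts faster but smooths less.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in series:
        if not smoothed:
            smoothed.append(x)  # seed with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Synthetic traffic with a step change at index 5.
traffic = [10.0] * 5 + [20.0] * 5
mid_line = exponential_moving_average(traffic, alpha=0.5)
print(mid_line)
```

Each new sample closes a fixed fraction of the remaining gap, so the mid-line starts moving toward the new level immediately instead of waiting for old samples to leave a window.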

I hope this helps. Please let us know if you find a solution that works for you!

grafana/promql-anomaly-detection, issue #5