The FluentdQueueLength alarm measures rate(fluentd_status_buffer_queue_length[5m]) and fires a warning if the value is >0.3 or a critical alert if it is >0.5.
Although this is an adequate alert in most situations, there are cases where it is not the best indicator, particularly when bursts of logging quickly increase the queue size (and when the queue itself is larger). In addition, because it is computed over a 5-minute window, the alert can lag significantly behind the actual problem.
A more precise indicator could take the absolute queue size into account and trigger alerts when the queue approaches its limit and the risk of losing messages is high.
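
As a rough illustration of the idea, an absolute-size alert could look something like the Prometheus rule below. This is only a sketch: the alert name FluentdQueueLengthNearLimit, the queue limit of 32, the 80% threshold, and the 1m windows are all assumptions made for the example, and would need to be replaced with the values that match the actual Fluentd buffer configuration.

```yaml
groups:
  - name: fluentd-queue   # hypothetical group name
    rules:
      - alert: FluentdQueueLengthNearLimit   # hypothetical alert name
        # Assumption: 32 stands in for the configured buffer queue limit.
        # max_over_time looks at the highest queue length seen in the last
        # minute, so a short burst is not averaged away.
        expr: max_over_time(fluentd_status_buffer_queue_length[1m]) > 0.8 * 32
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Fluentd buffer queue is approaching its limit"
```

A second rule with a higher threshold (for example 0.95 of the limit) and severity critical could sit next to it, mirroring the warning/critical split of the existing alarm.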