Our dashboards currently report indexing delay using the
index_queue_age_seconds metric, which tracks the duration indexing jobs wait
in the queue before being "popped" for indexing. For dot com, we regularly see
the median for this metric at 2 days. However, this metric may be misleading
because it includes "noop" indexing jobs where the index is already up to date.
These jobs can stay in the queue for a long time since they are not high
priority.

This PR adds a new metric, index_indexing_delay_seconds, which reports the
duration from when an indexing job enters the queue to when it completes. It
uses 'state' as a label, so by setting state='success' we can (roughly) see
the delay from when Zoekt learns about a new version to when that version is
available to search.