Open shermanCRL opened 3 years ago
Thinking that the structure might be something like:
ProgressMetric
    Label string // typically units like ‘ranges’ or ‘bytes’
    Numerator float // units completed so far
    Denominator float // total units to be completed (may be an estimate)
    LastUpdated time // last time this metric was updated
    StartTime? time, nullable // when we started executing this phase
...and at the top level, there would be an array of ProgressMetric.
Having this primitive would allow job implementors to do cool stuff.
Treat each ProgressMetric as a distinct phase in a multi-phase job. Maybe the first ProgressMetric is “planning”, another is “writing”, another is “validating”.
Denominator - Numerator / (now - StartTime)
Problem
Currently, the progress metric of a job is a single percentage number, and doesn’t tell the user what measurement contributes to that percentage. “30% of what?” they might ask.
A common user story is observing a job, but struggling to know if it’s slow vs stalled, or which parts have completed, or what “complete” means.
Desired solution
A facility to record & display a numerator and denominator in the progress of a job, with arbitrary labels (i.e. units). For example, “230MB of 500MB complete”.
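A rough Go sketch of how such a label could be rendered for display. Describe is a hypothetical helper, and the exact wording of the string is an assumption; a space is put between the number and the label so that word-like units such as “ranges” read naturally.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatUnits prints a float without trailing zeros or exponent notation,
// e.g. 230 -> "230", 0.5 -> "0.5".
func formatUnits(v float64) string {
	return strconv.FormatFloat(v, 'f', -1, 64)
}

// Describe renders a numerator/denominator pair with its label (units).
func Describe(label string, numerator, denominator float64) string {
	return fmt.Sprintf("%s %s of %s %s complete",
		formatUnits(numerator), label, formatUnits(denominator), label)
}
```

For example, Describe("MB", 230, 500) gives "230 MB of 500 MB complete".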
This is intended for DB Console UI and end-user observability. (Maybe Prometheus too?) It is not intended as a “functional” metric for internal state tracking, but if it turns out useful elsewhere, great.
I have to believe that when we calculate those % completes, we have a numerator and denominator, right? Let’s let the user see them.
(Further ambition: make it a time series so users can observe slowdowns or long tails.)
Complementary idea:
Jira issue: CRDB-8696
Epic CRDB-32144