Open kevin-v-ngo opened 2 years ago
FYI @dongniwang
We should probably have a similar chart on the fingerprint details page but will defer to you. This is on serverless metrics today:
The new solution should consider the behaviour addressed in #99070
The remaining task on this issue is to improve the sampling rate, since percentiles are currently calculated only for statements that were detected by Insights, meaning 100ms or 50ms.
We've received feedback asking us to surface not only the average latency but also the Max, P99, P90, P50, and Min latencies for a given fingerprint in each aggregation interval. We surface the standard deviation, but the user reported that it is an indirect way to detect outliers.
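To make the request concrete, here is a minimal sketch of the per-fingerprint, per-interval summary being asked for. It is illustrative only and not how the statement statistics are actually computed: the input shape `(fingerprint_id, unix_ts, latency_ms)`, the function names `percentile` and `summarize`, the one-hour default interval, and the nearest-rank percentile definition are all assumptions.

```python
from collections import defaultdict


def percentile(sorted_vals, p):
    """Nearest-rank percentile of an ascending list; p is an integer in 1..100."""
    n = len(sorted_vals)
    rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
    return sorted_vals[rank - 1]


def summarize(executions, interval_s=3600):
    """Summarize latencies per (fingerprint, aggregation interval).

    executions: iterable of (fingerprint_id, unix_ts, latency_ms) tuples
    (a hypothetical input shape chosen for this sketch).
    """
    buckets = defaultdict(list)
    for fingerprint_id, ts, latency_ms in executions:
        interval_start = ts - ts % interval_s
        buckets[(fingerprint_id, interval_start)].append(latency_ms)

    summary = {}
    for key, latencies in buckets.items():
        latencies.sort()
        summary[key] = {
            "min": latencies[0],
            "p50": percentile(latencies, 50),
            "p90": percentile(latencies, 90),
            "p99": percentile(latencies, 99),
            "max": latencies[-1],
        }
    return summary


# 100 executions of one fingerprint with latencies 1..100 ms in one interval.
sample = [("fp-a", ts, ts + 1) for ts in range(100)]
stats = summarize(sample)[("fp-a", 0)]
```

Unlike the standard deviation, these values point directly at the tail: a single slow execution shows up in Max (and possibly P99) without the reader having to infer it from spread.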
Ideally they'd like to be able to view our P99 latency time-series metrics (or any other P90 metric), go to the statements overview page for that time period (with persisted stats), sort by P99 latency, and identify the statement fingerprint to troubleshoot.
From there, they'd be able to view fingerprint details (execution statistics, unique plans, contention information, outlier execution details, etc.)
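The "sort by P99 latency" step in the workflow above could be sketched roughly as follows. This assumes per-interval rows already carry a P99 value; the field names `fingerprint_id`, `interval_start`, and `p99_ms` and the function `top_by_p99` are hypothetical. Because percentiles from separate intervals cannot be merged exactly, the sketch ranks each fingerprint by its worst per-interval P99 within the selected period, which is an approximation chosen for illustration.

```python
def top_by_p99(rows, start, end, limit=10):
    """Rank fingerprints by their worst per-interval P99 within [start, end).

    rows: iterable of dicts with the hypothetical keys
    'fingerprint_id', 'interval_start' (unix seconds), and 'p99_ms'.
    """
    worst = {}
    for row in rows:
        if start <= row["interval_start"] < end:
            fp = row["fingerprint_id"]
            worst[fp] = max(worst.get(fp, 0.0), row["p99_ms"])
    # Highest P99 first, so the likeliest troubleshooting target is on top.
    ranked = sorted(worst.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


rows = [
    {"fingerprint_id": "fp-a", "interval_start": 0, "p99_ms": 12.0},
    {"fingerprint_id": "fp-b", "interval_start": 0, "p99_ms": 480.0},
    {"fingerprint_id": "fp-a", "interval_start": 3600, "p99_ms": 95.0},
    {"fingerprint_id": "fp-c", "interval_start": 7200, "p99_ms": 900.0},
]
ranked = top_by_p99(rows, start=0, end=7200)
```

Here `fp-c` is excluded because its interval falls outside the selected period, and `fp-b` ranks first, which is the fingerprint the user would then open in the details page.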
Jira issue: CRDB-11358
Epic CRDB-32139