cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.84k stars 3.77k forks source link

Improve observability for performance issues due to the accumulation of MVCC values #72048

Open jordanlewis opened 2 years ago

jordanlewis commented 2 years ago

Performance issues can occur when there are high quantities of MVCC garbage data. MVCC garbage data can lead to downstream issues like resource saturation that we’ve seen in several customer workloads (e.g., the outbox pattern).

We should surface MVCC garbage information specifically in SQL Observability touchpoints such as our internal telemetry, aggregated statement and transaction statistics tables, and the console (SQL Activity pages and Insights).

This issue tracks surfacing MVCC garbage information in our aggregated statement and transaction statistics tables and console pages. We should strive for a simple and explainable metric and UX that points users to MVCC garbage accumulation. Ideally the UX should be consistent with MVCC values surfaced in the Databases page where we describe "Live" and "Non-Live" data. Specifically, we should introduce liveBytesRead and nonLiveBytesRead per execution.

Related issues:

cc @maryliag @kevin-v-ngo @dongniwang

Jira issue: CRDB-10925

Jira issue: CRDB-13485

Epic: CRDB-20499

jordanlewis commented 2 years ago

We'll also want to resolve #71351 somehow if we are to rely on this, or choose to sample the information and not collect it all the time.

ericharmeling commented 1 year ago

Since https://github.com/cockroachdb/cockroach/pull/64503 was merged, there have been a number of other pebble iterator stats added to tracing (see https://github.com/cockroachdb/cockroach/pull/77512 and https://github.com/cockroachdb/cockroach/pull/94345). Some of these statistics might be a little easier to interpret than step/seek information.

So I have a few questions regarding this issue.

  1. Do we really want to expose step/seek information in our statistics tables and console (if yes, then how exactly)? The code comments suggest some sort of comparison with unique MVCC keys, but what kind of comparison and where to pull these unique MVCC keys isn't 100% clear to me.
  2. What do the other iterator stats mean for slow queries (and are they easier to interpret)? @kevin-v-ngo brought up a good point: we should try to avoid introducing new concepts and new units of measurement where possible. We talk about live vs non-live data on the databases page in terms of percentages and bytes. Would it make sense to use the same units/terminology with the iterator statistics in tracing?
  3. Should we surface all of the iterator stats in EXPLAIN ANALYZE/virtual tables, or just the meaningful ones? We could just surface all sampled iterator stats available via tracing (keeping https://github.com/cockroachdb/cockroach/issues/71351 in mind) in their "raw" form, and then add additional fields that combine the stats with each other or with other stats so they are a little easier to interpret.
  4. How can we combine these iterator stats with other statistics to make them more meaningful and easier to interpret in our virtual tables and in the console?

An additional thought: The current title of this issue suggests that we treat this issue as part of https://github.com/cockroachdb/cockroach/issues/77580, which calls for all pebble iterator stats to be added to EXPLAIN ANALYZE and execution_statistics (I assume in a format similar to that in which they are exposed from pebble). So perhaps we need to rename this issue "sql: improve MVCC iterator stats observability" or something like that?

kevin-v-ngo commented 1 year ago

Based on my discussion with @ericharmeling , i've updated the issue to also reflect the investigation on surfacing (new) iterator stats that are easily understandable by end users.

ericharmeling commented 1 year ago

We could just surface all sampled iterator stats available via tracing (keeping https://github.com/cockroachdb/cockroach/issues/71351 in mind) in their "raw" form, and then add additional fields that combine the stats with each other or with other stats so they are a little easier to interpret.

This sounds like the most straightforward approach.

  1. Add all iterator stats sampled from tracing to exec stats (i.e., resolve https://github.com/cockroachdb/cockroach/issues/77580).
  2. Add status endpoint functions to process relevant iterator stats (i.e., PointCount/PointsCoveredByRangeTombstones) for exposure to the statement_statistics and transaction_statistics tables.
  3. Add new column(s) to SQL Activity page for relevant storage stats.

~The concern with this approach is that it ignores https://github.com/cockroachdb/cockroach/issues/71351. So perhaps in resolving https://github.com/cockroachdb/cockroach/issues/77580, we look into improving the exec stats recording by following one of the suggestions in that issue. This one is the easiest for me to understand at the moment:~

~do we want to sample all concurrent queries as "first"? can we improve so that we collect the sample of only the truly first? we'll need to make sure if we do collect the sample on the very first query, if that run is unsuccessful, we'll try sampling it later.~


~Update: I ran some benchmark tests, per https://github.com/cockroachdb/cockroach/issues/71351. See https://github.com/cockroachdb/cockroach/pull/96016#issuecomment-1405606891 for more details.~

glennfawcett commented 3 months ago

It would seem that we could have the ratio of point count to seek count displayed as a metric. Where statements with high values could be displayed in a dashboard. Below are two different versions of the same statement after GC collect.

select 
  cast(statistics->'execution_statistics'->'mvccIteratorStats'->'pointCount'->'mean' as INTEGER) as pointCount,
  cast(statistics->'execution_statistics'->'mvccIteratorStats'->'seekCount'->'mean' as INTEGER) as seekCount
from crdb_internal.statement_statistics 
where fingerprint_id='\x9b1ad34c7012c1fb';

  pointcount | seekcount
-------------+------------
           5 |         2
     4980000 |         2