cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

colbuilder: introduce testing framework for meta objects tracking #64256

Open yuzefovich opened 3 years ago

yuzefovich commented 3 years ago

With the merge of #62221 we attempt to precisely attribute "meta" objects to their corresponding tree and try to hand off the responsibility of handling them once a component that is able to take over is created. However, we currently don't have any testing in place that would verify that this tracking is correct. We should add some.

One way to test this would be to introduce EXPLAIN (VEC, DEBUG) (or something similar) that creates the operator tree and prints each "meta" object next to the execinfra.OpNode it is associated with.
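To make the idea concrete, here is a minimal sketch in Go of what such debug output could look like. Everything in it is hypothetical: `opNode` is a stand-in for execinfra.OpNode, and the `metaObjects` field, `debugTree`, and `owners` are names invented for illustration, not existing CockroachDB APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// opNode is a hypothetical stand-in for execinfra.OpNode. The metaObjects
// field is an assumption of this sketch and does not exist upstream.
type opNode struct {
	name        string
	metaObjects []string // meta objects this node is responsible for
	children    []*opNode
}

// debugTree renders one node per line, indented by depth, and lists the
// meta objects owned by each node in brackets.
func debugTree(root *opNode) string {
	var sb strings.Builder
	var walk func(n *opNode, depth int)
	walk = func(n *opNode, depth int) {
		sb.WriteString(strings.Repeat("  ", depth))
		sb.WriteString("└ " + n.name)
		if len(n.metaObjects) > 0 {
			sb.WriteString(" [owns: " + strings.Join(n.metaObjects, ", ") + "]")
		}
		sb.WriteString("\n")
		for _, c := range n.children {
			walk(c, depth+1)
		}
	}
	walk(root, 0)
	return sb.String()
}

// owners maps each meta object to the name of the node that owns it, which
// is the shape a test could assert against. Pass nil for out at the top level.
func owners(n *opNode, out map[string]string) map[string]string {
	if out == nil {
		out = map[string]string{}
	}
	for _, m := range n.metaObjects {
		out[m] = n.name
	}
	for _, c := range n.children {
		owners(c, out)
	}
	return out
}

// exampleTree builds a small made-up plan in which the materializer owns the
// scan as a metadata source.
func exampleTree() *opNode {
	scan := &opNode{name: "*colfetcher.ColBatchScan"}
	sel := &opNode{name: "*colexecutils.selBoolOp", children: []*opNode{scan}}
	return &opNode{
		name:        "*colexec.Materializer",
		metaObjects: []string{"ColBatchScan (metadata source)"},
		children:    []*opNode{sel},
	}
}

func main() {
	fmt.Print(debugTree(exampleTree()))
}
```

A test built on something like this could compare the `owners` map against the expected attribution for a given query instead of matching the rendered text.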


For example, while working on another issue I noticed that we currently have at least one bug in this tracking. Consider the following scenario:

CREATE TABLE t (_timestamptz TIMESTAMPTZ, _interval INTERVAL, _bool BOOL);

EXPLAIN (VEC, VERBOSE) SELECT _timestamptz - _interval FROM t WHERE _bool;
                      info
-------------------------------------------------
  │
  └ Node 1
    └ *colexec.Materializer (1)
      └ *colexec.Columnarizer
        └ *rowexec.noopProcessor
          └ *colexec.Materializer (2)
            └ *colexecutils.BoolVecToSelOp
              └ *colexecutils.selBoolOp
                └ *colexecbase.simpleProjectOp
                  └ *colexecutils.CancelChecker
                    └ *colfetcher.ColBatchScan

Note that this plan contains two materializers (the components to which we try to hand over responsibility for meta objects once they are created). In this example ColBatchScan is a metadata source (there may be others). Ideally, we would give ownership of it to the materializer (2), but we currently give it to the root materializer (1). What's happening is that we have a Filterer core that we're able to vectorize, followed by a PostProcessSpec with a render expression that we don't support. The latter is handled during planning via wrapPostProcessSpec, which creates an artificial (and incomplete) colexecargs.InputWithMetaInfo, so the materializer (2) doesn't get any meta objects to take over.
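The hand-off can be modeled with a small Go sketch. This is a simplified model under stated assumptions, not the colbuilder code: `inputWithMetaInfo` only loosely mirrors the idea behind colexecargs.InputWithMetaInfo, and `newMaterializer`, `plan`, and the `wrapKeepsMetaInfo` flag are hypothetical names.

```go
package main

import "fmt"

// inputWithMetaInfo is a hypothetical, much-reduced model of an input that
// carries meta objects nobody owns yet.
type inputWithMetaInfo struct {
	root            string
	metadataSources []string
}

// materializer records which meta objects it took ownership of.
type materializer struct {
	name  string
	owned []string
}

// newMaterializer takes over every meta object carried by its input and
// clears them on the input, so that nobody else claims them later.
func newMaterializer(name string, in *inputWithMetaInfo) *materializer {
	m := &materializer{name: name, owned: in.metadataSources}
	in.metadataSources = nil
	return m
}

// plan builds the two-materializer plan. With wrapKeepsMetaInfo=false the
// wrapping step creates an artificial input without meta info (the behavior
// described above); with true it forwards the meta info (the desired one).
func plan(wrapKeepsMetaInfo bool) (inner, root *materializer) {
	// Vectorized filter on top of the scan; the scan is an unowned
	// metadata source at this point.
	filter := &inputWithMetaInfo{
		root:            "selBoolOp",
		metadataSources: []string{"ColBatchScan"},
	}

	// Wrapping the unsupported render expression creates a new input.
	wrapped := &inputWithMetaInfo{root: filter.root}
	if wrapKeepsMetaInfo {
		wrapped.metadataSources = filter.metadataSources
		filter.metadataSources = nil
	}
	inner = newMaterializer("Materializer (2)", wrapped)

	// Whatever is still unowned bubbles up to the root materializer.
	rootInput := &inputWithMetaInfo{
		root:            "Columnarizer",
		metadataSources: filter.metadataSources,
	}
	root = newMaterializer("Materializer (1)", rootInput)
	return inner, root
}

func main() {
	for _, keep := range []bool{false, true} {
		inner, root := plan(keep)
		fmt.Printf("wrapKeepsMetaInfo=%t: %s owns %v, %s owns %v\n",
			keep, inner.name, inner.owned, root.name, root.owned)
	}
}
```

In this model the only difference between the two runs is whether the artificial input carries the meta info forward, which is the gap a testing framework would need to catch.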


Here is an example of why this imprecise tracking is problematic. Consider the same example as above but with execution statistics being collected. At the time of writing, the root materializer (1) will "own" two stats collectors (one around the columnarizer and one around the colbatchscan). Now imagine that for some reason selBoolOp encounters a panic in its Init method. The panic will be caught by the materializer (2), which will transition into the draining state right away.

However, the panic is not propagated any further, so from the perspective of the root materializer (1), the initialization of its input Columnarizer succeeded. The materializer (1) will therefore happily first attempt to get rows from the columnarizer (there will be none) and then attempt to retrieve execution stats from both of the stats collectors it "owns". Retrieving the stats from the colbatchscan will fail because it wasn't properly initialized.

Once the tracking is fixed, the materializer (2) will own the stats collector around the colbatchscan and will not attempt to retrieve the stats because the materializer (2) knows that the initialization wasn't proper.
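The failure mode can be reproduced in a toy Go model. This is only a sketch of the scenario described above, with hypothetical types (`statsCollector`, `materializer`) and a hypothetical `run` helper; the real components behave in a more involved way.

```go
package main

import (
	"errors"
	"fmt"
)

// statsCollector is a hypothetical collector whose stats can only be
// retrieved once the operator it wraps has been initialized.
type statsCollector struct {
	name        string
	initialized bool
}

func (s *statsCollector) stats() (string, error) {
	if !s.initialized {
		return "", errors.New(s.name + ": stats requested before Init completed")
	}
	return s.name + ": ok", nil
}

// materializer is a hypothetical model: it initializes its input subtree,
// swallows any panic from it, and later drains the collectors it owns.
type materializer struct {
	name       string
	initInput  func() // may panic
	collectors []*statsCollector
	initFailed bool
}

// start catches a panic from the input and remembers that init failed. The
// panic is not propagated to whoever called start.
func (m *materializer) start() {
	defer func() {
		if r := recover(); r != nil {
			m.initFailed = true
		}
	}()
	m.initInput()
}

// drain retrieves stats from owned collectors, unless this materializer
// knows that its own initialization failed.
func (m *materializer) drain() []error {
	if m.initFailed {
		return nil
	}
	var errs []error
	for _, c := range m.collectors {
		if _, err := c.stats(); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// run replays the scenario. innerOwnsScanStats selects which materializer
// owns the collector around the scan.
func run(innerOwnsScanStats bool) (innerErrs, rootErrs []error) {
	scanStats := &statsCollector{name: "ColBatchScan stats"}
	colStats := &statsCollector{name: "Columnarizer stats"}

	inner := &materializer{name: "Materializer (2)", initInput: func() {
		// selBoolOp panics in Init, so the scan is never initialized.
		panic("selBoolOp.Init failed")
	}}
	root := &materializer{name: "Materializer (1)", initInput: func() {
		inner.start() // the panic is swallowed here
		colStats.initialized = true
	}, collectors: []*statsCollector{colStats}}

	if innerOwnsScanStats {
		inner.collectors = append(inner.collectors, scanStats)
	} else {
		root.collectors = append(root.collectors, scanStats)
	}

	root.start()
	return inner.drain(), root.drain()
}

func main() {
	for _, innerOwns := range []bool{false, true} {
		innerErrs, rootErrs := run(innerOwns)
		fmt.Printf("innerOwnsScanStats=%t: inner errors=%v, root errors=%v\n",
			innerOwns, innerErrs, rootErrs)
	}
}
```

With the current attribution (`innerOwnsScanStats=false`) the root materializer hits the uninitialized collector; with the intended attribution neither materializer reports an error, because the one that owns the collector knows its initialization failed.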

Jira issue: CRDB-6969

github-actions[bot] commented 10 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!