cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

sql: expensive full table scan and sort can be avoided if we explore non-covering indexes #77243

Open cucaroach opened 2 years ago

cucaroach commented 2 years ago

If a table has a non-covering index that satisfies an ORDER BY constraint, we only consider it if an index hint is present (see GenerateIndexScans).

Can we be smarter here and explore non-covering indexes if they satisfy an ordering? Does it matter if there's a LIMIT hint?

Inspired by support issue https://github.com/cockroachlabs/support/issues/1453

Jira issue: CRDB-13497

cucaroach commented 2 years ago

I feel like the limit hint shouldn't be considered: if the limit is huge or absent, a streaming plan that pulls from the index will still outperform a full table scan and sort.

If the fact that we don't explore non-covering indexes is a complexity-avoidance measure, maybe we need more intelligent complexity avoidance. Could we have a combinatorial measure of optimization complexity and only disable things like this when that complexity is over a certain threshold? Like maybe number of indexes, number of columns, and number of joins in the statement? Obviously a half-baked idea.
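To make the half-baked idea above concrete, here's a minimal sketch of what such a complexity budget could look like. The function names, the formula, and the threshold are all hypothetical, not CockroachDB code:

```go
// Sketch of a "complexity budget" gate for optional exploration rules.
// All names and the threshold are illustrative only.
package main

import "fmt"

// stmtComplexity is a crude proxy for the size of the exploration space:
// indexes * columns * (joins + 1).
func stmtComplexity(numIndexes, numColumns, numJoins int) int {
	return numIndexes * numColumns * (numJoins + 1)
}

// exploreNonCoveringIndexes reports whether a hypothetical rule that
// generates non-covering index scans should fire, given a budget.
func exploreNonCoveringIndexes(numIndexes, numColumns, numJoins, budget int) bool {
	return stmtComplexity(numIndexes, numColumns, numJoins) <= budget
}

func main() {
	// Small statement: 3 indexes, 5 columns, 1 join -> complexity 30.
	fmt.Println(exploreNonCoveringIndexes(3, 5, 1, 1000)) // true
	// Large statement: 20 indexes, 40 columns, 6 joins -> complexity 5600.
	fmt.Println(exploreNonCoveringIndexes(20, 40, 6, 1000)) // false
}
```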

Besides generating non-covering index scans, are there other paths we don't go down that in some cases result in sub-optimal plans?

cucaroach commented 2 years ago

This policy originates here: https://github.com/cockroachdb/cockroach/commit/f3e60fe3a1e52433858d6d0739424fde12d6217b

Although the notes say:

The change is to generate IndexJoins "one level up", only when a Scan is
wrapped in a Select or Limit. At this point, they become viable plans that
have a reasonable chance of being selected.

I don't understand how this is implemented, but I'm guessing it doesn't apply here because the limit is above the join?

cucaroach commented 2 years ago

Here's a simplified and cleaned-up repro:

exec-ddl
CREATE TABLE t1 (
    id UUID NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    number INT8 NOT NULL,
    hash STRING NOT NULL,
    UNIQUE INDEX number_idx (number ASC),
    INDEX hash_idx (hash ASC)
)
----

exec-ddl
CREATE TABLE t2 (
    number INT8 NOT NULL PRIMARY KEY,
    hash STRING NULL,
    INDEX hash_index (hash ASC)
)
----

# Disable GenerateTopK so we can see what 21.2 did
opt disable=GenerateTopK
SELECT
  a.hash, a.number
FROM
  t1 AS a
  INNER JOIN t2 AS b ON a.hash = b.hash
ORDER BY
  a.number DESC
LIMIT
  1
----
project
 ├── columns: hash:3!null number:2!null
 └── limit
      ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
      ├── internal-ordering: -2
      ├── inner-join (lookup t2@hash_index [as=b])
      │    ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
      │    ├── key columns: [3] = [7]
      │    ├── ordering: -2
      │    ├── limit hint: 1.00
      │    ├── sort
      │    │    ├── columns: a.number:2!null a.hash:3!null
      │    │    ├── ordering: -2
      │    │    ├── limit hint: 100.00
      │    │    └── scan t1 [as=a]
      │    │         └── columns: a.number:2!null a.hash:3!null
      │    └── filters (true)
      └── 1

# Disable GenerateTopK so we can see what 21.2 did
opt disable=GenerateTopK
SELECT
  a.hash, a.number
FROM
  t1@number_idx AS a
  INNER JOIN t2 AS b ON a.hash = b.hash
ORDER BY
  a.number DESC
LIMIT
  1
----
project
 ├── columns: hash:3!null number:2!null
 └── limit
      ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
      ├── internal-ordering: -2
      ├── inner-join (lookup t2@hash_index [as=b])
      │    ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
      │    ├── key columns: [3] = [7]
      │    ├── ordering: -2
      │    ├── limit hint: 1.00
      │    ├── index-join t1
      │    │    ├── columns: a.number:2!null a.hash:3!null
      │    │    ├── ordering: -2
      │    │    ├── limit hint: 100.00
      │    │    └── scan t1@number_idx,rev [as=a]
      │    │         ├── columns: id:1!null a.number:2!null
      │    │         ├── flags: force-index=number_idx
      │    │         ├── ordering: -2
      │    │         └── limit hint: 100.00
      │    └── filters (true)
      └── 1

cucaroach commented 2 years ago

Note that the underlying customer issue appears to be resolved by adding the hash column to number_idx as a stored column, but it remains a mystery why index joins are so expensive.

cucaroach commented 2 years ago

Chatted w/ @yuzefovich and hit upon a probably better approach to solving this: right now GenerateLimitedScans only kicks in when there's a hard limit, but perhaps it could also kick in when there's a limit hint.
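A minimal sketch of the proposed relaxation, with the current behavior for comparison. The function and field names are hypothetical, not the actual GenerateLimitedScans implementation:

```go
// Sketch: fire limited-scan generation not only for a hard LIMIT but
// also when the optimizer has propagated a limit hint. Names are
// illustrative only.
package main

import "fmt"

type scanCandidate struct {
	hardLimit int64   // explicit LIMIT, 0 if none
	limitHint float64 // estimated rows needed by the parent, 0 if none
}

// shouldGenerateLimitedScan mirrors the current rule (hard limit only)
// when relaxed is false, and the proposed behavior when relaxed is true.
func shouldGenerateLimitedScan(c scanCandidate, relaxed bool) bool {
	if c.hardLimit > 0 {
		return true
	}
	return relaxed && c.limitHint > 0
}

func main() {
	hintOnly := scanCandidate{limitHint: 100}
	fmt.Println(shouldGenerateLimitedScan(hintOnly, false)) // today: false
	fmt.Println(shouldGenerateLimitedScan(hintOnly, true))  // proposed: true
}
```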

cucaroach commented 2 years ago

Another idea: generate ordering scans, i.e. explore non-covering secondary indexes that provide the ordering required by the query.

cucaroach commented 2 years ago

Would it make sense to look for something like this:

      │    ├── sort
      │    │    ├── columns: a.number:2!null a.hash:3!null
      │    │    ├── ordering: -2
      │    │    ├── limit hint: 100.00
      │    │    └── scan t1 [as=a]
      │    │         └── columns: a.number:2!null a.hash:3!null

And match (Sort (Scan) $ordering) with an exploration rule that specifically looks for non-covering secondary indexes matching the sort ordering? @mgartner @rytaft I guess this would be a cousin of GenerateLimitedScans. I tried it and opt didn't like my Sort expression...
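For illustration, the proposed rule might look something like this in Optgen-flavored pseudocode. The rule name, matcher shape, and custom functions here are all hypothetical and this is not valid Optgen; note also that Sort is an enforcer rather than a memo expression, which is likely why matching it directly failed, so the sketch matches the Scan instead:

```
# Hypothetical exploration rule (pseudocode, not valid Optgen): match a
# Scan for which some non-covering secondary index can provide a useful
# ordering, and rewrite it as an ordered index scan plus an index join.
[GenerateOrderedNonCoveringScans, Explore]
(Scan $scanPrivate:* & (HasOrderingSecondaryIndex $scanPrivate))
=>
(IndexJoin
    (Scan (OrderedSecondaryIndexPrivate $scanPrivate))
    (IndexJoinPrivateFor $scanPrivate)
)
```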

msirek commented 2 years ago

Note that the underlying customer issue appears to be resolved by adding the hash column to number_idx as a stored column, but it remains a mystery why index joins are so expensive.

Maybe vectorized execution is more expensive for lookup joins in this case because it does a batch of 100 rows at a time. Here is where it's setting the limit hint size as a multiple of the joinReaderBatchSize: https://github.com/cockroachdb/cockroach/blob/b01d6c7e57c3e2dafb4194d3788253ad3b988610/pkg/sql/opt/xform/coster.go#L1631-L1642

Maybe experimenting with different joinReaderBatchSizes and also trying non-vectorized execution would find the fastest response for the single-row case. Then maybe we could dynamically choose between vectorized and non-vectorized: say, if the limit hint is 10 rows or less, choose non-vectorized. Also, I wonder if the limit hint is required to be a multiple of joinReaderBatchSize.
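A minimal sketch of the dynamic-dispatch idea above: pick row-at-a-time execution when the limit hint is tiny, batched (vectorized) execution otherwise. The cutoff of 10 comes from the comment; everything else is hypothetical:

```go
// Sketch: choose an execution mode based on the limit hint. The cutoff
// and names are illustrative, not CockroachDB's dispatch logic.
package main

import "fmt"

type execMode string

const (
	rowByRow   execMode = "row-by-row"
	vectorized execMode = "vectorized"
)

// chooseExecMode returns row-by-row execution for small limit hints,
// where paying the fixed cost of a large batch is wasted work.
func chooseExecMode(limitHint int64) execMode {
	const smallLimit = 10 // hypothetical cutoff from the comment above
	if limitHint > 0 && limitHint <= smallLimit {
		return rowByRow
	}
	return vectorized
}

func main() {
	fmt.Println(chooseExecMode(1))   // row-by-row
	fmt.Println(chooseExecMode(100)) // vectorized
	fmt.Println(chooseExecMode(0))   // no hint: vectorized
}
```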

yuzefovich commented 2 years ago

Maybe vectorized execution is more expensive for lookup joins in this case because it does a batch of 100 rows at a time.

We currently don't have native vectorized execution support for lookup joins and fall back to the row-by-row joinReader for those operations. The joinReader still does batching itself, but we no longer use 100 rows as the batch size - we use memory footprints for that (either 10KiB or 2MiB).
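To illustrate footprint-based batching as described above: rows are accumulated until a byte budget is reached, so the batch size in rows varies with row width. This is a hypothetical sketch, not the actual joinReader code:

```go
// Sketch: split rows into batches by total memory footprint rather than
// by row count. Names and logic are illustrative only.
package main

import "fmt"

// batchByFootprint groups rowSizes (bytes per row) into batches whose
// total footprint does not exceed budget; a single oversized row still
// forms its own batch.
func batchByFootprint(rowSizes []int, budget int) [][]int {
	var batches [][]int
	var cur []int
	used := 0
	for _, sz := range rowSizes {
		if len(cur) > 0 && used+sz > budget {
			batches = append(batches, cur)
			cur, used = nil, 0
		}
		cur = append(cur, sz)
		used += sz
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// 10KiB budget, five 4KiB rows -> batches of 2, 2, and 1 rows.
	batches := batchByFootprint([]int{4096, 4096, 4096, 4096, 4096}, 10240)
	fmt.Println(len(batches)) // 3
}
```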

msirek commented 2 years ago

@cucaroach #87894 may fix this. Maybe close as a dup.

mgartner commented 9 months ago

This commit seems related: https://github.com/cockroachdb/cockroach/commit/60c07a90a359460078765f3f4b66defcdae5b471

yuzefovich commented 9 months ago

Is that linked commit right? It seems to be sqlsmith specific.

mgartner commented 9 months ago

It's the correct link, though it's not obvious. The specific part of the commit this is related to is mentioned in the commit message:

This commit adds session setting unconstrained_non_covering_index_scan_enabled which enables exploration of index access paths which are unconstrained and non-covering.