Open cucaroach opened 2 years ago
I feel like the limit hint shouldn't be considered; if the limit is huge or absent, a streaming plan that pulls from the index will outperform a full table scan and sort.
If the fact that we don't explore non-covering indexes is a complexity-avoidance measure, maybe we need more intelligent complexity avoidance. Could we have a measure of combinatorial optimization complexity and only disable things like this when that complexity is over a certain threshold? Something like number of indexes * number of columns * number of joins in the statement? Obviously a half-baked idea.
Besides generating non-covering index scans, are there other examples of paths we don't go down that in some cases result in sub-optimal plans?
This policy originates here: https://github.com/cockroachdb/cockroach/commit/f3e60fe3a1e52433858d6d0739424fde12d6217b
Although the notes say:
The change is to generate IndexJoins "one level up", only when a Scan is
wrapped in a Select or Limit. At this point, they become viable plans that
have a reasonable chance of being selected.
I don't understand how this is implemented, but I'm guessing it's because the limit is above the join?
Here's a simplified and cleaned-up repro:
exec-ddl
CREATE TABLE t1 (
id UUID NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
number INT8 NOT NULL,
hash STRING NOT NULL,
UNIQUE INDEX number_idx (number ASC),
INDEX hash_idx (hash ASC)
)
----
exec-ddl
CREATE TABLE t2 (
number INT8 NOT NULL PRIMARY KEY,
hash STRING NULL,
INDEX hash_index (hash ASC)
)
----
# Disable GenerateTopK so we can see what 21.2 did
opt disable=GenerateTopK
SELECT
a.hash, a.number
FROM
t1 AS a
INNER JOIN t2 AS b ON a.hash = b.hash
ORDER BY
a.number DESC
LIMIT
1
----
project
├── columns: hash:3!null number:2!null
└── limit
├── columns: a.number:2!null a.hash:3!null b.hash:7!null
├── internal-ordering: -2
├── inner-join (lookup t2@hash_index [as=b])
│ ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
│ ├── key columns: [3] = [7]
│ ├── ordering: -2
│ ├── limit hint: 1.00
│ ├── sort
│ │ ├── columns: a.number:2!null a.hash:3!null
│ │ ├── ordering: -2
│ │ ├── limit hint: 100.00
│ │ └── scan t1 [as=a]
│ │ └── columns: a.number:2!null a.hash:3!null
│ └── filters (true)
└── 1
# Disable GenerateTopK so we can see what 21.2 did
opt disable=GenerateTopK
SELECT
a.hash, a.number
FROM
t1@number_idx AS a
INNER JOIN t2 AS b ON a.hash = b.hash
ORDER BY
a.number DESC
LIMIT
1
----
project
├── columns: hash:3!null number:2!null
└── limit
├── columns: a.number:2!null a.hash:3!null b.hash:7!null
├── internal-ordering: -2
├── inner-join (lookup t2@hash_index [as=b])
│ ├── columns: a.number:2!null a.hash:3!null b.hash:7!null
│ ├── key columns: [3] = [7]
│ ├── ordering: -2
│ ├── limit hint: 1.00
│ ├── index-join t1
│ │ ├── columns: a.number:2!null a.hash:3!null
│ │ ├── ordering: -2
│ │ ├── limit hint: 100.00
│ │ └── scan t1@number_idx,rev [as=a]
│ │ ├── columns: id:1!null a.number:2!null
│ │ ├── flags: force-index=number_idx
│ │ ├── ordering: -2
│ │ └── limit hint: 100.00
│ └── filters (true)
└── 1
Note that the underlying customer issue appears to be resolved by adding the hash column to number_idx as a stored column, but a mystery remains as to why index joins are so expensive.
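For reference, a minimal sketch of that workaround against the repro schema above (the new index name here is mine): once the secondary index also stores hash, the ordered index scan is covering and no index join back to the primary index is needed.

```sql
-- Hypothetical covering variant of number_idx for the repro schema above:
-- storing hash means the ordered scan no longer needs an index join.
CREATE UNIQUE INDEX number_idx_covering ON t1 (number ASC) STORING (hash);
```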
Chatted w/ @yuzefovich and hit upon a probably better approach to solving this: right now GenerateLimitedScans only kicks in when there's a hard limit, but perhaps it could also kick in when there's a limit hint.
Another idea: generate ordering scans, i.e. explore non-covering secondary indexes that provide the ordering required by the query.
Would it make sense to look for something like this:
│ ├── sort
│ │ ├── columns: a.number:2!null a.hash:3!null
│ │ ├── ordering: -2
│ │ ├── limit hint: 100.00
│ │ └── scan t1 [as=a]
│ │ └── columns: a.number:2!null a.hash:3!null
And match (Sort (Scan) $ordering) with an exploration rule that specifically looked for non-covering secondary indexes that matched the sort ordering? @mgartner @rytaft I guess this would be a cousin of GenerateLimitedScans. I tried it and opt didn't like my Sort expression...
Note that the underlying customer issue appears to be resolved by adding the hash column to number_idx as a stored column, but a mystery remains as to why index joins are so expensive.
Maybe vectorized execution is more expensive for lookup joins in this case because it does a batch of 100 rows at a time. Here is where it's setting the limit hint size as a multiple of the joinReaderBatchSize: https://github.com/cockroachdb/cockroach/blob/b01d6c7e57c3e2dafb4194d3788253ad3b988610/pkg/sql/opt/xform/coster.go#L1631-L1642
Maybe experimenting with different joinReaderBatchSizes and also trying non-vectorized execution would find the fastest response for the single-row case. Then maybe we could dynamically choose between vectorized and non-vectorized. Say, if we want a small limit hint of 10 rows or less, choose non-vectorized. Also, I wonder if the limit hint is required to be a multiple of joinReaderBatchSize.
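As a rough way to run that experiment, one could toggle the vectorized engine per session and compare timings for the single-row query; a sketch, assuming the repro schema above (the vectorize session setting accepts slightly different values across CockroachDB versions):

```sql
-- Sketch: compare row-by-row vs. vectorized execution for the repro query.
SET vectorize = off;
EXPLAIN ANALYZE
  SELECT a.hash, a.number
  FROM t1 AS a INNER JOIN t2 AS b ON a.hash = b.hash
  ORDER BY a.number DESC
  LIMIT 1;
SET vectorize = on;
```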
Maybe vectorized execution is more expensive for lookup joins in this case because it does a batch of 100 rows at a time.
We currently don't have native vectorized execution support for lookup joins and fall back to the row-by-row joinReader for those operations. The joinReader still does batching itself, but we no longer use 100 rows as the batch size - we use memory footprints for that (either 10KiB or 2MiB).
@cucaroach #87894 may fix this. Maybe close as a dup.
This commit seems related: https://github.com/cockroachdb/cockroach/commit/60c07a90a359460078765f3f4b66defcdae5b471
Is that linked commit right? It seems to be sqlsmith-specific.
It's the correct link, though it's not obvious. The specific part of the commit that this is related to is mentioned in the commit message:
This commit adds session setting unconstrained_non_covering_index_scan_enabled which enables exploration of index access paths which are unconstrained and non-covering.
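For anyone trying this out, enabling the setting from that commit per session looks like the following (assuming a CockroachDB version that includes the linked commit):

```sql
-- Enable exploration of unconstrained, non-covering index scans for this session.
SET unconstrained_non_covering_index_scan_enabled = true;
```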
If a table has a non-covering index that satisfies an ORDER BY constraint, we only consider it if an index hint is present (see GenerateIndexScans).
Can we be smarter here and explore non-covering indexes if they satisfy an ordering? Does it matter if there's a LIMIT hint?
Inspired by support issue https://github.com/cockroachlabs/support/issues/1453
Jira issue: CRDB-13497