Open jordanlewis opened 6 years ago
This also occurs when using IN
instead of =
in the filter condition, BTW.
We don't (can't?) currently use the results of subqueries for span generation. This is a known limitation of the way we handle subqueries. CC @knz
we could (both with current code and new code) for uncorrelated subqueries by evaluating the subqueries prior to span generation.
A forum user ran into this, here: https://forum.cockroachlabs.com/t/query-too-slow-does-full-scan-when-using-subquery/3740
Just chiming in here. We are running into this issue with an INSERT ... ON CONFLICT ... DO UPDATE SET query. All relevant parts have indexes, but it does a full table scan for the table with primary key instead of just using the primary key index.
It would be trivial to hoist uncorrelated subqueries (see deriveHasHoistableSubquery
where we only hoist subqueries that are correlated). The only reason I did not do that originally is because I was unsure if that was always the best plan. We'd need to do more testing and experimentation to decide if that's true. If it's not, then it gets quite a bit more complex, as we have to either find heuristics to decide whether to hoist, or use exploration rule(s). If we find we have to go that direction, then maybe we can find certain special cases where it's always beneficial to hoist (like when it's possible to constrain an index scan).
https://github.com/cockroachdb/cockroach/pull/69047#pullrequestreview-738495478 is an example of where we'd have benefited from this optimization, though we ended up finding another formulation to give us what we want.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
Not yet addressed.
This seems to be fixed now (as of 24.1):
demo@127.0.0.1:26257/demoapp/defaultdb> create table foo (a int primary key);
CREATE TABLE
Time: 5ms total (execution 4ms / network 1ms)
demo@127.0.0.1:26257/demoapp/defaultdb> explain select * from foo where a=(select 1);
info
-------------------------
distribution: local
vectorized: true
• scan
missing stats
table: foo@foo_pkey
spans: [/1 - /1]
(7 rows)
Time: 8ms total (execution 8ms / network 0ms)
demo@127.0.0.1:26257/demoapp/defaultdb> explain select * from foo where a=(select max(a) from foo);
info
---------------------------------------------------------------------------------------
distribution: local
vectorized: true
• lookup join
│ estimated row count: 1
│ table: foo@foo_pkey
│ equality: (any_not_null) = (a)
│ equality cols are key
│
└── • group (scalar)
│ estimated row count: 1
│
└── • revscan
estimated row count: 1 (100% of the table; stats collected 2 minutes ago)
table: foo@foo_pkey
spans: LIMITED SCAN
limit: 1
(17 rows)
Time: 4ms total (execution 3ms / network 1ms)
I don't think it's addressed generally speaking, only in specific cases.
[triage]: There's an optimizer rule that replaces min/max with a limit, and another that works for equality with a subquery: https://github.com/cockroachdb/cockroach/pull/100881. We might not be able to handle inequalities with subqueries, or other special cases that don't have existing rules.
Here's an example that's still unconstrained on v24.1:
demo@127.0.0.1:26257/demoapp/defaultdb> CREATE TABLE a (a INT PRIMARY KEY);
CREATE TABLE
Time: 3ms total (execution 3ms / network 0ms)
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN SELECT * FROM a WHERE a > (SELECT * FROM a ORDER BY a LIMIT 1);
info
------------------------------------------------------------
distribution: local
vectorized: true
• root
│
├── • filter
│ │ filter: a > @S1
│ │
│ └── • scan
│ missing stats
│ table: a@a_pkey
│ spans: FULL SCAN
│
└── • subquery
│ id: @S1
│ original sql: (SELECT * FROM a ORDER BY a LIMIT 1)
│ exec mode: one row
│
└── • scan
missing stats
table: a@a_pkey
spans: LIMITED SCAN
limit: 1
(23 rows)
Time: 8ms total (execution 7ms / network 0ms)
[triage] possible dupes/related issues: https://github.com/cockroachdb/cockroach/issues/51820 https://github.com/cockroachdb/cockroach/issues/93829 https://github.com/cockroachdb/cockroach/issues/100855
Filter conditions obtained from subqueries don't constrain the spans for a scan.
In the second
explain
, I expect that thescan
should havespans = /1-/1/#
like in the firstexplain
.Looks like this is the same behavior as in 1.1. Maybe this is a known limitation?
@RaduBerinde for triage.
Jira issue: CRDB-5827