Open arthurschreiber opened 1 week ago
@arthurschreiber this was merged into a single route in pre-18 Vitess releases?
No, but it was handled "correctly" by the v3
planner in earlier versions (probably more by luck than on purpose).
We found an additional similar example of this behavior, see below:
select music.id
from music where music.id in (
select * from (
select music.id from music where music.user_id in (1, 2, 3) order by music.updated_at desc, music.id desc limit 0, 100
) as subquery_for_limit
) order by music.updated_at desc, music.id desc limit 100
which generates the below plan:
{
"QueryType": "SELECT",
"Original": "select music.id from music where music.id in (select * from (select music.id from music where music.user_id in (1, 2, 3) order by music.updated_at desc, music.id desc limit 0, 100) as subquery_for_limit) order by music.updated_at desc, music.id desc limit 100",
"Instructions": {
"OperatorType": "Limit",
"Count": "INT64(100)",
"Inputs": [
{
"OperatorType": "UncorrelatedSubquery",
"Variant": "PulloutIn",
"PulloutVars": [
"__sq_has_values",
"__sq1"
],
"Inputs": [
{
"InputName": "SubQuery",
"OperatorType": "Limit",
"Count": "INT64(100)",
"Offset": "INT64(0)",
"Inputs": [
{
"OperatorType": "Route",
"Variant": "IN",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select id, subquery_for_limit.updated_at, weight_string(subquery_for_limit.updated_at), weight_string(id) from (select music.id, music.updated_at from music where 1 != 1) as subquery_for_limit where 1 != 1",
"OrderBy": "(1|2) DESC, (0|3) DESC",
"Query": "select id, subquery_for_limit.updated_at, weight_string(subquery_for_limit.updated_at), weight_string(id) from (select music.id, music.updated_at from music where music.user_id in ::__vals) as subquery_for_limit order by subquery_for_limit.updated_at desc, subquery_for_limit.id desc limit :__upper_limit",
"Table": "music",
"Values": [
"(INT64(1), INT64(2), INT64(3))"
],
"Vindex": "user_index"
}
]
},
{
"InputName": "Outer",
"OperatorType": "Route",
"Variant": "IN",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select music.id, music.updated_at, weight_string(music.updated_at), weight_string(music.id) from music where 1 != 1",
"OrderBy": "(1|2) DESC, (0|3) DESC",
"Query": "select music.id, music.updated_at, weight_string(music.updated_at), weight_string(music.id) from music where :__sq_has_values and music.id in ::__vals order by music.updated_at desc, music.id desc",
"ResultColumns": 1,
"Table": "music",
"Values": [
"::__sq1"
],
"Vindex": "music_user_map"
}
]
}
]
},
"TablesUsed": [
"user.music"
]
}
Even though we're joining on a lookup vindex column, this query is broken up into two pieces and performs a lookup vindex operation.
Overview of the Issue
On all currently supported Vitess versions (v18 to v21), queries that join a derived table (via a vindex) containing a
UNION
lead to an inefficient query plan performing aJOIN
on thevtgate
instead of pushing down theJOIN
operation to MySQL.Reproduction Steps
Here's an example query:
music_extra.music_id
andmusic.id
use the same lookup vindex owned bymusic
.music.user_id
is a hash vindex.Generates the following plan on v18 - v21:
The same issue occurs when joining on the
hash
vindex directly:Binary Version
Operating System and Environment details
Log Fragments