Open tiancaiamao opened 3 months ago
When we estimate the cost of a join X, it is the cost of A + the cost of B + the estimated rows of the join result X. This looks reasonable.
  X
 / \
A   B
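How do we estimate the cost of A, especially when A is Limit + TableScan? We use its estimated rows + the cost of its children, which is unreasonable in the Limit case: the child TableScan still contributes its full-table estimate even though the Limit outputs only a few rows.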
select * from rel ;; rel has 1e7 rows
select * from info ;; info has 1.2e7 rows
select * from rel limit 20 ;; the result set has just 20 rows
(select * from rel limit 20) JOIN rel JOIN info
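To make the problem concrete, here is a minimal sketch of the cost rule described above (estimated rows of a node plus the cost of its children), applied to the row counts from this example. The `Plan` class and `cum_cost` function are hypothetical names for illustration only, not TiDB's actual planner code, and the numbers are the rough row counts quoted in this issue.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Hypothetical plan node: a name, an estimated row count, and children."""
    name: str
    est_rows: float
    children: List["Plan"] = field(default_factory=list)


def cum_cost(plan: Plan) -> float:
    """Cost rule from the issue: estimated rows + cost of the children."""
    return plan.est_rows + sum(cum_cost(child) for child in plan.children)


# Row counts taken from the example statements above.
rel_scan = Plan("TableScan(rel)", 1e7)
info_scan = Plan("TableScan(info)", 1.2e7)
limit_rel = Plan("Limit(20)", 20, [Plan("TableScan(rel)", 1e7)])

for node in (limit_rel, rel_scan, info_scan):
    print(f"{node.name:16s} est_rows={node.est_rows:>12,.0f} cost={cum_cost(node):>12,.0f}")
```

Under this rule the Limit subquery outputs 20 rows but is costed at roughly 1e7, in the same range as a full scan of rel or info, so its small result set gives it no advantage when costs are compared.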
For join reorder using the greedy algorithm, we should clearly choose (select * from rel limit 20) as the starting point.
But due to the wrong cost calculation of (select * from rel limit 20), info is chosen!
It's a known limitation of the current logical cost model.
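The effect on the greedy starting point can be sketched as follows. This is only an illustration under stated assumptions: the `candidates` table and `greedy_start` helper are hypothetical, the costs follow the "estimated rows + cost of children" rule with the row counts quoted in this issue, and ranking by output rows is shown purely for contrast, not as the fix TiDB would adopt. The order TiDB actually produces depends on its real statistics, so the sketch does not reproduce the exact plan reported here.

```python
from typing import Callable, Dict, Tuple

# (estimated output rows, cumulative cost) for each join input.
# Costs use "estimated rows + cost of children" with the figures from this issue.
candidates: Dict[str, Tuple[float, float]] = {
    "rel limit 20": (20, 1e7 + 20),
    "rel": (1e7, 1e7),
    "info": (1.2e7, 1.2e7),
}


def greedy_start(key: Callable[[Tuple[float, float]], float]) -> str:
    """Pick the join input with the smallest ranking key as the starting point."""
    return min(candidates, key=lambda name: key(candidates[name]))


# Ranking by cumulative cost: the Limit subquery is not the cheapest input.
by_cost = greedy_start(lambda rows_cost: rows_cost[1])

# Ranking by estimated output rows (for contrast): the Limit subquery wins.
by_rows = greedy_start(lambda rows_cost: rows_cost[0])

print("start by cumulative cost:", by_cost)
print("start by output rows:    ", by_rows)
```

With the cumulative cost as the ranking key, the 20-row subquery is never preferred as the starting point, which matches the complaint above that the greedy reorder does not begin from it.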
Bug Report
1. Minimal reproduce step (Required)
2. What did you expect to see? (Required)
I expect the join order for that query to be: rel (limit 20) INNER JOIN rel LEFT JOIN info
3. What did you see instead (Required)
The actual order is: rel LEFT JOIN info INNER JOIN rel (limit 20)
4. What is your TiDB version? (Required)
master