pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.41k stars 5.85k forks source link

Predicate push down caused unexpected non-optimal plan #38784

Open winoros opened 2 years ago

winoros commented 2 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Explain the query-9 of chbenchmark.

select 
  n_name, 
  extract(
    year 
    from 
      o_entry_d
  ) as l_year, 
  sum(ol_amount) as sum_profit 
from 
  item, 
  stock, 
  supplier, 
  order_line, 
  orders, 
  nation 
where 
  ol_i_id = s_i_id 
  and ol_supply_w_id = s_w_id 
  and mod(
    (s_w_id * s_i_id), 
    10000
  ) = s_suppkey 
  and ol_w_id = o_w_id 
  and ol_d_id = o_d_id 
  and ol_o_id = o_id 
  and ol_i_id = i_id 
  and s_nationkey = n_nationkey 
  and i_data like '%BB' 
group by 
  n_name, 
  extract(
    year 
    from 
      o_entry_d
  ) 
order by 
  n_name, 
  l_year desc;

2. What did you expect to see? (Required)

As you can see, there's ol_i_id = s_i_id, ol_i_id = i_id, and mod((s_w_id * s_i_id), 10000) = s_suppkey. We can get s_i_id = i_id from the equivalence class propagation. Then TiDB would use that to generate mod((s_w_id * i_id), 10000) = s_suppkey.

It's an equal function though one side is a scalar function. TiDB only accepts col_x = col_y as the join key. So TiDB would add a projection to project mod((s_w_id * i_id), 10000) = s_suppkey as Col#x. Then let mod((s_w_id * i_id), 10000) = s_suppkey become Col#x = s_suppkey.

So far the optimization sounds good. However, this projection cut the join tree into two parts, and unfortunately, the cut part is a cartesian product of table item and table stock. Finally, the SQL got a bad performance than we expected.

3. What did you see instead (Required)

No cartesian product in the final plan.

4. What is your TiDB version? (Required)

current master.

fixdb commented 2 years ago

Hmm, maybe we should propagate equiv. class for expression.