ydb-platform / ydb

YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
https://ydb.tech
Apache License 2.0

Equality Filter does not change (decrease) Cardinality (TPC-H Q3) #10135

Open Hor911 opened 1 month ago

Hor911 commented 1 month ago

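For example, in TPC-H Q3 the equality filter on c_mktsegment keeps E-Rows at 1.500e+08, the same as the customer table scan below it, while the range filters on l_shipdate and o_orderdate each halve their input estimate: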

┌─────────────────────────────────────────────────────────────────┬───────┬────────┬───────────┬───────────┬───────────┐
│ Operation                                                       │ A-Cpu │ A-Rows │ E-Cost    │ E-Rows    │ E-Size    │
├─────────────────────────────────────────────────────────────────┼───────┼────────┼───────────┼───────────┼───────────┤
│  -> ResultSet                                                   │       │        │           │           │           │
│    -> TopSort (Limit: 10, TopSortBy: )                          │       │        │           │           │           │
│      -> Top (Limit: 10, TopBy: )                                │       │        │           │           │           │
│        -> Aggregate (GroupBy: , Aggregation: {_yql_agg_0: SUM(l │       │        │           │           │           │
│ _extendedprice * (1 - l_discount),_yql_agg_0)})                 │       │        │           │           │           │
│          -> InnerJoin (Grace) (l.l_orderkey = o_1.o_orderkey)   │       │        │ 1.170e+10 │ 1.500e+09 │ 2.411e+10 │
│            -> Filter (l_shipdate > "9197")                      │       │        │ 0         │ 3.000e+09 │ 4.822e+10 │
│              -> TableFullScan (Table: dt64/column/tpch/s1000/li │       │        │ 0         │ 6.000e+09 │ 9.645e+10 │
│ neitem, Scan: Parallel, ReadRanges: ["l_orderkey (-∞, +∞)","l_l │       │        │           │           │           │
│ inenumber (-∞, +∞)"], ReadColumns: ["l_discount","l_extendedpri │       │        │           │           │           │
│ ce","l_orderkey","l_shipdate"])                                 │       │        │           │           │           │
│            -> LeftSemiJoin (Grace) (o_1.o_custkey = c.c_custkey │       │        │ 2.700e+09 │ 7.500e+08 │ 2.028e+10 │
│ )                                                               │       │        │           │           │           │
│              -> Filter (o_orderdate < "9197")                   │       │        │ 0         │ 7.500e+08 │ 2.028e+10 │
│                -> TableFullScan (Table: dt64/column/tpch/s1000/ │       │        │ 0         │ 1.500e+09 │ 4.055e+10 │
│ orders, Scan: Parallel, ReadRanges: ["o_orderkey (-∞, +∞)"], Re │       │        │           │           │           │
│ adColumns: ["o_custkey","o_orderdate","o_orderkey","o_shipprior │       │        │           │           │           │
│ ity"])                                                          │       │        │           │           │           │
│              -> Filter (c_mktsegment == MACHINERY)              │       │        │ 0         │ 1.500e+08 │ 3.934e+09 │
│                -> TableFullScan (Table: dt64/column/tpch/s1000/ │       │        │ 0         │ 1.500e+08 │ 3.934e+09 │
│ customer, Scan: Parallel, ReadRanges: ["c_custkey (-∞, +∞)"], R │       │        │           │           │           │
│ eadColumns: ["c_custkey","c_mktsegment"])                       │       │        │           │           │           │
└─────────────────────────────────────────────────────────────────┴───────┴────────┴───────────┴───────────┴───────────┘
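
For reference, a minimal sketch of what a textbook estimate would give here, assuming a uniform value distribution and the five market segments defined by the TPC-H specification. The function name is hypothetical and this is not YDB's optimizer code, only the usual rows / NDV rule of thumb:

```python
def estimate_equality_rows(input_rows: float, distinct_values: int) -> float:
    """Uniform-distribution estimate for `col == const`: rows / NDV.

    Hypothetical helper for illustration; not taken from the YDB code base.
    """
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return input_rows / distinct_values


customer_rows = 1.5e8  # E-Rows of the customer TableFullScan in the plan above
mktsegment_ndv = 5     # TPC-H defines five c_mktsegment values (assumed known to the optimizer)

expected = estimate_equality_rows(customer_rows, mktsegment_ndv)
print(f"{expected:.3e}")
```

Under these assumptions the Filter (c_mktsegment == MACHINERY) row would be expected to show a noticeably smaller E-Rows than the scan beneath it, rather than the same 1.500e+08.
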
pashandor789 commented 1 month ago

It works in main.