trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.2k stars 2.94k forks source link

Incorrect LEFT JOIN behavior when combined with WHERE starting in v380 #13109

Closed alec-heif closed 2 years ago

alec-heif commented 2 years ago

We're seeing a regression beginning in version 380 with correctness of complex join predicates.

Here is a minimal repro. The following query should (and does on 379) return the row from the left side, but in 380+ it returns no results.

SELECT lt.*, rt.*
FROM
     (SELECT * 
     FROM (VALUES (1,'foo')) "unused" (col_left_1, col_left_2)
) AS "lt"
LEFT JOIN
     (SELECT * 
     FROM (VALUES (2,'bar')) "unused" (col_right_1, col_right_2)
) AS "rt"
ON lt.col_left_2 = rt.col_right_2
WHERE lt.col_left_1 >= COALESCE(rt.col_right_1, 0)

379 and before:

col_left_1    col_left_2    col_right_1     col_right_2
1             foo           None            None

380 and after:

<no rows>
alec-heif commented 2 years ago

notably, removing the WHERE at the end (or even changing it to something more trivial like WHERE lt. col_left_1 IS NOT NULL) does not reproduce the problem)

raunaqmorarka commented 2 years ago

fyi @sopel39 @skrzypo987 @martint

sopel39 commented 2 years ago

@raunaqmorarka do you have suspect what could cause it?

sopel39 commented 2 years ago

@alec-heif please share full explain

sopel39 commented 2 years ago

Suspect is https://github.com/trinodb/trino/pull/12236. If rule works correctly, then it might be problem with ineq join implementation:

Additionally, optimized execution of inequality join is performed when equi conditions are also present.

which would mean pre-existing issue

alec-heif commented 2 years ago

EXPLAIN v379:

Fragment 0 [SINGLE]
    Output layout: [field, field_0, field_1, field_2]
    Output partitioning: SINGLE []
    Stage Execution Strategy: UNGROUPED_EXECUTION
    Output[col_left_1, col_left_2, col_right_1, col_right_2]
    │   Layout: [field:integer, field_0:varchar(3), field_1:integer, field_2:varchar(3)]
    │   Estimates: {rows: 1 (111B), cpu: 11.48k, memory: 5.39kB, network: 0B}
    │   col_left_1 := field
    │   col_left_2 := field_0
    │   col_right_1 := field_1
    │   col_right_2 := field_2
    └─ Filter[filterPredicate = (""field"" >= COALESCE(""field_1"", 0))]
       │   Layout: [field:integer, field_0:varchar(3), field_1:integer, field_2:varchar(3)]
       │   Estimates: {rows: 1 (111B), cpu: 11.48k, memory: 5.39kB, network: 0B}
       └─ LeftJoin[(""field_0"" = ""field_2"")][$hashvalue, $hashvalue_4]
          │   Layout: [field:integer, field_0:varchar(3), field_1:integer, field_2:varchar(3)]
          │   Estimates: {rows: 2 (153B), cpu: 11.33k, memory: 5.39kB, network: 0B}
          │   Distribution: REPLICATED
          ├─ LocalExchange[ROUND_ROBIN] ()
          │  │   Layout: [field:integer, field_0:varchar(3), $hashvalue:bigint]
          │  │   Estimates: {rows: 1 (69B), cpu: 138, memory: 0B, network: 0B}
          │  └─ Project[]
          │     │   Layout: [field:integer, field_0:varchar(3), $hashvalue_3:bigint]
          │     │   Estimates: {rows: 1 (69B), cpu: 69, memory: 0B, network: 0B}
          │     │   $hashvalue_3 := combine_hash(bigint '0', COALESCE(""$operator$hash_code""(""field_0""), 0))
          │     └─ Values
          │            Layout: [field:integer, field_0:varchar(3)]
          │            Estimates: {rows: 1 (60B), cpu: 0, memory: 0B, network: 0B}
          │            (1, 'foo')
          └─ LocalExchange[HASH][$hashvalue_4] (""field_2"")
             │   Layout: [field_1:integer, field_2:varchar(3), $hashvalue_4:bigint]
             │   Estimates: {rows: 1 (69B), cpu: 276, memory: 0B, network: 0B}
             └─ Filter[filterPredicate = (""field_2"" = 'foo')]
                │   Layout: [field_1:integer, field_2:varchar(3), $hashvalue_5:bigint]
                │   Estimates: {rows: 1 (69B), cpu: 207, memory: 0B, network: 0B}
                └─ LocalExchange[ROUND_ROBIN] ()
                   │   Layout: [field_1:integer, field_2:varchar(3), $hashvalue_5:bigint]
                   │   Estimates: {rows: 1 (69B), cpu: 138, memory: 0B, network: 0B}
                   └─ Project[]
                      │   Layout: [field_1:integer, field_2:varchar(3), $hashvalue_6:bigint]
                      │   Estimates: {rows: 1 (69B), cpu: 69, memory: 0B, network: 0B}
                      │   $hashvalue_6 := combine_hash(bigint '0', COALESCE(""$operator$hash_code""(""field_2""), 0))
                      └─ Values
                             Layout: [field_1:integer, field_2:varchar(3)]
                             Estimates: {rows: 1 (60B), cpu: 0, memory: 0B, network: 0B}
                             (2, 'bar')

EXPLAIN v380:

Fragment 0 [SINGLE]
    Output layout: [field, field_0, field_1, field_2]
    Output partitioning: SINGLE []
    Output[col_left_1, col_left_2, col_right_1, col_right_2]
    │   Layout: [field:integer, field_0:varchar(3), field_1:integer, field_2:varchar(3)]
    │   Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
    │   col_left_1 := field
    │   col_left_2 := field_0
    │   col_right_1 := field_1
    │   col_right_2 := field_2
    └─ CrossJoin
       │   Layout: [field:integer, field_0:varchar(3), field_1:integer, field_2:varchar(3)]
       │   Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
       │   Distribution: REPLICATED
       ├─ LocalExchange[ROUND_ROBIN] ()
       │  │   Layout: [field:integer, field_0:varchar(3)]
       │  │   Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
       │  └─ Values
       │         Layout: [field:integer, field_0:varchar(3)]
       │         Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
       └─ Values
              Layout: [field_1:integer, field_2:varchar(3)]
              Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
martint commented 2 years ago

The bug is in the rule introduced in #12236.

It's incorrect to push the COALESCE(r.col_right_1, 0) expression below the join, since r.col_right_1 might become null as a result of LEFT JOIN semantics.

alec-heif commented 2 years ago

thanks for the super speedy fix! i just tried out 389 and it seems to work, so think we're all good here.