Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0
2.39k stars 170 forks source link

SQL: left join is missing output columns from right side #3194

Closed universalmind303 closed 2 weeks ago

universalmind303 commented 3 weeks ago

Describe the bug

the output is missing the column t2.id

To Reproduce

table1 = daft.from_pydict({
    "id": [1, 2, 3, 4],
    "value": ["a", "b", "c", "d"]
})

table2 = daft.from_pydict({
    "id": [2, 3, 4, 5],
    "value": ["b", "c", "d", "e"]
})

daft.sql("""
    SELECT *
    FROM table1 t1
    LEFT JOIN table2  t2 ON t1.id = t2.id;
    """).collect()

output

╭───────┬───────┬──────────╮
│ id    ┆ value ┆ t2.value │
│ ---   ┆ ---   ┆ ---      │
│ Int64 ┆ Utf8  ┆ Utf8     │
╞═══════╪═══════╪══════════╡
│ 1     ┆ a     ┆ None     │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2     ┆ b     ┆ b        │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3     ┆ c     ┆ c        │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 4     ┆ d     ┆ d        │
╰───────┴───────┴──────────╯
(Showing first 4 of 4 rows)

Expected behavior

No response

Component(s)

SQL

Additional context

same query in duckdb

D SELECT *  FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.id;
┌───────┬─────────┬───────┬─────────┐
│  id   │  value  │  id   │  value  │
│ int32 │ varchar │ int32 │ varchar │
├───────┼─────────┼───────┼─────────┤
│     2 │ b       │     2 │ b       │
│     3 │ c       │     3 │ c       │
│     4 │ d       │     4 │ d       │
│     1 │ a       │       │         │
└───────┴─────────┴───────┴─────────┘

same query in polars

shape: (4, 4)
┌─────┬───────┬───────┬──────────┐
│ id  ┆ value ┆ id:t2 ┆ value:t2 │
│ --- ┆ ---   ┆ ---   ┆ ---      │
│ i64 ┆ str   ┆ i64   ┆ str      │
╞═════╪═══════╪═══════╪══════════╡
│ 1   ┆ a     ┆ null  ┆ null     │
│ 2   ┆ b     ┆ 2     ┆ b        │
│ 3   ┆ c     ┆ 3     ┆ c        │
│ 4   ┆ d     ┆ 4     ┆ d        │
└─────┴───────┴───────┴──────────┘
universalmind303 commented 2 weeks ago

closed via #3241