apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
6.36k stars 1.2k forks source link

`JOIN` should require `ON` condition #13486

Open findepi opened 5 days ago

findepi commented 5 days ago

Describe the bug

SELECT ... FROM l JOIN r is not a valid SQL query, the ON condition should not be optional

To Reproduce

DataFusion CLI v43.0.0
> SELECT * FROM (SELECT 1 AS id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r;
+----+------+----+------+
| id | name | id | name |
+----+------+----+------+
| 1  | Foo  | 1  | Bar  |
+----+------+----+------+
1 row(s) fetched.

Expected behavior

The query should be rejected, the ON condition be requiered. In fact, omitting it might be typo with dangerous consequences (expensive cross join).

Trino

trino> SELECT * FROM (SELECT 1 AS id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r;
Query 20241119_124051_00000_rykay failed: line 1:87: mismatched input '<EOF>'. Expecting: 'ON', 'USING'
SELECT * FROM (SELECT 1 AS id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r <EOF>

PostgresSQL

postgres=# SELECT * FROM (SELECT 1 AS id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r;
ERROR:  syntax error at or near ";"
LINE 1: ...id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r;

DuckDB

D SELECT * FROM (SELECT 1 AS id, 'Foo' AS name) l JOIN (SELECT 1 AS id, 'Bar' as name) r;
Parser Error: syntax error at or near ";"
LINE 1: ... JOIN (SELECT 1 AS id, 'Bar' as name) r;
                                                  ^

Additional context

No response

Dandandan commented 5 days ago

The condition is also required in the SQL specification here (for qualified joins). https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#join-specification so I think this should be changed / fixed in sqlparser-rs.

findepi commented 4 days ago

Thanks @Dandandan for checking the SQL standard too!

demetribu commented 8 hours ago

take