Open findepi opened 2 days ago
allow creation of schemas for duplicated names
I think it's a good idea, as it allows the schema of the top plan to include duplicate names, thereby resolving #6543.
We can delay the name ambiguity check until a real column reference occurs. But currently, it seems that this check is not sufficient. For example
DataFusion CLI v43.0.0
> select t.a from (select 1 as a, 2 as a) t;
+---+
| a |
+---+
| 1 |
+---+
1 row(s) fetched.
This query did not return an error like it does in PostgreSQL and before. Perhaps we should improve ambiguity check when searching for field names from schemas after removing check_names
.
We can delay the name ambiguity check until a real column reference occurs. But currently, it seems that this check is not sufficient. For example
DataFusion CLI v43.0.0 > select t.a from (select 1 as a, 2 as a) t;
Good catch. This is easy to solve.
The less easy part is that
select * from (select 1 as a, 2 as a) t;
should work. However, the *
gets expanded to Expr expressions and these expressions have no way to differentiate between the two columns from a
. This is because schema is used for both initial query analysis as well as in logical plans. Relates to https://github.com/apache/datafusion/issues/1468.
The less easy part is that
select * from (select 1 as a, 2 as a) t;
should work. However, the
*
gets expanded to Expr expressions and these expressions have no way to differentiate between the two columns froma
. This is because schema is used for both initial query analysis as well as in logical plans. Relates to #1468.
We might need to introduce column index to differentiate them. Since this case was not previously supported either, maybe we can handle it later.
Which issue does this PR close?
Rationale for this change
In SQL, selecting single column multiple times is legal and most modern databases support this. This commit adds such support to DataFusion too.
What changes are included in this PR?
Are these changes tested?
yes
Are there any user-facing changes?
yes, more valid queries are supported