Open ygf11 opened 1 year ago
This is because the cast
expression is ignored in the column name, which is an expected behaviour, I guess.
Yes, cast expression is ignored in the column name.
I think the other reason is our projection does not support duplicated items which seems PostgreSQL and spark both support.
spark-sql> select a,a,a from test0;
a a a
1 1 1
Time taken: 0.134 seconds, Fetched 1 row(s)
> select c0,c0,c0,c0 from test0 limit 3;
c0 | c0 | c0 | c0
-------+-------+-------+-------
| | |
19470 | 19470 | 19470 | 19470
22111 | 22111 | 22111 | 22111
(3 rows)
(1 row)
❯ select c0, c0, c0 from test0;
SchemaError(FieldNotFound { field: Column { relation: None, name: "c0" }, valid_fields: Some([Column { relation: Some("test0"), name: "a" }, Column { relation: Some("test0"), name: "b" }, Column { relation: Some("test0"), name: "c" }]) })
This is because the
cast
expression is ignored in the column name, which is an expected behaviour, I guess.
Why it is the expected behavior ?
I think SparkSQL does not have a consistent behavior. See example below:
SparkSQL 3.x Thrift Server JDBC
CREATE TEMPORARY TABLE t1 (t1_id INT,t1_name String);
select t1_id, cast(t1_id as decimal(10, 0)), t1_id + cast(t1_id as decimal(10, 0)) from t1;
Result:
+-----+-----+-------------------------------------------------------------------+ | t1_id | t1_id | (CAST(t1_id AS DECIMAL(10,0)) + CAST(t1_id AS DECIMAL(10,0))) | +-----+-----+-------------------------------------------------------------------+
+-----+-----+-------------------------------------------------------------------+
Describe the bug When I want to project column and the result of its type coercion, datafusion will throw error:
To Reproduce Run projection with expression and its cast-wrapped expression.
Expected behavior
Additional context