linkedin / coral

Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
BSD 2-Clause "Simplified" License

Incorrect schema for views with unnamed columns in the view SQL #212

Open shardulm94 opened 2 years ago

shardulm94 commented 2 years ago

If we create a Hive view with columns that are not explicitly named in the SQL, Hive will autogenerate column names for such columns when persisting the view to the Metastore. For example, consider the following view:

hive> CREATE VIEW v AS SELECT TRUE, lower('STR') FROM some_table;

Here is the persisted information in the metastore

hive> DESC FORMATTED v;
OK
# col_name              data_type               comment

_c0                     boolean
_c1                     string
.
.
.
# View Information
View Original Text:     SELECT TRUE, lower('STR') FROM some_table
View Expanded Text:     SELECT TRUE, lower('STR') FROM `some_table`

Retrieving the view schema for this view using coral-schema returns EXPR_0 and EXPR_1 as column names, which I assume are autogenerated by Calcite. I think we should respect the column names provided by Hive here, i.e. _c0 and _c1, since they have been persisted as column metadata in the metastore.
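To make the expected behavior concrete, here is a minimal sketch of positional name alignment. It is not Coral's code: the function name `align_with_metastore` and its inputs are hypothetical, and it assumes the columns derived from the view SQL appear in the same order as the columns persisted in the metastore.

```python
from typing import Dict, List, Sequence, Tuple


def align_with_metastore(
    derived_names: Sequence[str],
    metastore_names: Sequence[str],
) -> Tuple[List[str], Dict[str, str]]:
    """Prefer the column names persisted in the metastore over derived ones.

    Hypothetical helper for illustration only. Columns are matched by
    position, so both sequences must have the same length.
    Returns the final column names and a map of the renames applied.
    """
    if len(derived_names) != len(metastore_names):
        raise ValueError(
            "column count mismatch: "
            f"{len(derived_names)} derived vs {len(metastore_names)} persisted"
        )
    # Record only the columns whose derived name differs from the persisted one.
    renames = {
        derived: persisted
        for derived, persisted in zip(derived_names, metastore_names)
        if derived != persisted
    }
    return list(metastore_names), renames


# Names taken from the example view above.
names, renames = align_with_metastore(["EXPR_0", "EXPR_1"], ["_c0", "_c1"])
print(names)    # ['_c0', '_c1']
print(renames)  # {'EXPR_0': '_c0', 'EXPR_1': '_c1'}
```

Columns that are explicitly aliased in the view SQL already match the metastore, so they would pass through without a rename.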

ljfgem commented 2 years ago

Thanks @shardulm94 for reporting this. I opened PR #214 to fix it.