Open findinpath opened 12 months ago
@mosabua it would probably be helpful to provide an overview of the existing coercions within the Hive connector
I stumbled by chance on https://docs.dremio.com/current/reference/sql/data-types/coercions/
We recently updated the schema evolution docs after @dain found a bunch missing. We now have a table at https://trino.io/docs/current/connector/hive.html#schema-evolution
If there is more info to add it would be great to get a PR ...
The list of conversions supported in Hive:
/**
* (Rules from Hive's PrimitiveObjectInspectorUtils conversion)
*
* To BOOLEAN, BYTE, SHORT, INT, LONG:
* Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) with down cast if necessary.
* Convert from (FLOAT, DOUBLE) using type cast to long and down cast if necessary.
* Convert from DECIMAL from longValue and down cast if necessary.
* Convert from STRING using LazyLong.parseLong and down cast if necessary.
* Convert from (CHAR, VARCHAR) from Integer.parseLong and down cast if necessary.
* Convert from TIMESTAMP using timestamp getSeconds and down cast if necessary.
*
* AnyIntegerFromAnyIntegerTreeReader (written)
* AnyIntegerFromFloatTreeReader (written)
* AnyIntegerFromDoubleTreeReader (written)
* AnyIntegerFromDecimalTreeReader (written)
* AnyIntegerFromStringGroupTreeReader (written)
* AnyIntegerFromTimestampTreeReader (written)
*
* To FLOAT/DOUBLE:
* Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using cast
* Convert from FLOAT using cast
* Convert from DECIMAL using getDouble
* Convert from (STRING, CHAR, VARCHAR) using Double.parseDouble
* Convert from TIMESTAMP using timestamp getDouble
*
* FloatFromAnyIntegerTreeReader (existing)
* FloatFromDoubleTreeReader (written)
* FloatFromDecimalTreeReader (written)
* FloatFromStringGroupTreeReader (written)
*
* DoubleFromAnyIntegerTreeReader (existing)
* DoubleFromFloatTreeReader (existing)
* DoubleFromDecimalTreeReader (written)
* DoubleFromStringGroupTreeReader (written)
*
* To DECIMAL:
* Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using to HiveDecimal.create()
* Convert from (FLOAT, DOUBLE) using to HiveDecimal.create(string value)
* Convert from (STRING, CHAR, VARCHAR) using HiveDecimal.create(string value)
* Convert from TIMESTAMP using HiveDecimal.create(string value of timestamp getDouble)
*
* DecimalFromAnyIntegerTreeReader (existing)
* DecimalFromFloatTreeReader (existing)
* DecimalFromDoubleTreeReader (existing)
* DecimalFromStringGroupTreeReader (written)
*
* To STRING, CHAR, VARCHAR:
* Convert from (BYTE, SHORT, INT, LONG) using to string conversion
* Convert from BOOLEAN using boolean (True/False) conversion
* Convert from (FLOAT, DOUBLE) using to string conversion
* Convert from DECIMAL using HiveDecimal.toString
* Convert from CHAR by stripping pads
* Convert from VARCHAR with value
* Convert from TIMESTAMP using Timestamp.toString
* Convert from DATE using Date.toString
* Convert from BINARY using Text.decode
*
* StringGroupFromAnyIntegerTreeReader (written)
* StringGroupFromBooleanTreeReader (written)
* StringGroupFromFloatTreeReader (written)
* StringGroupFromDoubleTreeReader (written)
* StringGroupFromDecimalTreeReader (written)
*
* String from Char/Varchar conversion
* Char from String/Varchar conversion
* Varchar from String/Char conversion
*
* StringGroupFromTimestampTreeReader (written)
* StringGroupFromDateTreeReader (written)
* StringGroupFromBinaryTreeReader *****
*
* To TIMESTAMP:
* Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using TimestampWritable.longToTimestamp
* Convert from (FLOAT, DOUBLE) using TimestampWritable.doubleToTimestamp
* Convert from DECIMAL using TimestampWritable.decimalToTimestamp
* Convert from (STRING, CHAR, VARCHAR) using string conversion
* Or, from DATE
*
* TimestampFromAnyIntegerTreeReader (written)
* TimestampFromFloatTreeReader (written)
* TimestampFromDoubleTreeReader (written)
* TimestampFromDecimalTreeReader (written)
* TimestampFromStringGroupTreeReader (written)
* TimestampFromDateTreeReader
*
*
* To DATE:
* Convert from (STRING, CHAR, VARCHAR) using string conversion.
* Or, from TIMESTAMP.
*
* DateFromStringGroupTreeReader (written)
* DateFromTimestampTreeReader (written)
*
* To BINARY:
* Convert from (STRING, CHAR, VARCHAR) using getBinaryFromText
*
* BinaryFromStringGroupTreeReader (written)
*
* (Notes from StructConverter)
*
* To STRUCT:
* Input must be data type STRUCT
* minFields = Math.min(numSourceFields, numTargetFields)
* Convert those fields
* Extra targetFields to NULL
*
* (Notes from ListConverter)
*
* To LIST:
* Input must be data type LIST
* Convert elements
*
* (Notes from MapConverter)
*
* To MAP:
* Input must be data type MAP
* Convert keys and values
*
* (Notes from UnionConverter)
*
* To UNION:
* Input must be data type UNION
* Convert value for tag
*/
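For anyone skimming the comment above, here is a minimal Java sketch of two of the listed rules: the "type cast to long and down cast" rule for FLOAT/DOUBLE to integer types, and the STRUCT rule (convert the first `minFields` fields, set extra target fields to NULL). All names here (`CoercionSketch`, `doubleToInt`, `coerceStruct`) are hypothetical and are not Hive or Trino APIs. The quoted comment does not say what happens when a value does not fit the target type, so this sketch assumes plain Java cast semantics, which may differ from what Hive or Trino actually do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative only: hypothetical names, not Hive or Trino code.
final class CoercionSketch {
    private CoercionSketch() {}

    // "Convert from (FLOAT, DOUBLE) using type cast to long and down cast if necessary."
    // Assumption: plain Java casts, so out-of-range values wrap silently.
    static int doubleToInt(double value) {
        long asLong = (long) value; // truncates toward zero
        return (int) asLong;        // down cast keeps the low-order 32 bits
    }

    // "To STRUCT": convert the first min(numSourceFields, numTargetFields) fields
    // and fill any extra target fields with NULL.
    static List<Object> coerceStruct(
            List<Object> sourceFields,
            List<Function<Object, Object>> targetFieldConverters) {
        int minFields = Math.min(sourceFields.size(), targetFieldConverters.size());
        List<Object> result = new ArrayList<>(targetFieldConverters.size());
        for (int i = 0; i < targetFieldConverters.size(); i++) {
            Object value = i < minFields ? sourceFields.get(i) : null;
            // Null source values stay null; only non-null values are converted.
            result.add(value == null ? null : targetFieldConverters.get(i).apply(value));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(doubleToInt(3.9));

        // Source struct has two fields, target struct has three.
        List<Object> source = Arrays.<Object>asList(1, "2");
        List<Function<Object, Object>> converters =
                Arrays.<Function<Object, Object>>asList(
                        o -> o.toString(),                 // INT -> STRING
                        o -> Integer.parseInt((String) o), // STRING -> INT
                        o -> o);                           // extra target field
        System.out.println(coerceStruct(source, converters));
    }
}
```

The struct helper drops trailing source fields that have no matching target field, which is what the `minFields` rule implies when the source has more fields than the target.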
Are there issues for all of the missing items on that list now?
This issue acts as an :open_umbrella: for the coercions that are done, or are supposed to get done, in Hive. It is meant to give an overview of the efforts made so far to support coercion scenarios.