trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.49k stars 3.02k forks source link

Hive and Iceberg reads timestamp differently with Parquet #17785

Open marcinsbd opened 1 year ago

marcinsbd commented 1 year ago

When we insert timestamp to a Hive table with Parquet file format and migrate the table to Iceberg, the values read by Hive and Iceberg differs.

Hive: It occurs that Hive takes the timestamp and adjust it according to the writer's TZ to get UTC value which then is written down. Hive adds also the writer TZ info to the footer. During read, Hive returns the original value. It reads the timestamp in UTC and adjust it according to the writer's TZ.

Trino uses the following property hive.parquet.time-zone for setting Parquet reader's and Parquet writer's timezone when dealing with Hive. https://github.com/trinodb/trino/blob/0b3b661cb275eeec6c57401f0981a74e775f87fd/plugin/trino-hive/src/main/java/io/trino/plugin/hive/parquet/ParquetFileWriterFactory.java#L84

https://github.com/trinodb/trino/blob/0b3b661cb275eeec6c57401f0981a74e775f87fd/plugin/trino-hive/src/main/java/io/trino/plugin/hive/parquet/ParquetPageSourceFactory.java#L147

Iceberg: Trino always sets the UTC timezone for setting Parquet reader's and Parquet writer's timezone. In this way, it always return UTC values.

https://github.com/trinodb/trino/blob/0b3b661cb275eeec6c57401f0981a74e775f87fd/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java#L1010

The following test scenario shows the case(run within TZ UTC+1):

  1. trino> CREATE TABLE hive.tpch.table1 (x timestamp(3)) with (format='PARQUET' ) ; CREATE TABLE

  2. trino> INSERT INTO hive.tpch.table1 VALUES timestamp '2021-01-01 06:12:12.123'; INSERT: 1 row

  3. trino> SELECT * from hive.tpch.table1; 2021-01-01 06:12:12.123

  4. trino> CALL iceberg.system.migrate('tpch', 'table1');

  5. trino> select * from iceberg.tpch.table1; 2021-01-01 05:12:12.123000

The values returned are different by writer's TZ 2021-01-01 06:12:12.123 <> 2021-01-01 05:12:12.123000

electrum commented 1 year ago

The time zone conversion in the Hive writer happens here: https://github.com/trinodb/trino/blob/e82b604b677069c937f78aa2d202ffc604048106/lib/trino-parquet/src/main/java/io/trino/parquet/writer/valuewriter/Int96TimestampValueWriter.java#L91

electrum commented 1 year ago

We should see how the Iceberg implementation handles this and how they Iceberg community thinks this should behave. My thought is that files with non-UTC INT96 timestamps are not compatible with Iceberg, unless they add a new Iceberg metadata property that tells how to convert it on read.