trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Failure in reading decimal and timestamp from PARQUET (written by Trino) on hive with `experimental_parquet_optimized_writer_enabled` set to `true` #10486

Closed. findinpath closed this issue 2 years ago.

findinpath commented 2 years ago

It seems that there is an issue in the trino-parquet module when writing data.

Failing build:

https://github.com/trinodb/trino/runs/4723741729?check_suite_focus=true

Failing test:

tests               | io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://hadoop-master:9000/user/hive/warehouse/storage_formats_compatibility_data_types_parquet/20220106_061658_00036_k9xgj_7793c167-20fe-4d65-9f90-66ca00a22244
tests               |   at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
tests               |   at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
tests               |   at io.trino.tests.product.hive.TestHiveStorageFormats.testInsertAllSupportedDataTypesWithTrino(TestHiveStorageFormats.java:632)
tests               |   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
tests               |   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
tests               |   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
tests               |   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
tests               |   at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
tests               |   at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
tests               |   at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
tests               |   at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
tests               |   at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
tests               |   at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
tests               |   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
tests               |   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
tests               |   at java.base/java.lang.Thread.run(Thread.java:829)
tests               | Caused by: java.sql.SQLException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://hadoop-master:9000/user/hive/warehouse/storage_formats_compatibility_data_types_parquet/20220106_061658_00036_k9xgj_7793c167-20fe-4d65-9f90-66ca00a22244
tests               |   at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
tests               |   at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
tests               |   at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:330)
tests               |   at io.trino.tempto.query.QueryResult$QueryResultBuilder.addRows(QueryResult.java:243)
tests               |   at io.trino.tempto.query.QueryResult.forResultSet(QueryResult.java:192)
tests               |   at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:129)
tests               |   at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:112)
tests               |   ... 15 more
tests               |   Suppressed: java.lang.Exception: Query: SELECT c_boolean, c_tinyint, c_smallint, c_int, c_bigint, c_real, c_double, c_decimal_10_0, c_decimal_10_2, c_decimal_38_5, c_char, c_varchar, c_string, c_binary, c_date, c_timestamp FROM storage_formats_compatibility_data_types_parquet
tests               |       at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:136)
tests               |       ... 16 more

Reproduction scenario:

testing/bin/ptl env up --environment 'singlenode' --config 'config-hdp3'

Run the test io.trino.tests.product.hive.TestHiveStorageFormats#testInsertAllSupportedDataTypesWithTrino

Executing the following query on Hive:

SELECT c_decimal_10_0 FROM storage_formats_compatibility_data_types_parquet

causes the issue.

However, executing the query

SELECT c_decimal_38_5 FROM storage_formats_compatibility_data_types_parquet

works just fine.

In the test, the data is written with Trino and read with Hive. This suggests a regression in the trino-parquet write path.

Selecting the c_timestamp field on Hive also throws an exception:

 java.sql.SQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritableV2
    at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
    at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
    at io.trino.tests.product.hive.TestHiveStorageFormats.testInsertAllSupportedDataTypesWithTrino(TestHiveStorageFormats.java:632)
findinpath commented 2 years ago

@alexjo2144 , @findepi FYI

hashhar commented 2 years ago

See https://github.com/trinodb/trino/issues/7953 (and the linked issue and PR).

EDIT: Seems unrelated since an existing test fails.

hashhar commented 2 years ago

Marking as a release blocker since it's a regression and we'll end up disabling the test to unblock master while a fix is being made - https://github.com/trinodb/trino/pull/10487

martint commented 2 years ago

If this is just an issue in the experimental writer, I might not block the release on it, especially since we need to release soon to address a more serious issue with how log rotation is being performed.

hashhar commented 2 years ago

I'm fine either way since IIUC this is a pre-existing bug that was uncovered by new tests. @findinpath can correct me if my understanding is wrong.

findinpath commented 2 years ago

The PR containing the test code used to discover this issue:

#10309

The issue of Hive on HDP3 incorrectly reading data written as Parquet must be an older one. I just realized that I did not discover it while writing the tests because I was using onTrino(), while TestHiveStorageFormats was setting the session properties via defaultQueryExecutor. As a result, the session property hive.experimental_parquet_optimized_writer_enabled was never actually set in the Trino session, and the test ran successfully. So this is indeed not a regression, but rather an issue that existed before the test was written. A sketch of the mismatch follows.
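For illustration, a hedged sketch of that mismatch, not the actual test code; it assumes the product-test helpers setSessionProperty from JdbcDriverUtils, defaultQueryExecutor() from tempto, and onTrino() from QueryExecutors, and an abbreviated INSERT statement:

```java
import static io.trino.tempto.query.QueryExecutor.defaultQueryExecutor;
import static io.trino.tests.product.utils.JdbcDriverUtils.setSessionProperty;
import static io.trino.tests.product.utils.QueryExecutors.onTrino;

public class SessionPropertyMismatchSketch
{
    public static void main(String[] args)
            throws Exception
    {
        // The test applied the property to the connection behind the
        // default query executor ...
        setSessionProperty(
                defaultQueryExecutor().getConnection(),
                "hive.experimental_parquet_optimized_writer_enabled",
                "true");

        // ... but issued the writes through a different executor, onTrino(),
        // whose session never received the property. The legacy Parquet
        // writer was therefore used, and the Hive-side read appeared to pass.
        onTrino().executeQuery(
                "INSERT INTO storage_formats_compatibility_data_types_parquet (c_int) VALUES (42)");
    }
}
```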

findinpath commented 2 years ago

Decoding of decimal values in Hive fails with:

Caused by: java.sql.SQLException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://hadoop-master:9000/user/hive/warehouse/stora....

Hadoop 3 is based on parquet 1.10.x, while Hadoop 2 is based on parquet 1.8.1. As of parquet 1.10.x, decimals are read differently in the parquet-mr library. See https://issues.apache.org/jira/browse/PARQUET-884 and https://github.com/apache/parquet-mr/pull/404 for details.

Possible fix: instead of writing integer or long for small decimals, always write FIXED_LEN_BYTE_ARRAY. This approach has the downside of using more space than needed, but it is consistent with both the Hadoop 2 and Hadoop 3 Hive readers.
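A minimal sketch of that encoding in plain Java (class and method names are mine, not trino-parquet code): the unscaled value goes into a fixed-length, big-endian, two's-complement array, sign-extended on the left, regardless of precision:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

public class DecimalFixedLenSketch
{
    // Encode the unscaled value into exactly `length` bytes,
    // big-endian two's complement, sign-extended on the left.
    static byte[] toFixedLenByteArray(BigDecimal value, int length)
    {
        BigInteger unscaled = value.unscaledValue();
        byte[] minimal = unscaled.toByteArray(); // minimal big-endian two's complement
        if (minimal.length > length) {
            throw new IllegalArgumentException("unscaled value does not fit in " + length + " bytes");
        }
        byte[] result = new byte[length];
        // Sign-extend: fill the leading bytes with 0x00 or 0xFF.
        byte pad = (byte) (unscaled.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(result, 0, length - minimal.length, pad);
        System.arraycopy(minimal, 0, result, length - minimal.length, minimal.length);
        return result;
    }

    public static void main(String[] args)
    {
        // decimal(10,0): the spec would allow int32/int64, but 5 bytes
        // (floor(log10(2^39 - 1)) = 11 >= 10 digits) are readable by both
        // the parquet 1.8.x and 1.10.x Hive readers.
        System.out.println(Arrays.toString(toFixedLenByteArray(new BigDecimal("1234567890"), 5)));
        System.out.println(Arrays.toString(toFixedLenByteArray(new BigDecimal("-1"), 5)));
    }
}
```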

However, the experimental Parquet writer follows the Parquet specification:

https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal

DECIMAL can be used to annotate the following types:

- int32: for 1 <= precision <= 9
- int64: for 1 <= precision <= 18; precision < 10 will produce a warning
- fixed_len_byte_array: precision is limited by the array size. Length n can store <= floor(log_10(2^(8*n - 1) - 1)) base-10 digits
- binary: precision is not limited, but is required. The minimum number of bytes to store the unscaled value should be used.
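As a worked example of the fixed_len_byte_array rule above, a small plain-Java sketch (helper names are mine) that inverts the floor(log_10(2^(8*n - 1) - 1)) formula to find the smallest byte length covering a given precision:

```java
import java.math.BigInteger;

public class FixedLenSizingSketch
{
    // Max base-10 digits storable in numBytes bytes: floor(log10(2^(8*numBytes - 1) - 1)).
    static int maxPrecision(int numBytes)
    {
        return BigInteger.ONE.shiftLeft(8 * numBytes - 1)
                .subtract(BigInteger.ONE)
                .toString().length() - 1;
    }

    // Smallest FIXED_LEN_BYTE_ARRAY length whose digit capacity covers the precision.
    static int minFixedLenBytes(int precision)
    {
        int numBytes = 1;
        while (maxPrecision(numBytes) < precision) {
            numBytes++;
        }
        return numBytes;
    }

    public static void main(String[] args)
    {
        System.out.println(minFixedLenBytes(10)); // 5  (covers decimal(10,0))
        System.out.println(minFixedLenBytes(38)); // 16 (covers decimal(38,5))
    }
}
```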
hashhar commented 2 years ago

IIRC Spark has issues with the byte array encoding. See https://stackoverflow.com/questions/63578928/spark-unable-to-read-decimal-columns-in-parquet-files-written-by-avroparquetwrit and https://github.com/sksamuel/avro4s/issues/271

findepi commented 2 years ago

Closing this in favor of the older issue on the topic: https://github.com/trinodb/trino/issues/6377