apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] PrestoDB failed to query data from MOR table #8078

Open JoshuaZhuCN opened 1 year ago

JoshuaZhuCN commented 1 year ago

PrestoDB throws the following error when querying a TIMESTAMP column of a MOR table in snapshot mode:

class org.apache.hadoop.io.LongWritable cannot be cast to class org.apache.hadoop.hive.serde2.io.TimestampWritable (org.apache.hadoop.io.LongWritable and org.apache.hadoop.hive.serde2.io.TimestampWritable are in unnamed module of loader com.facebook.presto.server.PluginClassLoader @648ec9f4)
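One plausible reading of this cast error, offered as an assumption rather than a confirmed diagnosis: the snapshot reader hands the TIMESTAMP column to Presto as a raw 64-bit integer (`LongWritable`) where a timestamp object (`TimestampWritable`) is expected. Hudi usually stores Spark TIMESTAMP values as microseconds since the Unix epoch, so the long most likely encodes the instant itself. The sketch below is Python purely for illustration and shows what such a value would represent. The function name `micros_to_timestamp` and the sample value are hypothetical and are not taken from Hudi or Presto.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_timestamp(epoch_micros: int) -> datetime:
    """Interpret a raw long as microseconds since the Unix epoch (UTC).

    Assumption: the column uses microsecond precision. If the writer used
    milliseconds instead, this conversion would be off by a factor of 1000.
    """
    return EPOCH + timedelta(microseconds=epoch_micros)


# Hypothetical raw value of `sync_time` as it might appear in a log file.
raw_value = 1_677_628_800_000_000
print(micros_to_timestamp(raw_value).isoformat())
```

This only illustrates the mismatch between the stored representation and the type the query engine expects. It is not a workaround for the failing query.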

To Reproduce

Steps to reproduce the behavior:

1.

drop table if exists `default`.`spark_0_12_1_test2`;

2.

CREATE TABLE `default`.`spark_0_12_1_test2` (
     `id` INT
    ,`name` string
    ,`age` INT
    ,`sync_time` TIMESTAMP
) USING HUDI
TBLPROPERTIES (
     `type` = 'mor'
    ,`primaryKey` = 'id'
    ,`preCombineField` = 'sync_time'
    ,`hoodie.bucket.index.hash.field` = 'id'
    ,`hoodie.index.type` = 'BLOOM'
    ,`hoodie.datasource.write.hive_style_partitioning` = 'false'
    ,`hoodie.compaction.payload.class` = 'org.apache.hudi.common.model.OverwriteWithLatestAvroPayload'
    ,`hoodie.datasource.hive_sync.support_timestamp` = 'true'
)
COMMENT 'test_0.12.1';

3.

INSERT INTO `default`.`spark_0_12_1_test2`(id, name, age, sync_time) SELECT 2 AS id, 'hudi0121' AS name, 22 AS age, NOW() AS sync_time;

update `default`.`spark_0_12_1_test2` set age = 33, sync_time = now() where id = 2;

4.

Query with PrestoDB:

select * from hudi."default".spark_0_12_1_test2;

Environment Description

danny0405 commented 1 year ago

@codope Can you take a look at this issue?

codope commented 1 year ago

This is a known issue. See https://github.com/apache/hudi/issues/7724 and https://github.com/apache/hudi/issues/2869. We have a PR under review: https://github.com/apache/hudi/pull/3391

ChandrasekharPo-Kore commented 1 year ago

Any progress on this one? The query works on [table]_ro but fails with this error on [table]_rt. This is forcing us to use a COW table.