trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.37k stars 2.98k forks source link

There is no way to find any data when hive table's partition key type is integer #6598

Open allendang001 opened 3 years ago

allendang001 commented 3 years ago

The problem is that we use presto to analyze a hive table whose partition key type is integer, with 'hive.metastore.thrift.assume-canonical-partition-keys' default false. Presto reported that the partitions did not exist, but in fact, we can use the "show partitions" statement to view the partition.After change this parameter to true, the data of this partition can be queried normally.

findepi commented 3 years ago

Can you please provide the steps to reproduce, i.e. what commands should one execute in hive and in Trino to reproduce the problem?

allendang001 commented 3 years ago

Can you please provide the steps to reproduce, i.e. what commands should one execute in hive and in Trino to reproduce the problem?

hive statement:

CREATE EXTERNAL TABLE `test_partition`(                         
   `name` string,                                                  
   `salary` int)                                                   
 PARTITIONED BY (                                                  
   `age` int)                                                      
 ROW FORMAT SERDE                                                  
   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'   
 STORED AS INPUTFORMAT                                             
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
 OUTPUTFORMAT                                                      
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
 LOCATION                                                          
   'file://tmp/test'                         
 TBLPROPERTIES (                                                   
   'transient_lastDdlTime'='1608723685587');
insert into test_partition values('test',400,200);
  1. with 'hive.metastore.thrift.assume-canonical-partition-keys' default false

trino statement: select * from test_partition where age = 200;

exception : Table 'aa.test_partition' was dropped by another transaction

  1. with 'hive.metastore.thrift.assume-canonical-partition-keys' default true

trino statement: select * from test_partition where age = 200;

no exception

findepi commented 3 years ago

@allendang001 for completeness, which hive version are you using? don't you need some SET statement in hive to enable dynamic partition insert?

allendang001 commented 3 years ago

@allendang001 for completeness, which hive version are you using? don't you need some SET statement in hive to enable dynamic partition insert?

hive version : 2.3.5 hive.exec.dynamic.partition = true