Teradata / presto

Teradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data
http://www.teradata.com/presto
Apache License 2.0

Clustered Hive tables support? #716

Open zzmg opened 7 years ago

zzmg commented 7 years ago

I installed presto_server_pkg.0.167-t.0.2 on CentOS 7.2. I installed presto-admin following the documentation, and the coordinator and workers run on the same machine.

  1. Data source: Flume --> Hive. Hive table:

         create table test (
           bytes_in int,
           bytes_out int,
           device_id string,
           device_type string,
           host string,
           latency int,
           level string,
           method string,
           msg string,
           path string,
           referer string,
           remote_ip string,
           response_code int,
           route string,
           status int,
           type string,
           uri string,
           user_agent string,
           user_id bigint,
           time string
         )
         PARTITIONED BY (year string, month string, day string)
         clustered by (user_id) into 5 buckets
         stored as orc
         TBLPROPERTIES ("transactional"="true");

  2. HDFS path: /user/hive/warehouse/logs.db/test/year=2017/month=10/day=26/delta_0018401_0018500/bucket_0000

     Following the documentation, I set the hive.multi-file-bucketing.enabled config property in hive.properties and also set the session property:

         presto:logs> set session hive.multi_file_bucketing_enabled = true;
         SET SESSION

     However, querying the table through Presto returns an error:

         presto:logs> select count() from test;
         Query 20171026_065023_00034_4pi6j failed: Hive table is corrupt. It is declared as being bucketed, but the files do not match the bucketing declaration. Found sub-directory in bucket directory for partition: year=2017/month=10/day=26
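The error message complains about a sub-directory inside the partition directory, and the HDFS path above does contain one (delta_0018401_0018500), which may be what the check is reacting to. A minimal sketch of that kind of layout check, in Python purely for illustration: `find_subdirectories` is a hypothetical helper and not Presto code, and the file listing is hard-coded from the path above rather than read from HDFS.

```python
from pathlib import PurePosixPath


def find_subdirectories(partition_dir, file_paths):
    """Return names of sub-directories directly under partition_dir that hold files.

    Hypothetical helper for illustration only; it is not how Presto
    implements its bucketing validation.
    """
    root = PurePosixPath(partition_dir)
    subdirs = set()
    for path in file_paths:
        relative = PurePosixPath(path).relative_to(root)
        # A file stored directly in the partition has exactly one path
        # component; more components mean it sits in a sub-directory.
        if len(relative.parts) > 1:
            subdirs.add(relative.parts[0])
    return sorted(subdirs)


# Partition directory and bucket file taken from the HDFS path in this issue.
partition = "/user/hive/warehouse/logs.db/test/year=2017/month=10/day=26"
files = [partition + "/delta_0018401_0018500/bucket_0000"]

print(find_subdirectories(partition, files))
```

For the path in this issue the helper reports the delta directory, so the bucket file is one level deeper than the partition directory itself.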

The release notes at https://teradata.github.io/presto/docs/current/release/release-0.167-t.html say: Fix issue “Hive table is corrupt. It is declared as being bucketed, but the files do not match the bucketing declaration. The number of files in the directory (1) does not match the declared.” by fixing support for Hive bucketed tables. See option hive.multi-file-bucketing.enabled in the Presto Hive connector documentation.

But the query still fails. Can anyone help me?