prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
16.08k stars 5.39k forks source link

Exception while trying to insert into partitioned table #9505

Open mcvejic opened 6 years ago

mcvejic commented 6 years ago

When trying to create insert into partitioned table, following error occur from time to time, making inserts unreliable.

com.facebook.presto.spi.PrestoException: Unable to rename from hdfs://host-address:8020/tmp/presto-user/73374885-19bf-4f9b-9d23-6050944815f6/site_id=197/year=2013/month=5
            to hdfs://host-address:8020/user/hive/warehouse/test.db/transaction/site_id=197/year=2013/month=5: target directory already exists
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.renameDirectory(SemiTransactionalHiveMetastore.java:1544)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.access$2500(SemiTransactionalHiveMetastore.java:77)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.prepareAddPartition(SemiTransactionalHiveMetastore.java:980)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.access$700(SemiTransactionalHiveMetastore.java:815)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commitShared(SemiTransactionalHiveMetastore.java:745)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:667)
    at com.facebook.presto.hive.HiveMetadata.commit(HiveMetadata.java:1406)
    at com.facebook.presto.hive.HiveConnector.commit(HiveConnector.java:177)
    at com.facebook.presto.transaction.TransactionManager$TransactionMetadata$ConnectorTransactionMetadata.commit(TransactionManager.java:573)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
    at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
ebyhr commented 6 years ago

@mcvejic Checking this issue now but can't reproduce. Could you share the DDL and INSERT script?

hive> show create table t9595;
OK
CREATE TABLE `t9595`(
  `c1` int)
PARTITIONED BY ( 
  `p1` varchar(8), 
  `p2` varchar(8))
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
presto:default> insert into hive.default.t9595 select 1, 1, 1;
INSERT: 1 row
presto:default> 
presto:default> insert into hive.default.t9595 select 1, 1, 2;
INSERT: 1 row
presto:default> select * from hive.default.t9595;
 c1 | p1 | p2 
----+----+----
  1 |  1 |  1 
  1 |  1 |  2 
(2 rows)
presto:default> show partitions in hive.default.t9595;
 p1 | p2 
----+----
  1 |  1 
  1 |  2 
ordonezf commented 6 years ago

Any news on this? I'm having the same error every now and then.

findepi commented 6 years ago

@ordonezf , please see @ebyhr 's comment above. Could you try to simplify your case and narrow down repro steps for this issue?

ordonezf commented 6 years ago

Sure @findepi, sorry for the delay!

This is what I do:

  1. Drop table A and B, if exists, and create them again in hive
  2. Insert data from Presto into table A
  3. Insert from table A into table B using Presto

This process runs every day and every couple of weeks the insert into table B fails. To fix it I have to enter the hive cli and drop the tables manually.

Here is the error I get:

{'message': 'Unable to rename from s3://path.net/tmp/presto-presto/8917428b-42c2-4042-b9dc-08dd8b9a81bc/ymd=2018-04-08 to s3://path.net/emr/test/B/ymd=2018-04-08: target directory already exists', 'errorCode': 16777231, 'errorName': 'HIVE_PATH_ALREADY_EXISTS', 'errorType': 'EXTERNAL', 'failureInfo': {'type': 'com.facebook.presto.spi.PrestoException', 'message': 'Unable to rename from s3://path.net/tmp/presto-presto/8917428b-42c2-4042-b9dc-08dd8b9a81bc/ymd=2018-04-08 to s3://path.net/emr/test/B/ymd=2018-04-08: target directory already exists', 'suppressed': [], 'stack': ['com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.renameDirectory(SemiTransactionalHiveMetastore.java:1702)', 'com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.access$2700(SemiTransactionalHiveMetastore.java:83)', 'com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.prepareAddPartition(SemiTransactionalHiveMetastore.java:1104)', 'com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.access$700(SemiTransactionalHiveMetastore.java:919)', 'com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commitShared(SemiTransactionalHiveMetastore.java:847)', 'com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:769)', 'com.facebook.presto.hive.HiveMetadata.commit(HiveMetadata.java:1657)', 'com.facebook.presto.hive.HiveConnector.commit(HiveConnector.java:177)', 'com.facebook.presto.transaction.TransactionManager$TransactionMetadata$ConnectorTransactionMetadata.commit(TransactionManager.java:577)', 'java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)', 'com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)', 'com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)', 'com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)', 'io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)', 'java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)', 'java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)', 'java.lang.Thread.run(Thread.java:748)']}}

And here are the creates and inserts:

CREATE TABLE `B`(
  `id` bigint,
  `type` string,
  `created_at` timestamp,
  `status` string)
PARTITIONED BY (
  `ymd` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH SERDEPROPERTIES (
  'field.delim'='\t',
  'serialization.format'='\t')
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  's3://path.net/emr/test/B'
TBLPROPERTIES (
  'transient_lastDdlTime'='1534133136')

This is a simplified version of the insert script:

insert into hive.test.B
    with A_limited as (
    select
        *
    from hive.test.A
    where ymd >= '2018-01-01'
    )
    select
        *
    from A_limited
mirajgodha commented 4 years ago

@ebyhr Here are the exact steps to reproduce the issue:

hive> show create table t9595;
OK
CREATE TABLE `t9595`(
  `c1` int)
PARTITIONED BY ( 
  `p1` varchar(8), 
  `p2` varchar(8))
 ROW FORMAT SERDE                                   
   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'  
 STORED AS INPUTFORMAT                              
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'  
 OUTPUTFORMAT                                       
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 
  TBLPROPERTIES ('transactional'='false')
presto:default> insert into hive.default.t9595 select 1, 1, 1;
INSERT: 1 row
presto:default> 
presto:default> insert into hive.default.t9595 select 1, 1, 2;
INSERT: 1 row
presto:default> select * from hive.default.t9595;
 c1 | p1 | p2 
----+----+----
  1 |  1 |  1 
  1 |  1 |  2 
(2 rows)
presto:default> show partitions in hive.default.t9595;
 p1 | p2 
----+----
  1 |  1 
  1 |  2 

till now it works fine.. Now follow the below steps again.

hive> drop table t9595;
hive> CREATE external TABLE `t9595`(
  `c1` int)
PARTITIONED BY ( 
  `p1` varchar(8), 
  `p2` varchar(8))
 ROW FORMAT SERDE                                   
   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'  
 STORED AS INPUTFORMAT                              
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'  
 OUTPUTFORMAT                                       
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 
  TBLPROPERTIES ('transactional'='false');
presto:siq_dev> insert into t9595 select 1, '1', '1';
INSERT: 1 row

Query 20200413_091825_00078_7q573, FAILED, 3 nodes
Splits: 69 total, 69 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20200413_091825_00078_7q573 failed: Unable to rename from hdfs://siqhdp01/tmp/presto-root/e81b61f2-e69a-42e7-ad1b-47781b378554/p1=1/p2=1 to hdfs://siqhdp01/warehouse/tablespace/external/hive/siq_dev.db/t9595/p1=1/p2=1: target directory already exists

That is, if the old table (external table) is deleted and the folder(s) exists in hdfs for the table and table partitions. And when we recreate the table and try to do insert this error comes.

chrismclennon commented 4 years ago

I am also seeing this issue as described by @mirajgodha

esemeniuc commented 3 years ago

I'm also running into this. This seems to explain the problem as a race condition: https://translate.google.com/translate?hl=en&sl=zh-CN&u=https://www.dazhuanlan.com/2020/02/03/5e3759b8799d3/&prev=search&pto=aue