uber / marmaray

Generic Data Ingestion & Dispersal Library for Hadoop
https://eng.uber.com/marmaray-hadoop-ingestion-open-source/

Unable to use hudi sink #22

Open ankitdimania-eligible opened 5 years ago

ankitdimania-eligible commented 5 years ago

I'm trying to use the Hudi sink, but the write to the Hudi client is not working. Has anyone seen this issue before? The log output is below:

2019-10-08 20:29:46 INFO  HoodieTableConfig:69 - Loading dataset properties from /tmp/hoodie/payment_reports_18/.hoodie/hoodie.properties
2019-10-08 20:29:46 INFO  HoodieTableMetaClient:94 - Finished Loading Table of type COPY_ON_WRITE from /tmp/hoodie/payment_reports_18
2019-10-08 20:29:46 INFO  HoodieTableMetaClient:96 - Loading Active commit timeline for /tmp/hoodie/payment_reports_18
2019-10-08 20:29:46 INFO  HoodieActiveTimeline:77 - Loaded instants java.util.stream.ReferencePipeline$Head@1a850eab
2019-10-08 20:29:46 INFO  HoodieActiveTimeline:212 - Marking instant complete [==>20191008202942__commit__INFLIGHT]
2019-10-08 20:29:46 INFO  HoodieActiveTimeline:377 - Created a new file in meta path: /tmp/hoodie/payment_reports_18/.hoodie/20191008202942.inflight
2019-10-08 20:29:46 ERROR JobDag:188 - Failed in JobDag
com.uber.hoodie.exception.HoodieIOException: Could not rename /tmp/hoodie/payment_reports_18/.hoodie/20191008202942.inflight to /tmp/hoodie/payment_reports_18/.hoodie/20191008202942.commit
        at com.uber.hoodie.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:330)
        at com.uber.hoodie.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:215)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:518)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:491)
        at com.uber.marmaray.common.sinks.hoodie.HoodieSink$HoodieWriteClientWrapper.commit(HoodieSink.java:489)
        at com.uber.marmaray.common.sinks.hoodie.HoodieSink.commit(HoodieSink.java:290)
        at com.uber.marmaray.common.sinks.hoodie.HoodieSink.commit(HoodieSink.java:267)
        at com.uber.marmaray.common.sinks.hoodie.HoodieSink.write(HoodieSink.java:184)
        at com.uber.marmaray.common.sinks.hoodie.HoodieSink.write(HoodieSink.java:161)
        at com.uber.marmaray.common.job.SingleSinkSubDag.executeNode(SingleSinkSubDag.java:51)
        at com.uber.marmaray.common.job.JobSubDag.execute(JobSubDag.java:149)
        at com.uber.marmaray.common.job.JobDag.execute(JobDag.java:171)
        at com.uber.marmaray.common.job.JobManager.lambda$null$0(JobManager.java:207)
        at com.uber.marmaray.common.job.ThreadPoolService$ThreadPoolServiceCallable.call(ThreadPoolService.java:415)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

tooptoop4 commented 4 years ago

@ankitdimania-eligible did you solve this? I get a similar error:

2020-02-10 18:04:24,707 [main] INFO com.uber.hoodie.common.table.timeline.HoodieActiveTimeline - Marking instant complete [==>20200210180348__commit__INFLIGHT]
Exception in thread "main" com.uber.hoodie.exception.HoodieIOException: Could not rename s3a://x/.hoodie/20200210180348.inflight to s3a://x/.hoodie/20200210180348.commit
        at com.uber.hoodie.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:328)
        at com.uber.hoodie.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:213)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:529)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:489)
        at com.uber.hoodie.HoodieWriteClient.commit(HoodieWriteClient.java:480)

ankitdimania-eligible commented 4 years ago

@tooptoop4 Yes, the issue above was caused by double commits. I had auto commit set to true and was then mistakenly trying to commit again.
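
For anyone hitting the same error, here is a rough, self-contained sketch of why a second commit on the same instant can fail at the rename step. It does not use Hudi at all: `DoubleCommitSketch` and `markComplete` are hypothetical names, and the behaviour is simulated on the local filesystem with `java.nio.file`. It assumes, based on the "Created a new file in meta path" log line above, that the inflight marker is rewritten before each rename, so the second rename fails because the `.commit` file already exists.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DoubleCommitSketch {

    /** Hypothetical stand-in for completing an instant: write the inflight marker, then rename it. */
    static void markComplete(Path metaDir, String instant) throws IOException {
        Path inflight = metaDir.resolve(instant + ".inflight");
        Path commit = metaDir.resolve(instant + ".commit");
        // (Re)create the inflight marker, as the log output suggests happens on every attempt.
        Files.write(inflight, new byte[0]);
        // Without REPLACE_EXISTING, Files.move throws FileAlreadyExistsException if the target exists.
        Files.move(inflight, commit);
    }

    /** Returns true if a second commit attempt on the same instant fails. */
    static boolean secondCommitFails() throws IOException {
        Path metaDir = Files.createTempDirectory("hoodie-sketch");
        String instant = "20191008202942";

        markComplete(metaDir, instant); // first commit, e.g. the automatic one, succeeds
        try {
            markComplete(metaDir, instant); // explicit second commit on the same instant
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second commit fails: " + secondCommitFails());
    }
}
```

So the fix is to commit exactly once per instant: either leave auto commit on and drop the explicit commit call, or turn auto commit off and commit yourself.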