zyclove opened this issue 1 year ago (status: Open)
@zyclove Can you post your table configuration please.
hudi:ods_smart_gateway_batch->desc

| Property | Value |
| --- | --- |
| basePath | s3://tuya-big-data-us/hudi/bi/bi_ods/ods_smart_gateway_batch |
| metaPath | s3://tuya-big-data-us/hudi/bi/bi_ods/ods_smart_gateway_batch/.hoodie |
| fileSystem | s3 |
| hoodie.compaction.payload.class | org.apache.hudi.common.model.DefaultHoodieRecordPayload |
| hoodie.table.metadata.partitions | files |
| hoodie.table.type | MERGE_ON_READ |
| hoodie.table.precombine.field | gmt_modified |
| hoodie.table.partition.fields | dt |
| hoodie.table.cdc.enabled | false |
| hoodie.archivelog.folder | archived |
| hoodie.timeline.layout.version | 1 |
| hoodie.table.checksum | 3204991261 |
| hoodie.datasource.write.drop.partition.columns | false |
| hoodie.table.timeline.timezone | LOCAL |
| hoodie.table.recordkey.fields | id |
| hoodie.table.name | ods_smart_gateway_batch |
| hoodie.partition.metafile.use.base.format | false |
| hoodie.datasource.write.hive_style_partitioning | true |
| hoodie.populate.meta.fields | true |
| hoodie.table.keygenerator.class | org.apache.hudi.keygen.SimpleKeyGenerator |
| hoodie.table.base.file.format | PARQUET |
| hoodie.database.name | (empty) |
| hoodie.datasource.write.partitionpath.urlencode | false |
| hoodie.table.version | 5 |
@ad1happy2go Excuse me, could you take a look at this problem? Please help me figure out whether anything in these configurations is wrong.
@zyclove The configuration looks okay. Can you provide any more information to help us triage? Are you running the cleaner? Which operations are you running when you hit this error?
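For reference, a FileNotFoundException on a log file mid-read usually means the cleaner (or archival) removed a file slice that an in-flight query still referenced. Below is a minimal sketch of the retention settings worth checking, assuming they are set as session-level hoodie.* options in spark-sql; the values are illustrative, not the reporter's actual settings:

```sql
-- Sketch only: retention settings to verify (values are illustrative assumptions).

-- Cleaner: keep enough commits that file slices outlive the longest concurrent read.
SET hoodie.cleaner.policy = KEEP_LATEST_COMMITS;
SET hoodie.cleaner.commits.retained = 20;   -- default is 10

-- Archival: keep.min/max commits should stay above cleaner.commits.retained,
-- otherwise instants can be archived while their files are still referenced.
SET hoodie.keep.min.commits = 25;
SET hoodie.keep.max.commits = 35;
```

If the cleaner runs with defaults while half-hourly jobs overlap, a reader can resolve a file slice and then find the log file already deleted, which would match the FileNotFoundException in the stacktrace below.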
Describe the problem you faced

Scheduled spark-sql tasks on a Hudi table intermittently fail with org.apache.hudi.exception.HoodieIOException: IOException when reading log file.
To Reproduce
Steps to reproduce the behavior:
1. A spark-sql Hudi task runs once every half hour; the error appears intermittently (a sketch of such a job follows below).
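For concreteness, a minimal sketch of what such a half-hourly job might look like, assuming a Spark SQL MERGE INTO upsert against the MOR table described above; the staging source table is hypothetical, not taken from this issue:

```sql
-- Hypothetical recurring upsert (triggered every 30 minutes by an external scheduler).
-- staging_smart_gateway_batch is an assumed source table, not from the issue.
MERGE INTO ods_smart_gateway_batch AS t
USING staging_smart_gateway_batch AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```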
Expected behavior
The scheduled spark-sql tasks should complete without throwing HoodieIOException.
Environment Description
Hudi version : 0.12.2
Spark version : 3.2.1 (AWS EMR)
Hive version : 2.3.9
Hadoop version : 3.2.1
Storage (HDFS/S3/GCS..) : S3
Running on Docker? (yes/no) : no
Stacktrace
23/06/09 02:39:16 INFO BlockManagerInfo: Added rdd_7_16 in memory on ip-172-16-13-109.us-west-2.compute.internal:42539 (size: 2.2 KiB, free: 8.4 GiB)
23/06/09 02:39:16 WARN TaskSetManager: Lost task 93.3 in stage 0.0 (TID 110) (ip-172-16-12-181.us-west-2.compute.internal executor 1): org.apache.hudi.exception.HoodieIOException: IOException when reading log file
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:374)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:113)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:106)
23/06/09 02:39:16 ERROR TaskSetManager: Task 93 in stage 0.0 failed 4 times; aborting job
23/06/09 02:39:16 INFO YarnScheduler: Cancelling stage 0
23/06/09 02:39:16 INFO YarnScheduler: Killing all running tasks in stage 0: Stage cancelled
23/06/09 02:39:16 INFO YarnScheduler: Stage 0 was cancelled
23/06/09 02:39:16 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at HoodieJavaRDD.java:127) failed in 43.043 s due to Job aborted due to stage failure: Task 93 in stage 0.0 failed 4 times, most recent failure: Lost task 93.3 in stage 0.0 (TID 110) (ip-172-16-12-181.us-west-2.compute.internal executor 1): org.apache.hudi.exception.HoodieIOException: IOException when reading log file
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:374)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:220)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:209)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:113)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:106)
at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:343)
at org.apache.hudi.LogFileIterator$.scanLog(LogFileIterator.scala:305)
at org.apache.hudi.LogFileIterator.<init>(LogFileIterator.scala:89)
at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:55)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.sql.execution.SQLConfInjectingRDD.compute(SQLConfInjectingRDD.scala:58)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:386)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1440)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1350)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1414)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1237)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: No such file or directory 's3://big-data-us/hudi/bi/bi_ods/ods_smart_gateway_batch/dt=2016/.00000008-ad34-409c-9303-31ae4b4dbb41-0_20230604041023497.log.4_8-12-8134'
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:521)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:932)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:924)
at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:195)
at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:479)
at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:115)
at org.apache.hudi.common.table.log.HoodieLogFormatReader.<init>(HoodieLogFormatReader.java:70)
at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:244)
... 50 more