apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Mismatched write token for parquet files #8516

Open xccui opened 1 year ago

xccui commented 1 year ago

We use a Flink streaming job to write MoR tables. The compaction of a series of tables was blocked by the following exception. It seems that the parquet file name recorded in the compaction plan differs from the actual file name in the write-token part.

The actual file is 55078b57-488a-4be1-87ac-204548d3ec66_1-5-24_20230420023427524.parquet (write token 1-5-24), while the compaction plan and the error below reference write token 1-5-23.
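
For context, Hudi names base files as <fileId>_<writeToken>_<instantTime>.<ext>. The standalone sketch below uses plain string parsing (no Hudi APIs) to split the two names from this report; the fileId and instant time match, and only the write token differs.

public class HudiBaseFileName {
    // Hudi base files follow the pattern <fileId>_<writeToken>_<instantTime>.<ext>;
    // the fileId is a UUID (hyphens only), so splitting on '_' is safe here.
    static String writeToken(String fileName) {
        String stem = fileName.substring(0, fileName.lastIndexOf('.'));
        return stem.split("_")[1];
    }

    public static void main(String[] args) {
        String planned = "55078b57-488a-4be1-87ac-204548d3ec66_1-5-23_20230420023427524.parquet";
        String actual  = "55078b57-488a-4be1-87ac-204548d3ec66_1-5-24_20230420023427524.parquet";
        // Prints "1-5-23 vs 1-5-24": only the write token differs.
        System.out.println(writeToken(planned) + " vs " + writeToken(actual));
    }
}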

2023-04-20 13:35:10 [pool-31-thread-1] ERROR org.apache.hudi.sink.compact.CompactOperator                 [] - Executor executes action [Execute compaction for instant 20230420041145422 from task 1] error
org.apache.hudi.exception.HoodieIOException: Failed to read footer for parquet s3a://path-to-table/dt=2023-01-20/hr=19/55078b57-488a-4be1-87ac-204548d3ec66_1-5-23_20230420023427524.parquet
    at org.apache.hudi.common.util.ParquetUtils.readMetadata(ParquetUtils.java:95) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.common.util.ParquetUtils.readSchema(ParquetUtils.java:208) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.common.util.ParquetUtils.readAvroSchema(ParquetUtils.java:230) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.io.storage.HoodieAvroParquetReader.getSchema(HoodieAvroParquetReader.java:104) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.action.commit.HoodieMergeHelper.runMerge(HoodieMergeHelper.java:91) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.handleUpdateInternal(HoodieFlinkCopyOnWriteTable.java:374) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.handleUpdate(HoodieFlinkCopyOnWriteTable.java:365) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.action.compact.CompactionExecutionHelper.writeFileAndGetWriteStats(CompactionExecutionHelper.java:64) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:231) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:144) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.sink.compact.CompactOperator.doCompaction(CompactOperator.java:133) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.sink.compact.CompactOperator.lambda$processElement$0(CompactOperator.java:116) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://path-to-table/dt=2023-01-20/hr=19/55078b57-488a-4be1-87ac-204548d3ec66_1-5-23_20230420023427524.parquet
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3866) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3688) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$24(S3AFileSystem.java:3556) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3554) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at promoted.ai.org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at promoted.ai.org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:469) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at promoted.ai.org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:454) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    at org.apache.hudi.common.util.ParquetUtils.readMetadata(ParquetUtils.java:93) ~[blob_p-abdf98cc6fdb80521c5886e97d0250884f55321b-e6c0beee736c7301690a2ba078cc0a0f:?]
    ... 15 more
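
As a side note, one way to confirm which base files actually exist for that file group is to list the partition with the Hadoop FileSystem API. This is a minimal sketch; the partition path and fileId are the (redacted) values from the log above, and the Hadoop Configuration is assumed to carry working s3a credentials.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListFileGroupVersions {
    public static void main(String[] args) throws Exception {
        // Placeholders taken from the (redacted) log: partition path and file id.
        Path partition = new Path("s3a://path-to-table/dt=2023-01-20/hr=19");
        String fileId = "55078b57-488a-4be1-87ac-204548d3ec66";

        FileSystem fs = FileSystem.get(partition.toUri(), new Configuration());
        // Every base file of the file group, regardless of write token or instant.
        FileStatus[] candidates =
            fs.listStatus(partition, p -> p.getName().startsWith(fileId));
        for (FileStatus status : candidates) {
            System.out.println(status.getPath().getName());
        }
    }
}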

Environment Description

Additional context

The job had the metadata table enabled at first. I disabled the metadata table when restarting the job from a checkpoint.
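
For reference, a minimal sketch of how such a toggle can look for a Flink SQL writer. The table name, schema, and path are placeholders, and the metadata.enabled option name should be double-checked against the Hudi version in use.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DisableMetadataTable {
    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hypothetical table definition; the relevant piece is the
        // 'metadata.enabled' option, switched from 'true' to 'false'
        // when the job was restarted from the checkpoint.
        tEnv.executeSql(
            "CREATE TABLE hudi_sink (\n"
          + "  id STRING,\n"
          + "  ts TIMESTAMP(3),\n"
          + "  PRIMARY KEY (id) NOT ENFORCED\n"
          + ") WITH (\n"
          + "  'connector' = 'hudi',\n"
          + "  'path' = 's3a://path-to-table',\n"
          + "  'table.type' = 'MERGE_ON_READ',\n"
          + "  'metadata.enabled' = 'false'\n"
          + ")");
    }
}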

danny0405 commented 1 year ago

My guess is that there was some inconsistency from having the MDT (metadata table) enabled while the compaction plan was generated.
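
One way to check that hypothesis is to dump the base file paths recorded in the pending compaction plan and compare them against what is actually on storage. The sketch below is an assumption-heavy outline: it relies on CompactionUtils#getCompactionPlan and the HoodieCompactionOperation accessors, whose names may differ across Hudi versions, and the base path and instant are placeholders from the log.

import org.apache.hadoop.conf.Configuration;
import org.apache.hudi.avro.model.HoodieCompactionOperation;
import org.apache.hudi.avro.model.HoodieCompactionPlan;
import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.common.util.CompactionUtils;

public class DumpCompactionPlan {
    public static void main(String[] args) throws Exception {
        // Placeholders: the table base path and the compaction instant from the error log.
        String basePath = "s3a://path-to-table";
        String compactionInstant = "20230420041145422";

        // Assumes the Hadoop Configuration can resolve the s3a filesystem.
        HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder()
            .setConf(new Configuration())
            .setBasePath(basePath)
            .build();

        // Assumed accessor names; verify against the Hudi snapshot in use.
        HoodieCompactionPlan plan =
            CompactionUtils.getCompactionPlan(metaClient, compactionInstant);
        for (HoodieCompactionOperation op : plan.getOperations()) {
            // The base file path recorded in the plan carries the write token
            // that the compactor will later try to read.
            System.out.println(op.getPartitionPath() + " -> " + op.getDataFilePath());
        }
    }
}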

nsivabalan commented 1 year ago

which version of hudi are you using?

xccui commented 1 year ago

> which version of hudi are you using?

I built a snapshot version based on the commit in the description.