[X] I searched in the issues and found nothing similar.
Paimon version
0.8
Compute Engine
flink 1.17.1
Minimal reproduce step
java.lang.RuntimeException: Failed to find latest snapshot id
at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:143)
at org.apache.paimon.utils.SnapshotManager.latestSnapshot(SnapshotManager.java:131)
at org.apache.paimon.operation.FileStoreCommitImpl.tryCommit(FileStoreCommitImpl.java:623)
at org.apache.paimon.operation.FileStoreCommitImpl.commit(FileStoreCommitImpl.java:294)
at org.apache.paimon.table.sink.TableCommitImpl.commitMultiple(TableCommitImpl.java:192)
at org.apache.paimon.flink.sink.StoreCommitter.commit(StoreCommitter.java:100)
at org.apache.paimon.flink.sink.CommitterOperator.commitUpToCheckpoint(CommitterOperator.java:159)
at org.apache.paimon.flink.sink.CommitterOperator.notifyCheckpointComplete(CommitterOperator.java:152)
at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:104)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.notifyCheckpointComplete(RegularOperatorChain.java:145)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpoint(SubtaskCheckpointCoordinatorImpl.java:467)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:400)
at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1430)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$16(StreamTask.java:1371)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$19(StreamTask.java:1410)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:494)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1737)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1753)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1750)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1765)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1760)
at org.apache.paimon.fs.hadoop.HadoopFileIO.exists(HadoopFileIO.java:110)
at org.apache.paimon.utils.SnapshotManager.findLatest(SnapshotManager.java:391)
at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:141)
... 27 more
What doesn't meet your expectations?
Consuming Kafka data and writing it to HDFS files via Flink SQL. Occasionally, for some high-traffic topics, this error occurs. Please provide guidance on how to troubleshoot and resolve this issue.
@skdfeitian From the error stack,this issue due to FileSystem cache expired. Maybe you can set fs.hdfs.impl.disable.cache=true in client or hdfs conf to reslove the problem.
Search before asking
Paimon version
0.8
Compute Engine
flink 1.17.1
Minimal reproduce step
java.lang.RuntimeException: Failed to find latest snapshot id at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:143) at org.apache.paimon.utils.SnapshotManager.latestSnapshot(SnapshotManager.java:131) at org.apache.paimon.operation.FileStoreCommitImpl.tryCommit(FileStoreCommitImpl.java:623) at org.apache.paimon.operation.FileStoreCommitImpl.commit(FileStoreCommitImpl.java:294) at org.apache.paimon.table.sink.TableCommitImpl.commitMultiple(TableCommitImpl.java:192) at org.apache.paimon.flink.sink.StoreCommitter.commit(StoreCommitter.java:100) at org.apache.paimon.flink.sink.CommitterOperator.commitUpToCheckpoint(CommitterOperator.java:159) at org.apache.paimon.flink.sink.CommitterOperator.notifyCheckpointComplete(CommitterOperator.java:152) at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:104) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.notifyCheckpointComplete(RegularOperatorChain.java:145) at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpoint(SubtaskCheckpointCoordinatorImpl.java:467) at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:400) at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1430) at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$16(StreamTask.java:1371) at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$19(StreamTask.java:1410) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398) at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367) at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352) at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229) at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:494) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1737) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1753) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1750) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1765) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1760) at org.apache.paimon.fs.hadoop.HadoopFileIO.exists(HadoopFileIO.java:110) at org.apache.paimon.utils.SnapshotManager.findLatest(SnapshotManager.java:391) at org.apache.paimon.utils.SnapshotManager.latestSnapshotId(SnapshotManager.java:141) ... 27 more
What doesn't meet your expectations?
Consuming Kafka data and writing it to HDFS files via Flink SQL. Occasionally, for some high-traffic topics, this error occurs. Please provide guidance on how to troubleshoot and resolve this issue.
Anything else?
No response
Are you willing to submit a PR?