Closed cgpoh closed 1 month ago
I also encountered this exception, did you find the reason? I also want to understand the purpose of MAX_CONTINUOUS_EMPTY_COMMITS.
The Flink snapshot/checkpoint state is kept in 3 places: the Flink internal checkpoint state, the temporary files written to the file system, and the committed Iceberg table metadata.
These 3 need to be in sync, and we need to keep the changes since the last sync. So if we do not commit to Iceberg, the Flink internal state and the temporary files on the file system keep growing. To avoid this, we commit from time to time (and write Flink metadata to the Iceberg table in the process), and after this commit we are able to remove old temp files and clean some data from the Flink state.
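As a rough illustration of the cadence (a self-contained sketch, not Iceberg code): the committer skips an empty checkpoint only until the empty streak reaches the configured maximum, which in current Iceberg versions is read from the table property `flink.max-continuous-empty-commits` (default 10). The `shouldCommit` helper below mirrors that condition:

```java
public class EmptyCommitDemo {
    // Mirrors the skip condition in IcebergFilesCommitter: commit when there
    // are files, or when the empty streak hits a multiple of the maximum.
    static boolean shouldCommit(int totalFiles, int continuousEmptyCheckpoints,
                                int maxContinuousEmptyCommits) {
        return totalFiles != 0
            || continuousEmptyCheckpoints % maxContinuousEmptyCommits == 0;
    }

    public static void main(String[] args) {
        int maxEmpty = 10; // default of flink.max-continuous-empty-commits
        int emptyStreak = 0;
        for (int cp = 1; cp <= 25; cp++) {
            emptyStreak++; // simulate: no data arrived for this checkpoint
            if (shouldCommit(0, emptyStreak, maxEmpty)) {
                System.out.println("checkpoint " + cp + ": empty commit");
                emptyStreak = 0; // streak resets after a commit
            }
        }
    }
}
```

Running this prints an empty commit only at checkpoints 10 and 20, which is why the failure recurs at regular intervals when no data flows.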
So writing an empty commit is intentional/needed, but the failure seems like a bug.
@pvary Thanks for your reply.
@dou-dou: Could you please share the Flink+Iceberg version you are using?
@pvary Version: Flink 1.12.7, Iceberg 0.13.1. Question: I am using chunjun to incrementally sync data from MySQL to Iceberg. I noticed that the Flink job is running in overwrite mode. However, when there is no incremental data (totalFiles == 0) and the number of checkpoints is a multiple of MAX_CONTINUOUS_EMPTY_COMMITS (continuousEmptyCheckpoints % maxContinuousEmptyCommits == 0), it throws the following exception: then the Flink job restarts and keeps running, and it throws the exception again when checkpointId is 21 (the previous exception was thrown when checkpointId was 12). I hope that when totalFiles == 0, no exception will be generated when executing operation.commit(), or that there is another solution.
Seems like a bug to me. When we are writing data to the IcebergSink and no data comes in for the specified maxContinuousEmptyCommits number of commits, then we try to commit an empty changeset with the following code in IcebergFilesCommitter:
if (totalFiles != 0 || continuousEmptyCheckpoints % maxContinuousEmptyCommits == 0) {
  if (replacePartitions) {
    replacePartitions(pendingResults, summary, newFlinkJobId, operatorId, checkpointId);
  } else {
    commitDeltaTxn(pendingResults, summary, newFlinkJobId, operatorId, checkpointId);
  }
  continuousEmptyCheckpoints = 0;
} else {
  LOG.info("Skip commit for checkpoint {} due to no data files or delete files.", checkpointId);
}
I think we might want to handle the zero-file commits with a different code path. Maybe we just switch this case to be handled by commitDeltaTxn regardless of the replacePartitions value.
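A minimal sketch of that idea (hypothetical names and routing; not the actual IcebergFilesCommitter implementation): an empty changeset has no partitions to replace, so it can go through the plain append-style path even when the sink is configured to replace partitions.

```java
// Hypothetical routing logic illustrating the suggestion above; the method
// returns the name of the commit path that would be taken.
public class CommitPathSketch {
    static String chooseCommitPath(int totalFiles, boolean replacePartitions) {
        if (totalFiles == 0) {
            // Empty changeset: replacing partitions with zero files is what
            // triggers the failure, so fall back to the append-style commit.
            return "commitDeltaTxn";
        }
        return replacePartitions ? "replacePartitions" : "commitDeltaTxn";
    }

    public static void main(String[] args) {
        System.out.println(chooseCommitPath(0, true));   // commitDeltaTxn
        System.out.println(chooseCommitPath(3, true));   // replacePartitions
        System.out.println(chooseCommitPath(3, false));  // commitDeltaTxn
    }
}
```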
I do not have enough time nowadays to write and test the patch, but if you have time to write it, I will try to find time to review it.
CC: @stevenzwu, @hililiwei
@pvary Thanks for your advice.
@pvary Yes, seems like a bug. But do you try to overwrite the data of a partition every time, @cgpoh? Is it reasonable to overwrite partitions or the whole table every time in a streaming task? It seems like we only do this in batch tasks.
If we commit an empty snapshot in overwrite mode, this error occurs. I raised #7983, which follows @pvary's suggestion: commit regardless of the replacePartitions value.
@dou-dou: What is the Iceberg Sink configuration you are using?
@pvary I have found that the exception is caused by my unreasonable configuration. Thank you very much for your help.
Still, as @hililiwei mentioned in her other comment, it might be good to prevent these "unreasonable" configurations, if they are indeed unreasonable 😄
Yes, if we are not supposed to use overwrite in streaming tasks, then should we add a check to avoid this misuse? @pvary
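If such a check were added, it could live at sink build time. A hypothetical sketch (the class, method, and message below are invented for illustration and are not Iceberg's actual API):

```java
// Hypothetical builder-time guard; names are invented for illustration only.
public class StreamingOverwriteGuard {
    static void validate(boolean isStreamingJob, boolean overwriteEnabled) {
        if (isStreamingJob && overwriteEnabled) {
            throw new IllegalArgumentException(
                "Overwrite mode is intended for batch jobs; "
                    + "use append mode for streaming writes.");
        }
    }

    public static void main(String[] args) {
        validate(false, true);  // batch overwrite: accepted
        validate(true, false);  // streaming append: accepted
        try {
            validate(true, true); // streaming overwrite: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at configuration time would surface the problem immediately, instead of at the first empty commit.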
Hi @dou-dou, what's your unreasonable configuration? I'd like to check whether my configuration is unreasonable too.
@cgpoh As @hililiwei mentioned in her other comment, I'm probably using overwrite mode in the streaming task when I use chunjun to incrementally sync data from MySQL to Iceberg.
Thanks @dou-dou for your reply! My replacePartitions flag is set to true in my streaming task, and hence it throws exceptions on empty commits.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
Query engine
Flink
Question
I have a Flink job that uses a side output to write to an Iceberg table when there are errors in the main processing function. If there are no errors in the processing function, no data files are added to be committed. I noticed that the Flink job restarts, throwing the following exception:
I saw that in the commitPendingResult function of IcebergFilesCommitter.java there's a condition that checks whether to skip an empty commit, but if the MAX_CONTINUOUS_EMPTY_COMMITS threshold is met, it proceeds to commit even when there are no data files, thus throwing the above exception. May I know what's the purpose of this empty commit?