apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[SUPPORT] Spark3.2 encountered duplicate data while reading the hudi bucket MOR table #9244

Open fujianhua168 opened 1 year ago

fujianhua168 commented 1 year ago

Describe the problem you faced

A few days ago in our production environment, a datanode in the Hadoop cluster went down, which caused the Flink streaming write task (for a Hudi bucket MOR table) to fail. After restarting the Flink task, when we read the table with Spark 3.2 or Presto 333, we found duplicate records under the same primary key, and the duplicated records have identical values in the Hudi metadata fields (_hoodie_commit_time, _hoodie_commit_seqno, _hoodie_file_name).

Note: the Flink write task had been running normally for several days; there were no duplicate records before the datanode went down.

(screenshots: query output showing duplicate rows for the same primary key with identical Hudi metadata field values)
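For reference, a scan along the following lines is one way to confirm the symptom from spark-shell. This is a minimal sketch, not taken from the report: the table base path is a placeholder, and it relies only on the standard Hudi metadata columns.

```scala
// Minimal spark-shell sketch to surface duplicated record keys and inspect
// their Hudi metadata fields. The base path below is a placeholder.
import org.apache.spark.sql.functions._

val df = spark.read.format("hudi").load("hdfs:///path/to/hudi_table")

// Keys that appear more than once in the snapshot read.
val dupKeys = df.groupBy("_hoodie_record_key").count().filter(col("count") > 1)

// Show the metadata fields of the duplicated rows; in the reported case these
// values are identical across the duplicates.
df.join(dupKeys.select("_hoodie_record_key"), Seq("_hoodie_record_key"))
  .select("_hoodie_record_key", "_hoodie_commit_time",
          "_hoodie_commit_seqno", "_hoodie_file_name")
  .show(false)
```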

Environment Description

Additional context

Add any other context about the problem here.

Stacktrace

Add the stacktrace of the error.

ad1happy2go commented 1 year ago

@fujianhua168 Could you share the table's timeline from around the time the datanode went down? That would help us triage this better.
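For anyone gathering what is being asked for here: the timeline is the set of instant files under the table's `.hoodie` directory, and a listing of that directory (sorted by instant time) is usually enough to see which commits or deltacommits were in flight when the datanode went down. A minimal spark-shell sketch, with the base path as a placeholder:

```scala
// List the timeline instant files under <table base path>/.hoodie using the
// Hadoop FileSystem API. The path below is a placeholder, not the actual table.
import org.apache.hadoop.fs.Path

val timelinePath = new Path("hdfs:///path/to/hudi_table/.hoodie")
val fs = timelinePath.getFileSystem(spark.sparkContext.hadoopConfiguration)

fs.listStatus(timelinePath)
  .filter(_.isFile)
  .map(s => (s.getPath.getName, s.getModificationTime))
  .sortBy(_._1)                                   // instant files are named by instant time
  .foreach { case (name, mtime) => println(s"$name\t$mtime") }
```

The same information can also be pulled with hudi-cli if that is more convenient.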