Closed (joker1007 closed this issue 4 months ago)
Thanks for the feedback, cc @beyond1920 can you take a look?
Thanks for the feedback, it should be fixed in: https://github.com/apache/hudi/pull/11550
@joker1007 Thanks for reporting this bug. The bugfix PR #11550 has been merged.
Describe the problem you faced
We used Flink with CONSISTENT_BUCKET to write records.
We set clustering.schedule.enabled=true.
After the Flink writing process was stopped, we executed the scheduled clustering job using spark-submit. The job completed successfully, and a replacecommit file was created.
We then resumed the Flink job and encountered a NullPointerException at the next checkpoint.
The Flink SQL definition is as follows.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Checkpointing completes successfully.
Environment Description
I use Amazon EMR 7.1.0
Hudi version : 0.14.1
Spark version : 3.5.0
Flink version: 1.18.1
Hive version : 3.1.3
Hadoop version : 3.3.6
Storage (HDFS/S3/GCS..) : S3 (use EMRFS)
Running on Docker? (yes/no) : no
Additional context
Stacktrace
I checked the source code, and it looks like something in ConsistentBucketAssignFunction#snapshotState is null for some reason. Is there a case where lastRefreshInstant is null?
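To make the question concrete, here is a minimal sketch of how a null lastRefreshInstant could produce a NullPointerException when it is used to filter completed instants. This is not Hudi's code: the class LastRefreshInstantSketch, both helper methods, and the timestamp strings are hypothetical, and it only assumes that instant timestamps are compared as strings.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from Hudi.
class LastRefreshInstantSketch {

    // Unguarded filter: String.compareTo(null) throws NullPointerException,
    // so a null "last" fails as soon as one completed instant is examined.
    static List<String> instantsAfterUnguarded(List<String> completed, String last) {
        return completed.stream()
                .filter(ts -> ts.compareTo(last) > 0)
                .collect(Collectors.toList());
    }

    // Guarded filter: a null "last" is treated as "nothing refreshed yet",
    // so every completed instant counts as new.
    static List<String> instantsAfterGuarded(List<String> completed, String last) {
        if (last == null) {
            return List.copyOf(completed);
        }
        return completed.stream()
                .filter(ts -> ts.compareTo(last) > 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up timestamps standing in for completed replacecommit instants.
        List<String> completed = List.of("20240601000000", "20240602000000");
        System.out.println(instantsAfterGuarded(completed, null));
        try {
            instantsAfterUnguarded(completed, null);
        } catch (NullPointerException e) {
            System.out.println("Unguarded comparison failed with a NullPointerException");
        }
    }
}
```

If the real field is only assigned after the first refresh, a replacecommit written by an external job before that point might hit the unguarded path. That is a guess on my part, not something I have confirmed in the source.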