Open fenil25 opened 1 year ago
Sorry for the issue @fenil25. Release 0.13.0 is a very buggy release; could you try release 0.12.3 or 0.13.1 instead?
@fenil25 Were you able to try 0.12.3 or 0.13.1? Did you still face this issue?
Tips before filing an issue
Have you gone through our FAQs? Yes
Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
If you have triaged this as a bug, then file an issue directly.
Describe the problem you faced
Our pipeline ingests the changelog from Kafka into Flink and then writes through the Hudi sink. We are observing a lot of issues with a few tables. The table emits around 10K upserts per second (a huge number of updates). We are using EMR 6.11.0 with Hudi version 0.13.0 and Flink version 1.16.0. Initially, we tried a Copy On Write (COW) table but saw a lot of issues with checkpointing in Flink. The main culprit was the error -
Checkpoint expired before completing
Increasing resources, decreasing the checkpoint interval, and increasing the checkpoint timeout did not help. We then moved to a Merge On Read (MOR) table. We were still seeing issues like -
java.lang.IllegalStateException: Receive an unexpected event for instant 20230912181658265 from task 7
An open Hudi issue like this suggests that it is a multiple-writer issue. We set the number of writers to 1 to resolve this, and then increasing the checkpoint timeout from 15 minutes to 60 minutes helped. The checkpoint was taking around 30 minutes. However, the main problem we faced was that whatever config we changed for the Flink pipeline, we had to re-bootstrap the table. This was not the case with the COW table. Bootstrapping is expensive for us and takes quite some time. If we do not bootstrap, we see the IllegalStateException again or a FileAlreadyExistsException (for log files).
Our main questions are -
To Reproduce
Steps to reproduce the behavior:
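A minimal sketch of the setup described in the report, assuming the job is defined through Flink SQL. It only renders the statements as text. The table name, columns, and S3 path are hypothetical placeholders, and mapping "writers set to 1" to the Hudi option `write.tasks` is an assumption, not something confirmed in the report.

```python
# Sketch only: renders Flink SQL for a Hudi MERGE_ON_READ sink using the
# settings mentioned in the report. Nothing here talks to a cluster.

CHECKPOINT_SETTINGS = {
    # Checkpoint timeout raised from 15 to 60 minutes, as in the report.
    "execution.checkpointing.timeout": "60 min",
}

HUDI_SINK_OPTIONS = {
    "connector": "hudi",
    # Hypothetical path; the report only says the storage is S3.
    "path": "s3://example-bucket/hudi/example_table",
    "table.type": "MERGE_ON_READ",
    # Assumption: "writers set to 1" corresponds to a single write task.
    "write.tasks": "1",
}


def render_set_statements(settings):
    """Render one SQL-client SET statement per configuration entry."""
    return [f"SET '{key}' = '{value}';" for key, value in settings.items()]


def render_sink_ddl(table, columns, primary_key, options):
    """Render a CREATE TABLE statement with the given connector options."""
    cols = ",\n".join(f"  `{name}` {sql_type}" for name, sql_type in columns)
    opts = ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())
    return (
        f"CREATE TABLE {table} (\n{cols},\n"
        f"  PRIMARY KEY (`{primary_key}`) NOT ENFORCED\n"
        f") WITH (\n{opts}\n);"
    )


if __name__ == "__main__":
    # Hypothetical schema, used purely for illustration.
    example_columns = [("id", "BIGINT"), ("payload", "STRING")]
    for statement in render_set_statements(CHECKPOINT_SETTINGS):
        print(statement)
    print(render_sink_ddl("example_table", example_columns, "id", HUDI_SINK_OPTIONS))
```

Keeping the options in plain dictionaries makes it easy to compare the exact settings between a run that needed a re-bootstrap and one that did not.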
Environment Description
Hudi version : 0.13.0
Flink version : 1.16.0
Hive version :
Hadoop version :
Storage (HDFS/S3/GCS..) : S3
Running on Docker? (yes/no) : no
Stacktrace
Stack trace of the aforementioned errors -
FileAlreadyExistsException
IllegalStateException