apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[HUDI-8502] Fix the potential data loss issue in PartialUpdateAvroPayload#preCombine #12232

Open usberkeley opened 1 week ago

usberkeley commented 1 week ago

Change Logs

The PartialUpdateAvroPayload#preCombine method catches exceptions internally and hides them, which can lead to silent data loss. This change stops catching exceptions inside the method so that failures surface to the caller.

Impact

none

Risk level (write none, low, medium or high below)

none

Documentation Update

none

danny0405 commented 1 week ago

what kind of exceptions do we have then?

usberkeley commented 1 week ago

> what kind of exceptions do we have then?

@danny0405 An IOException should be thrown directly.

The preCombine of this payload has to decode the Avro records in order to merge them partially, and that decoding can throw an IOException. Previously the exception was caught and ignored, so a decoding failure went unnoticed and could cause data loss.
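To make the concern concrete, here is a minimal, self-contained sketch of the two behaviours. It is not the Hudi code: `PreCombineSketch`, `Payload`, `decode`, `merge` and the `key=value;key=value` encoding are all hypothetical stand-ins for the payload and its Avro decoding, and returning the current payload on failure is an assumption about what "ignoring the exception" looks like.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch only: none of these names come from Hudi.
public class PreCombineSketch {

    /** Stand-in for a payload: serialized fields plus an ordering value. */
    static final class Payload {
        final byte[] bytes;
        final long orderingValue;

        Payload(byte[] bytes, long orderingValue) {
            this.bytes = bytes;
            this.orderingValue = orderingValue;
        }

        static Payload of(String text, long orderingValue) {
            return new Payload(text.getBytes(StandardCharsets.UTF_8), orderingValue);
        }

        String text() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Stand-in for Avro decoding: parses "k=v;k=v" and fails on malformed input. */
    static Map<String, String> decode(byte[] bytes) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        String text = new String(bytes, StandardCharsets.UTF_8);
        if (text.isEmpty()) {
            return fields;
        }
        for (String pair : text.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IOException("Malformed field: " + pair);
            }
            fields.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return fields;
    }

    /** Partial merge: non-empty fields of the newer record overwrite the older one. */
    static Payload merge(Payload a, Payload b) throws IOException {
        Payload older = a.orderingValue <= b.orderingValue ? a : b;
        Payload newer = (older == a) ? b : a;
        Map<String, String> merged = decode(older.bytes);
        for (Map.Entry<String, String> e : decode(newer.bytes).entrySet()) {
            if (!e.getValue().isEmpty()) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        StringJoiner joined = new StringJoiner(";");
        merged.forEach((k, v) -> joined.add(k + "=" + v));
        return Payload.of(joined.toString(), newer.orderingValue);
    }

    /** Swallowing variant: a decode failure silently falls back to the current payload. */
    static Payload preCombineSwallowing(Payload current, Payload other) {
        try {
            return merge(current, other);
        } catch (IOException e) {
            return current; // the failure is hidden; fields of `other` are dropped
        }
    }

    /** Propagating variant: the caller sees the decode failure. */
    static Payload preCombinePropagating(Payload current, Payload other) throws IOException {
        return merge(current, other);
    }

    public static void main(String[] args) {
        Payload corruptOlder = Payload.of("id=1;name", 1L); // "name" has no '='
        Payload newer = Payload.of("id=1;city=paris", 2L);

        System.out.println("swallowing: " + preCombineSwallowing(newer, corruptOlder).text());
        try {
            preCombinePropagating(newer, corruptOlder);
        } catch (IOException e) {
            System.out.println("propagating: " + e.getMessage());
        }
    }
}
```

In the swallowing variant the write proceeds with only the newer record's fields and nothing signals that the older record could not be read. In the propagating variant the caller can fail the write or retry instead.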
