Open AshishKhatkar opened 1 month ago
Maybe we can add a few unit tests and some docs for the new options?
@JingsongLi could you please help review this? We are currently running this patch internally at Yelp because we cannot afford to block all ingestion on a single "corrupt" message. Corrupt messages entering our stream are rare but inevitable (e.g. older schemas that are not backward compatible), and it is better for us to log and skip them (or even fail the app) than to block.
Purpose
This change adds support for a retry count and for skipping corrupt records, instead of assuming a schema change has happened and waiting indefinitely for it. With this change, a corrupt message causes the Flink CDC job to fail loudly; alternatively, users can set a config option to log and skip such messages. Previously the job would wait indefinitely on a schema change that might never have happened. For more details, see the linked issue.
Linked issue: close #4239
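
To illustrate the intended behavior, here is a minimal, self-contained sketch of a bounded retry followed by either failing or skipping. It is not the code in this patch: the class name `CorruptRecordHandlingSketch`, the `OnCorrupt` enum, and the `maxRetries` parameter are hypothetical names chosen for illustration, and the real config option names are not shown here. The `tryParse` predicate stands in for "parse the record against the currently known schema", which may start succeeding on a later attempt if a schema change does arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: names and structure are assumptions, not the patch's API.
class CorruptRecordHandlingSketch {

    /** What to do with a record once its retry budget is exhausted. */
    enum OnCorrupt { FAIL, SKIP }

    /**
     * Tries each record up to (1 + maxRetries) times. A record that never
     * parses is either collected into `skipped` (SKIP) or aborts the whole
     * run with an exception (FAIL), rather than being waited on forever.
     */
    static <T> List<T> process(List<T> records,
                               Predicate<T> tryParse,
                               int maxRetries,
                               OnCorrupt policy,
                               List<T> skipped) {
        List<T> accepted = new ArrayList<>();
        for (T record : records) {
            boolean parsed = false;
            // First attempt plus at most maxRetries further attempts.
            for (int attempt = 0; attempt <= maxRetries && !parsed; attempt++) {
                parsed = tryParse.test(record);
            }
            if (parsed) {
                accepted.add(record);
            } else if (policy == OnCorrupt.SKIP) {
                // A real job would also log the record here.
                skipped.add(record);
            } else {
                throw new IllegalStateException(
                        "Record still unparseable after " + maxRetries
                                + " retries: " + record);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> skipped = new ArrayList<>();
        // Treat any record starting with '!' as permanently corrupt.
        List<String> ok = process(List.of("a", "!bad", "b"),
                r -> !r.startsWith("!"), 2, OnCorrupt.SKIP, skipped);
        System.out.println("accepted=" + ok + ", skipped=" + skipped);
    }
}
```

The point of the sketch is that the retry budget is finite, so a record that never becomes parseable ends in an explicit outcome (an error or a logged skip) instead of blocking ingestion.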
Tests
API and Format
Documentation