memiiso / debezium-server-iceberg

Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage)
Apache License 2.0

How does this consumer handle Toasted Values in PostgreSQL? #196

Closed. ndrluis closed this issue 3 weeks ago

ndrluis commented 1 year ago

Hello,

I'm implementing CDC with Iceberg. Several of my columns are JSONB, and in some UPDATE events where those columns are not modified, they arrive with the placeholder indicating an unchanged TOASTed value instead of the actual data.

Is there any way to handle these columns on the consumer side?

https://debezium.io/blog/2019/10/08/handling-unchanged-postgres-toast-values/

ismailsimsek commented 1 year ago

Hi @ndrluis

I can't think of a way to handle it efficiently. The current implementation deletes the full row and overwrites it with the new one using the Iceberg API, which means the existing column value will be lost and overwritten by the placeholder.

It's probably best to run the consumer in append-only mode and handle these values in post-processing.
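As a rough illustration of that post-processing idea, here is a minimal Python sketch. It is not part of this project: it assumes the append-only rows have already been read into plain dictionaries, and the `id` and `ts_ms` column names and the `resolve_toasted` helper are hypothetical. The placeholder string is Debezium's default for `unavailable.value.placeholder`; adjust it if your connector overrides that setting.

```python
from typing import Any, Dict, Iterable

# Debezium's default placeholder for unchanged TOASTed values
# (configurable via the connector's `unavailable.value.placeholder`).
TOAST_PLACEHOLDER = "__debezium_unavailable_value"


def resolve_toasted(
    events: Iterable[Dict[str, Any]],
    key: str = "id",          # hypothetical primary-key column
    order_by: str = "ts_ms",  # hypothetical event-ordering column
) -> Dict[Any, Dict[str, Any]]:
    """Replay append-only change rows in order and return the latest state
    per key, carrying forward the last known value wherever the TOAST
    placeholder appears."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for event in sorted(events, key=lambda e: e[order_by]):
        previous = latest.get(event[key], {})
        merged = dict(event)
        for column, value in event.items():
            # Only substitute when an earlier value for this row is known;
            # otherwise the placeholder is left in place.
            if value == TOAST_PLACEHOLDER and column in previous:
                merged[column] = previous[column]
        latest[event[key]] = merged
    return latest


events = [
    {"id": 1, "ts_ms": 100, "name": "a", "payload": '{"k": 1}'},
    {"id": 1, "ts_ms": 200, "name": "b", "payload": TOAST_PLACEHOLDER},
]
state = resolve_toasted(events)
```

In this example the second event updates `name` only, so `state[1]["payload"]` keeps the JSON from the first event. If the first row seen for a key already contains the placeholder (for instance because earlier history is missing), the sketch cannot recover the value and leaves the placeholder untouched.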

github-actions[bot] commented 4 weeks ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.