memiiso / debezium-server-iceberg

Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage)
Apache License 2.0

How does this consumer handle Toasted Values in PostgreSQL? #196

Open ndrluis opened 1 year ago

ndrluis commented 1 year ago

Hello,

I'm implementing CDC with Iceberg. I have several JSONB columns, and in some UPDATE events where those columns are not modified, they arrive with the placeholder indicating an unchanged TOASTed value instead of the actual data.

Is there any way to handle these columns on the consumer side?

https://debezium.io/blog/2019/10/08/handling-unchanged-postgres-toast-values/

ismailsimsek commented 1 year ago

Hi @ndrluis

I can't think of a way to handle it efficiently. The current implementation deletes the full row and overwrites it with the new one using the Iceberg API. This means the previous data value will be lost/overwritten.

It is probably best to handle them with post-processing, running the consumer in append-only mode.
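
To illustrate the post-processing idea, here is a minimal sketch (not part of this project) of how unchanged TOAST values could be filled in from an append-only change log. It assumes the events have already been read as flat dicts in commit order, that `id` is the primary key, and that the connector uses Debezium's default placeholder `__debezium_unavailable_value`. The function name `fill_toasted_values` and the event shape are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Debezium's default placeholder for unchanged TOAST values; it can be
# changed with the connector option `unavailable.value.placeholder`.
TOAST_PLACEHOLDER = "__debezium_unavailable_value"


def fill_toasted_values(
    events: Iterable[Dict[str, Any]],
    key_field: str = "id",  # assumed primary key column
    placeholder: str = TOAST_PLACEHOLDER,
) -> List[Dict[str, Any]]:
    """Replace placeholder values with the last known value for the same key.

    `events` must be ordered by commit order. A placeholder with no earlier
    value for that key is left untouched.
    """
    last_seen: Dict[Any, Dict[str, Any]] = {}
    resolved: List[Dict[str, Any]] = []
    for event in events:
        previous = last_seen.get(event[key_field], {})
        row = {}
        for column, value in event.items():
            if value == placeholder and column in previous:
                # Column was not modified by this UPDATE: carry the value forward.
                row[column] = previous[column]
            else:
                row[column] = value
        last_seen[event[key_field]] = row
        resolved.append(row)
    return resolved


# Hypothetical sample: the second event did not touch the JSONB column.
sample_events = [
    {"id": 1, "payload": '{"a": 1}', "status": "new"},
    {"id": 1, "payload": TOAST_PLACEHOLDER, "status": "done"},
]
resolved_events = fill_toasted_values(sample_events)
```

In practice the same carry-forward logic would more likely run as a query over the append-only Iceberg table (for example, taking the last non-placeholder value per key), but the principle is the same.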