airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
14.74k stars 3.79k forks

Add `_ab_cdc_txid` to CDC output #28174

Open gpind opened 11 months ago

gpind commented 11 months ago

What area does the feature impact?

Connectors

Relevant Information

For audit logs, it's essential to associate transaction metadata with CDC events. For example, an application might write user IDs or request IDs to a `transaction_metadata` table every time it opens a transaction, with the transaction's ID as the primary key. Then, if CDC output included a field like `_ab_cdc_txid`, we could tie that metadata to any CDC event.
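As a rough sketch of that join, assuming the proposed `_ab_cdc_txid` field existed on each synced record: the record shapes, the `attach_tx_metadata` helper, and the sample values below are all hypothetical, and the `transaction_metadata` table is modeled as a plain dict keyed by transaction ID.

```python
from typing import Any, Optional


def attach_tx_metadata(
    cdc_events: list[dict[str, Any]],
    tx_metadata: dict[int, dict[str, Any]],
) -> list[dict[str, Any]]:
    """Attach transaction metadata to each CDC event by transaction ID.

    `_ab_cdc_txid` is the field proposed in this issue; it does not exist yet.
    """
    enriched = []
    for event in cdc_events:
        txid: Optional[int] = event.get("_ab_cdc_txid")
        # None when the application wrote no metadata row for this transaction.
        meta = tx_metadata.get(txid) if txid is not None else None
        enriched.append({**event, "tx_metadata": meta})
    return enriched


# Hypothetical CDC records as they might land in the destination.
cdc_events = [
    {"id": 1, "name": "alice", "_ab_cdc_txid": 501},
    {"id": 2, "name": "bob", "_ab_cdc_txid": 502},
]

# Hypothetical rows of the application's transaction_metadata table.
tx_metadata = {501: {"user_id": "u-17", "request_id": "req-a1"}}

audit_log = attach_tx_metadata(cdc_events, tx_metadata)
```

In a warehouse this would more likely be a SQL join on the transaction ID; the point is only that the join key has to be present in the CDC output.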

I don't know which sources have a concept of a "transaction ID", but Postgres does at a minimum (Debezium exposes it as txId; see here).
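To illustrate where the value would come from on Postgres: Debezium change events carry a `source` block that includes `txId`. The sketch below is only an assumption about how a connector could surface it; the `add_txid` helper is hypothetical, the event is trimmed to a few fields, and this is not how the Airbyte connector is actually structured.

```python
from typing import Any


def add_txid(debezium_event: dict[str, Any]) -> dict[str, Any]:
    """Copy the row image and add the proposed `_ab_cdc_txid` field.

    Hypothetical helper: reads `txId` from the Debezium `source` block.
    """
    source = debezium_event.get("source") or {}
    # `after` is null for delete events, so fall back to an empty row.
    record = dict(debezium_event.get("after") or {})
    record["_ab_cdc_txid"] = source.get("txId")
    return record


# Trimmed example of a Debezium Postgres change event (values are made up).
event = {
    "before": None,
    "after": {"id": 1, "name": "alice"},
    "source": {"connector": "postgresql", "txId": 556, "lsn": 24023128},
    "op": "c",
}

record = add_txid(event)
```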

gpind commented 11 months ago

Thoughts? I can start on a PR if people agree it's a good idea, though I could also use a hint about where to start.

gpind commented 10 months ago

Ping. This blocks us from using Airbyte for CDC.