#65278 · Open · opened 1 year ago
I think the ideal Postgres producer here would read from a logical replication stream, and you could make it consistent by only advancing the flush LSN once the corresponding messages have actually been processed downstream. Maybe a good option here would be a new pglogrep producer that uses pgoutput to grab changesets for specific publications?
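A minimal sketch of that idea in Go, assuming the jackc/pglogrepl and pgx v5 libraries (not necessarily what this project uses): the slot name, publication name, connection string, and the `deliverDownstream` helper are all hypothetical. The point is that the standby status update only ever reports an LSN that the downstream has acknowledged, so an unacked batch is re-delivered from the slot after a restart.

```go
package main

import (
	"context"
	"log"

	"github.com/jackc/pglogrepl"
	"github.com/jackc/pgx/v5/pgconn"
	"github.com/jackc/pgx/v5/pgproto3"
)

// deliverDownstream is a placeholder for handing a pgoutput payload to the
// rest of the pipeline; it should block until the downstream has durably
// processed the data.
func deliverDownstream(walData []byte) error {
	return nil
}

func main() {
	ctx := context.Background()

	// "replication=database" switches the connection into logical replication mode.
	conn, err := pgconn.Connect(ctx, "postgres://user:pass@localhost/db?replication=database")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	sys, err := pglogrepl.IdentifySystem(ctx, conn)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical slot and publication names.
	if err := pglogrepl.StartReplication(ctx, conn, "my_slot", sys.XLogPos,
		pglogrepl.StartReplicationOptions{
			PluginArgs: []string{"proto_version '1'", "publication_names 'my_pub'"},
		}); err != nil {
		log.Fatal(err)
	}

	// flushedLSN is only advanced after a downstream ack, so anything not yet
	// acknowledged stays in the slot and is re-delivered after a crash.
	var flushedLSN pglogrepl.LSN

	reportFlush := func() {
		err := pglogrepl.SendStandbyStatusUpdate(ctx, conn,
			pglogrepl.StandbyStatusUpdate{WALWritePosition: flushedLSN})
		if err != nil {
			log.Fatal(err)
		}
	}

	for {
		msg, err := conn.ReceiveMessage(ctx)
		if err != nil {
			log.Fatal(err)
		}
		cd, ok := msg.(*pgproto3.CopyData)
		if !ok {
			continue
		}
		switch cd.Data[0] {
		case pglogrepl.XLogDataByteID:
			xld, err := pglogrepl.ParseXLogData(cd.Data[1:])
			if err != nil {
				log.Fatal(err)
			}
			if err := deliverDownstream(xld.WALData); err != nil {
				log.Fatal(err) // not acked: do NOT advance flushedLSN
			}
			flushedLSN = xld.WALStart + pglogrepl.LSN(len(xld.WALData))
			reportFlush()
		case pglogrepl.PrimaryKeepaliveMessageByteID:
			ka, err := pglogrepl.ParsePrimaryKeepaliveMessage(cd.Data[1:])
			if err != nil {
				log.Fatal(err)
			}
			if ka.ReplyRequested {
				reportFlush() // re-report only the last acked position
			}
		}
	}
}
```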
The Kafka producer has a strong consistency mode (a callback which commits offsets, coupled with strong semantics on offset commits). Unfortunately, the Postgres producer has no consistency guarantees at all: if COPY fails, every message in the batch is discarded, yet the commit callback has already run. We therefore need to defer the callbacks until the COPY transaction has actually committed. On top of this, there is no check or recovery action when COPY fails, so at the moment every message in the batch is simply lost.
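A rough sketch of the deferred-callback idea in Go, again assuming pgx v5 rather than this project's actual internals; the `Message` type, `Ack` callback, table name, and columns are all hypothetical. The batch is COPYed inside a transaction, and the callbacks (e.g. the Kafka offset commit) only run after `Commit` succeeds; on any failure nothing is acked, so the upstream can re-deliver the batch.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/jackc/pgx/v5"
)

// Message is a hypothetical batched message with an ack callback that, for a
// Kafka input, would commit the consumer offset. The callback must not run
// until the rows are durably committed in Postgres.
type Message struct {
	Row []any
	Ack func() error
}

// flushBatch COPYs a batch inside a transaction and only invokes the ack
// callbacks after Commit succeeds.
func flushBatch(ctx context.Context, conn *pgx.Conn, msgs []Message) error {
	tx, err := conn.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx) // no-op once Commit has succeeded

	rows := make([][]any, 0, len(msgs))
	for _, m := range msgs {
		rows = append(rows, m.Row)
	}

	// Hypothetical target table and columns.
	if _, err := tx.CopyFrom(ctx, pgx.Identifier{"events"},
		[]string{"id", "payload"}, pgx.CopyFromRows(rows)); err != nil {
		return fmt.Errorf("copy failed, batch will be retried: %w", err)
	}
	if err := tx.Commit(ctx); err != nil {
		return fmt.Errorf("commit failed, batch will be retried: %w", err)
	}

	// Only now is it safe to run the deferred callbacks (e.g. commit offsets).
	for _, m := range msgs {
		if err := m.Ack(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://user:pass@localhost/db")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	batch := []Message{{Row: []any{1, "hello"}, Ack: func() error { return nil }}}
	if err := flushBatch(ctx, conn, batch); err != nil {
		log.Fatal(err)
	}
}
```

Deferring the acks this way trades exactly-once for at-least-once: a failed COPY means the batch is re-delivered and may be written twice, which is still strictly better than silently dropping it.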