zalando-nakadi / nakadi-producer-spring-boot-starter

Nakadi event producer as a Spring boot starter
MIT License

Integrate with Fabric Event Streams #170

Open ePaul opened 1 year ago

ePaul commented 1 year ago

Background: Fabric Event Streams (FES) is a Zalando-internal change data capture solution (internal link). In its original and most common configuration, it reads a PostgreSQL database's write-ahead log (e.g. of an outbox table) and sends the DB changes out as Nakadi events, which gives an alternative implementation of the outbox pattern implemented by this library. FES is configured via a custom K8s resource (with some extensibility allowed via AWS lambdas).

Idea: Allow using this library's EventLogWriter interface to create the eventlog entries, but then use FES to send them out (skipping the whole scheduler + Nakadi interface).

Details:

Benefits

ePaul commented 1 year ago

To investigate:

ePaul commented 1 year ago

A discussion in the (Zalando-internal) FES chat room showed two options:

a. no change in FES:

b. minimal changes to Nakadi-producer:

In both cases we'll need to think carefully about the roll-out sequence, especially when changing from an existing setup.

ePaul commented 1 year ago

Trying to implement this now (variant (b)).

I mostly got the FES part done, and disabling the sending out is also easy (we already have a property for that).

The surprisingly difficult part is the "do a delete after the insert": our repository.delete() methods use the id, but the persist methods don't return it. This is caused by the way persisting is implemented – JDBC's update and batchUpdate can only return a number of affected rows.

I guess we could try to use query (and INSERT INTO ... RETURNING id or similar), but that doesn't support batching (except for DB2's JDBC driver, it seems).

Maybe we can do the INSERT and DELETE immediately in the same statement?

WITH ids AS (
    INSERT INTO nakadi_producer_eventlog (....)
    VALUES (...)
    RETURNING id
)
DELETE FROM nakadi_producer_eventlog AS el
USING ids
WHERE ids.id = el.id

... if that works (i.e. if this actually inserts and immediately deletes a row, leaving two WAL entries but nothing in the table), this could certainly be batched.

ePaul commented 1 year ago

The idea mentioned above doesn't work – the DELETE doesn't see the effects of the INSERT, so it doesn't delete anything:

The sub-statements in WITH are executed concurrently with each other and with the main query. Therefore, when using data-modifying statements in WITH, the order in which the specified updates actually happen is unpredictable. All the statements are executed with the same snapshot (see Chapter 13), so they cannot “see” one another's effects on the target tables. This alleviates the effects of the unpredictability of the actual order of row updates, and means that RETURNING data is the only way to communicate changes between different WITH sub-statements and the main query.

(From the PostgreSQL documentation, section 7.8.4, Data-Modifying Statements in WITH; emphasis mine.)

Here is a list of other options to look at:

ePaul commented 1 year ago

In #173 + #174 I'm exploring how the "manual batching" approach might work. At least in a test it seems to properly delete the items immediately. I'll need to plug this together to see how it works end-to-end.
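For readers following along, here is a rough sketch of what "manual batching" could look like in plain JDBC. It is not taken from #173 or #174 and may differ from what those PRs do. The idea is to build one multi-row INSERT ... RETURNING id statement for the whole batch, read the ids back, and then issue a separate DELETE for those ids; since the DELETE is its own statement, it sees the inserted rows. The class name, method names, and column handling are hypothetical, and running an INSERT ... RETURNING through executeQuery assumes the PostgreSQL JDBC driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not part of the library.
class ManualBatchSketch {

    // Table name as used in the SQL earlier in this thread.
    static final String TABLE = "nakadi_producer_eventlog";

    // Builds "INSERT INTO t (a, b) VALUES (?, ?), (?, ?) RETURNING id" for rowCount rows.
    static String buildInsertSql(List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        String row = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + TABLE + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, row))
                + " RETURNING id";
    }

    // Builds "DELETE FROM t WHERE id IN (?, ?, ...)" for idCount ids.
    static String buildDeleteSql(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        return "DELETE FROM " + TABLE + " WHERE id IN ("
                + String.join(", ", Collections.nCopies(idCount, "?")) + ")";
    }

    // Inserts all rows with one statement, then deletes them with a second one.
    // The caller is assumed to manage the transaction on the given connection.
    static void insertAndDelete(Connection conn, List<String> columns, List<Object[]> rows)
            throws SQLException {
        List<Object> ids = new ArrayList<>();
        try (PreparedStatement insert = conn.prepareStatement(buildInsertSql(columns, rows.size()))) {
            int index = 1;
            for (Object[] row : rows) {
                if (row.length != columns.size()) {
                    throw new IllegalArgumentException("row length does not match column count");
                }
                for (Object value : row) {
                    insert.setObject(index++, value);
                }
            }
            // Assumption: the driver returns the RETURNING rows as a ResultSet.
            try (ResultSet rs = insert.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getObject(1));
                }
            }
        }
        try (PreparedStatement delete = conn.prepareStatement(buildDeleteSql(ids.size()))) {
            for (int i = 0; i < ids.size(); i++) {
                delete.setObject(i + 1, ids.get(i));
            }
            delete.executeUpdate();
        }
    }
}
```

One trade-off of this shape is that the SQL text changes with the batch size, so very large batches would probably need to be split into chunks to stay within the driver's parameter limit.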