getdozer / dozer

Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
https://getdozer.io
GNU Affero General Public License v3.0

Aerospike sink: Add transactionally consistent denormalization #2437

Closed Jesse-Bakker closed 4 months ago

Jesse-Bakker commented 4 months ago

This is based on the aerospike resumability branch.

I added a configuration option, write_denormalized_to, to the Aerospike sink table config. It specifies where the denormalized version of that table is written. All tables are still written in their raw form as well.

Resumability is implemented using a two-phase approach:

1. Write the denormalized tables, then write a txid for the denorm tables.
2. Write the lookup tables, then write a txid for the lookup tables.

If the sink goes down between writing the denorm txid and the lookup txid, then on restart we skip denormalization until we reach the lookup txid.
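The two-phase write and the recovery rule can be illustrated with a small in-memory model. This is only a sketch and not the code in this PR: `Store`, `Tx`, `CrashPoint`, `apply`, and `resume_from` are hypothetical names, the Aerospike namespace is replaced by plain maps, and txids are assumed to increase monotonically. It also assumes one reading of the recovery rule: a replayed transaction skips the denorm phase when the stored denorm txid already covers it.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the sink's target namespace.
#[derive(Debug, Default)]
struct Store {
    denorm: BTreeMap<u64, String>, // denormalized records, keyed by txid for simplicity
    lookup: BTreeMap<u64, String>, // raw lookup records, keyed by txid for simplicity
    denorm_txid: Option<u64>,      // last txid whose denorm writes completed
    lookup_txid: Option<u64>,      // last txid whose lookup writes completed
    denorm_writes: usize,          // counts denorm writes so that skips are observable
}

/// Hypothetical transaction: an id plus a single value.
struct Tx {
    id: u64,
    value: String,
}

/// Where the simulated process dies, if anywhere.
#[derive(Clone, Copy, PartialEq, Eq)]
enum CrashPoint {
    None,
    AfterDenormTxid,
}

/// Apply one transaction in two phases.
fn apply(store: &mut Store, tx: &Tx, crash: CrashPoint) -> Result<(), &'static str> {
    // Phase 1: denormalized tables, then the denorm txid.
    // Skipped when a previous run already completed it for this txid.
    let denorm_done = store.denorm_txid.map_or(false, |t| t >= tx.id);
    if !denorm_done {
        store.denorm.insert(tx.id, format!("denorm:{}", tx.value));
        store.denorm_writes += 1;
        store.denorm_txid = Some(tx.id);
    }

    if crash == CrashPoint::AfterDenormTxid {
        return Err("simulated crash between the two txid writes");
    }

    // Phase 2: lookup tables, then the lookup txid.
    store.lookup.insert(tx.id, tx.value.clone());
    store.lookup_txid = Some(tx.id);
    Ok(())
}

/// The next txid to replay after a restart: the one after the lookup txid.
fn resume_from(store: &Store) -> u64 {
    store.lookup_txid.map_or(0, |t| t + 1)
}
```

In this model, a crash after the denorm txid for transaction 2 leaves `denorm_txid` at 2 and `lookup_txid` at 1. Replay then starts at 2, skips the denorm phase, and only writes the lookup tables, so the denormalized data is not written twice.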