PicnicSupermarket / diepvries

The Picnic Data Vault framework.
https://diepvries.picnic.tech
MIT License
126 stars, 15 forks

Exclude more recent records from satellites based on hashkey or driving key #42

Closed (dlouseiro closed 1 year ago)

dlouseiro commented 1 year ago

This PR adapts the code of regular and effectivity satellites so that records in the staging table are ignored when more recent records exist in the target table for the same hashkey (or the same driving key, in the case of effectivity satellites).

The previous version of the code ignores records from the staging table whenever any newer record exists in the target satellite, regardless of whether that record has the same key, which causes issues when processes run in parallel.

The requirement still stands: we do not intend to load "older versions" of records when newer ones already exist in the target satellite. However, this should only apply to records with overlapping keys.
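
To make the difference concrete, here is a minimal in-memory sketch of the two filtering rules. It is not the diepvries implementation (the real logic lives in the SQL that the framework generates); `Record`, `filter_global`, and `filter_per_key` are hypothetical names used only for illustration, and treating equal timestamps as "not older" is an assumption.

```python
from datetime import datetime
from typing import NamedTuple


class Record(NamedTuple):
    key: str            # hashkey, or driving key for an effectivity satellite
    load_dt: datetime   # load timestamp of the record
    payload: str


def filter_global(staging, target):
    """Previous rule (sketch): drop staging records older than ANY target record."""
    if not target:
        return list(staging)
    newest = max(rec.load_dt for rec in target)
    return [rec for rec in staging if rec.load_dt >= newest]


def filter_per_key(staging, target):
    """New rule (sketch): drop a staging record only if the target holds a
    more recent record for the SAME key."""
    latest = {}
    for rec in target:
        if rec.key not in latest or rec.load_dt > latest[rec.key]:
            latest[rec.key] = rec.load_dt
    return [
        rec for rec in staging
        if rec.key not in latest or rec.load_dt >= latest[rec.key]
    ]


# A parallel process has already loaded key "A" with a newer timestamp.
target = [Record("A", datetime(2023, 1, 2), "a2")]
staging = [
    Record("A", datetime(2023, 1, 1), "a1"),  # older version of an existing key
    Record("B", datetime(2023, 1, 1), "b1"),  # unrelated key, not yet in target
]

dropped_everything = filter_global(staging, target)
kept_only_b = filter_per_key(staging, target)
```

With the global rule, the record for key "B" is discarded even though the target has nothing for that key. With the per-key rule, only the stale version of "A" is skipped.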

Example of a capturing process that can cause issues:

dlouseiro commented 1 year ago

Let's update the changelog too ;)

True! Forgot that

dlouseiro commented 1 year ago

Done @matthieucan