google / fhir-data-pipes

A collection of tools for extracting FHIR resources and analytics services on top of that data.
https://google.github.io/fhir-data-pipes/
Apache License 2.0

added an ID index and delete old view rows #919

Status: Closed (bashir2 closed 6 months ago)

bashir2 commented 6 months ago

Description of what I changed

Adds an index on the id column (and makes that column mandatory), and deletes existing records with the same resource IDs before adding the new ones to the DB. This is required for #916 to be able to update old view rows.
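
For readers unfamiliar with the pattern, here is a minimal sketch of "index the id column, then delete-before-insert". It is not the code from this PR: the use of SQLite, the table name `observation_flat`, the `status` column, and the helper `replace_view_rows` are all assumptions made for illustration.

```python
import sqlite3


def replace_view_rows(conn, rows):
    """Delete rows whose resource ID appears in `rows`, then insert `rows`.

    Hypothetical helper: table and column names are illustrative only.
    """
    ids = [(row["id"],) for row in rows]
    values = [(row["id"], row["status"]) for row in rows]
    with conn:  # run the delete and the insert in a single transaction
        # The index on id keeps this lookup cheap on large tables.
        conn.executemany("DELETE FROM observation_flat WHERE id = ?", ids)
        conn.executemany(
            "INSERT INTO observation_flat (id, status) VALUES (?, ?)", values
        )


conn = sqlite3.connect(":memory:")
# The id column is mandatory (NOT NULL), mirroring the PR description.
conn.execute("CREATE TABLE observation_flat (id TEXT NOT NULL, status TEXT)")
# Assumption: a non-unique index, in case a view yields several rows per resource.
conn.execute("CREATE INDEX observation_flat_id_idx ON observation_flat (id)")

# First run writes two resources; a later run rewrites only obs-1.
replace_view_rows(conn, [
    {"id": "obs-1", "status": "preliminary"},
    {"id": "obs-2", "status": "final"},
])
replace_view_rows(conn, [{"id": "obs-1", "status": "final"}])
```

After the second call the table holds one row per resource ID, with `obs-1` carrying the updated status and `obs-2` untouched. Deleting by ID rather than relying on a unique-key upsert is what allows old view rows to be replaced even when a single resource maps to multiple rows.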

E2E test

TESTED:

Ran the pipeline on both small and large datasets (~16M observations). The performance impact of the index and deletion appears to be about 3% or less across the scenarios tested.

Checklist: I completed these to help reviewers :)