Description of what I changed
Adds an index on the id columns (and makes those columns mandatory) and deletes existing records with the same resource IDs before adding the new ones to the DB. This is required for #916 to be able to update old view rows.
E2E test
TESTED:
Ran the pipeline for both small and large datasets (~16M observations). The impact of the index/deletion on performance seems to be ~3% or less (depending on various scenarios tested).
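For reviewers who want a quick picture of the delete-before-insert behavior described above, here is a minimal sketch. It is not the code in this PR: it uses Python's built-in sqlite3 with an in-memory database, and the table name `observation`, the column `payload`, the index name, and the helper `replace_resources` are all hypothetical names chosen for illustration.

```python
import sqlite3


def replace_resources(conn, table, rows):
    """Delete rows whose id matches an incoming row, then insert the new rows.

    `table` is assumed to be a trusted, internally generated name
    (it is interpolated into the SQL text, not bound as a parameter).
    """
    ids = [(row[0],) for row in rows]
    with conn:  # single transaction: the delete and insert commit together
        conn.executemany(f"DELETE FROM {table} WHERE id = ?", ids)
        conn.executemany(f"INSERT INTO {table} (id, payload) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
# The id column is mandatory (NOT NULL) and indexed so the delete stays cheap.
conn.execute("CREATE TABLE observation (id TEXT NOT NULL, payload TEXT)")
conn.execute("CREATE INDEX observation_id_idx ON observation (id)")

replace_resources(conn, "observation", [("obs-1", "v1"), ("obs-2", "v1")])
# A later run sees obs-1 again; the old row is removed before the new one is added.
replace_resources(conn, "observation", [("obs-1", "v2")])

rows = conn.execute("SELECT id, payload FROM observation ORDER BY id").fetchall()
```

Without the index, each delete would need a full table scan, which is presumably why the index and the deletion are introduced together.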
Checklist: I completed these to help reviewers :)
[x] I have read and will follow the review process.
[x] I am familiar with Google Style Guides for the language I have coded in.
No? Please take some time and review Java and Python style guides.
[x] My IDE is configured to follow the Google code styles.
No? Unsure? -> configure your IDE.
[ ] I have added tests to cover my changes. (If you refactored existing code that was well tested you do not have to add tests)
[x] I ran mvn clean package right before creating this pull request and added all formatting changes to my commit.
[x] All new and existing tests passed.
[x] My pull request is based on the latest changes of the master branch.
No? Unsure? -> execute command git pull --rebase upstream master