HHS / simpler-grants-gov

https://simpler.grants.gov

[Issue #1978] further optimizations for load chunking #1984

Closed · jamesbursa closed this 1 month ago

jamesbursa commented 1 month ago

Summary

Fixes #1978

Time to review: 5 mins

Changes proposed

Context for reviewers

Instead of using LIMIT to carry out chunking in PostgreSQL, first read the full set of ids, then issue a series of INSERT / UPDATE queries, one per chunk.

This is expected to be faster. With the previous approach, the PostgreSQL optimizer did not choose an ideal plan and performed a full read of all rows and columns from the Oracle database. By splitting the query, the first step reads only the id columns for the new or updated rows, and the follow-up queries then select only the rows that have actually changed. A rough sketch of the two-step approach is shown below.
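The following is a minimal sketch of the two-step chunked load using SQLAlchemy, not the actual implementation in this PR. The table objects, column names (`opportunity_id`, `transformed_at`), and chunk size are illustrative assumptions only.

```python
# Sketch of the two-step chunked load with SQLAlchemy Core.
# Table and column names here are assumptions for illustration,
# not the real models in this repository.
import itertools

from sqlalchemy import insert, select

CHUNK_SIZE = 1000  # illustrative chunk size


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while chunk := list(itertools.islice(it, size)):
        yield chunk


def load_new_rows(session, source_table, dest_table):
    # Step 1: read only the id column for rows that need to be copied,
    # rather than paging through full rows with LIMIT.
    new_ids = (
        session.execute(
            select(source_table.c.opportunity_id).where(
                source_table.c.transformed_at.is_(None)  # hypothetical "changed" marker
            )
        )
        .scalars()
        .all()
    )

    # Step 2: issue a series of INSERT ... SELECT statements, each
    # restricted to one chunk of ids, so only the changed rows are
    # read in full from the source.
    for id_chunk in chunked(new_ids, CHUNK_SIZE):
        session.execute(
            insert(dest_table).from_select(
                list(source_table.columns.keys()),
                select(source_table).where(
                    source_table.c.opportunity_id.in_(id_chunk)
                ),
            )
        )
        session.commit()
```

An UPDATE pass for already-present rows would follow the same pattern: collect the ids of changed rows first, then update them chunk by chunk.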

Additional information

N/A