Summary

Currently, the pattern adopted for upserting data in etldb/main.py is to: see the following sections for examples.

A more common pattern would be to select the distinct set of values you want to insert using the pandas DataFrame.drop_duplicates() method and then iterate through the deduplicated data, inserting each row or, preferably, performing a bulk upsert.
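A rough sketch of that approach, assuming a PostgreSQL target, a psycopg2 connection, and a hypothetical users table keyed on user_id (none of these names come from the actual etldb schema): the deduplication happens in a single pass over the DataFrame and the upsert in a single statement.

```python
import pandas as pd
from psycopg2.extras import execute_values


def bulk_upsert_users(df: pd.DataFrame, conn) -> None:
    """Deduplicate in one pass, then upsert every row in a single statement."""
    # Single pass over the source DataFrame to get the distinct rows to load.
    unique_rows = df[["user_id", "user_name"]].drop_duplicates()

    with conn.cursor() as cur:
        # execute_values expands the VALUES list so all rows go in one round trip.
        execute_values(
            cur,
            """
            INSERT INTO users (user_id, user_name)
            VALUES %s
            ON CONFLICT (user_id) DO UPDATE
                SET user_name = EXCLUDED.user_name
            """,
            list(unique_rows.itertuples(index=False, name=None)),
        )
    conn.commit()
```

psycopg2 is only one option here; SQLAlchemy or a staging table loaded with DataFrame.to_sql would work just as well. The point is a single deduplication pass feeding a single bulk statement.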
In addition to requiring multiple passes through the dataset, with $O(n \cdot k)$ complexity (where $n$ is the size of the source DataFrame and $k$ is the number of unique IDs for a given table), versus a single pass to retrieve all of the data we want to insert, the current pattern also makes it harder to adopt bulk upserts.
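To make the cost difference concrete, here is a hypothetical illustration (not the actual etldb/main.py code, and the user_id/user_name columns are invented): the first helper filters the full DataFrame once per unique ID, roughly n rows times k IDs, while the second extracts the same rows in a single pass.

```python
import pandas as pd


def per_id_passes(df: pd.DataFrame) -> list:
    """Roughly O(n * k): one filtering pass over all n rows per unique ID."""
    rows = []
    for user_id in df["user_id"].unique():       # k unique IDs
        match = df[df["user_id"] == user_id]     # each filter scans all n rows
        rows.append((user_id, match["user_name"].iloc[0]))
    return rows


def single_pass(df: pd.DataFrame) -> list:
    """Roughly O(n): deduplicate once and keep everything we need."""
    unique_rows = df[["user_id", "user_name"]].drop_duplicates(subset="user_id")
    return list(unique_rows.itertuples(index=False, name=None))
```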
Bulk transactions are helpful because they allow us to combine DML into transaction blocks more easily and prevent instances in which tables are only partially updated during a batch data load process.
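A sketch of that idea under the same assumptions as above (psycopg2 against PostgreSQL, placeholder users and orders tables): both bulk statements run inside one transaction, so a failure rolls back everything instead of leaving one table updated and the other not.

```python
from psycopg2.extras import execute_values


def load_batch(conn, user_rows: list, order_rows: list) -> None:
    """Upsert related tables inside one transaction: all-or-nothing."""
    try:
        with conn.cursor() as cur:
            execute_values(
                cur,
                "INSERT INTO users (user_id, user_name) VALUES %s "
                "ON CONFLICT (user_id) DO UPDATE SET user_name = EXCLUDED.user_name",
                user_rows,
            )
            execute_values(
                cur,
                "INSERT INTO orders (order_id, user_id, total) VALUES %s "
                "ON CONFLICT (order_id) DO NOTHING",
                order_rows,
            )
        conn.commit()    # both tables become visible together
    except Exception:
        conn.rollback()  # neither table is left partially updated
        raise
```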
Acceptance criteria
[ ] We've reduced the number of passes through the DataFrame needed to extract the unique set of values to insert into the database.
[ ] The data is loaded using bulk DML statements rather than separate insert/update statements for each row of data.