MeltanoLabs / tap-snowflake

feat: ability to query table and column stats; fix: missing bookmarks for native batch-based sync #13

Closed aaronsteers closed 1 year ago

aaronsteers commented 1 year ago

I took an initial pass at this over the vacation. I don't know if I'll be able to finish it out, but we can at least start the discussions here.

Resolves:

Besides the new "Table and Column Stats" implementation via SQLConnector.get_table_profile(), the meat of the change is in SQLStream._sync_batches(). The proposed implementation queries the max column value before starting the batch unload; once the load finishes, it advances the state to that pre-queried max value.

Function signatures did not need to change, and the whole fix could be implemented in the SDK with no target-specific code beyond what this implementation already does: using the bookmark's value (when it is available) in the WHERE clause of the COPY operation.
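
For discussion, here is a minimal sketch of the bookmark flow described above. It is not the code in this PR: it uses the standard-library sqlite3 module and a plain SELECT in place of Snowflake's COPY, and the names sync_batch, key_column, and the "replication_key_value" state key are hypothetical. The strict `>` comparison against the bookmark and the upper bound at the pre-queried max are also assumptions made for the illustration.

```python
import sqlite3
from typing import Any, Optional


def sync_batch(
    conn: sqlite3.Connection,
    table: str,
    key_column: str,
    state: dict,
) -> list:
    """Unload one batch, then advance the bookmark to the pre-queried max."""
    # Hypothetical state key; a real tap would use its own state layout.
    bookmark: Optional[Any] = state.get("replication_key_value")

    # 1. Query the max key value BEFORE the unload starts.
    max_value = conn.execute(
        f"SELECT MAX({key_column}) FROM {table}"
    ).fetchone()[0]
    if max_value is None:
        return []  # empty table: nothing to unload, state unchanged

    # 2. Unload, using the bookmark (when available) in the WHERE clause.
    #    Capping at max_value is an assumption of this sketch; it keeps
    #    rows that arrive mid-unload for the next run.
    sql = f"SELECT * FROM {table} WHERE {key_column} <= ?"
    params = [max_value]
    if bookmark is not None:
        sql += f" AND {key_column} > ?"
        params.append(bookmark)
    rows = conn.execute(sql + f" ORDER BY {key_column}", params).fetchall()

    # 3. Only after the unload has finished, advance the state.
    state["replication_key_value"] = max_value
    return rows
```

Because the state is written only after the unload returns, a failed unload leaves the previous bookmark in place and the next run retries the same range.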

tayloramurphy commented 1 year ago

@aaronsteers @kgpayne what needs to happen to get this merged?

kgpayne commented 1 year ago

@tayloramurphy final review from @edgarrmondragon would do it. @aaronsteers can't review his own PR, and I made a bunch of changes to get AJ's excellent first pass to fully work 🙂 Will add my review for the sake of GH ✅