meltano / sdk

Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
Apache License 2.0

feat: make utilities for creating record batches from `Stream` available to `Sink` classes #1026

Closed: kgpayne closed 2 weeks ago

kgpayne commented 1 year ago

Feature scope

Targets (data type handling, batching, SQL object generation, etc.)

Description

Currently, the utilities for creating batch files from lists of records are only available on Stream classes (and their descendants). While implementing target-snowflake, I wanted to override bulk_insert_records (as is expected for database-specific optimisations) so that it uses the same mechanism as process_batch_files to bulk load Snowflake via an internal stage. However, bulk_insert_records receives a list of record payloads, whereas process_batch_files expects a list of file URIs. Therefore, to create the necessary URIs, I reached for the helper methods on the Stream class that were implemented to support the creation of BATCH messages in the Tap.

I propose we:

  1. Refactor to make the utilities for serialising records into files reusable in both Taps and Targets
  2. Modify bulk_insert_records on the Sink class to use process_batch_files (or vice versa) so that developers need only implement one bulk insert method, which is used regardless of whether the Target receives records or batches
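
To make the proposal concrete, here is a rough sketch of how the two steps could fit together. It does not use the SDK: `write_batch_files` and `SketchSink` are hypothetical names, the gzipped JSONL file format is an assumption, and the method signatures are simplified and do not match the SDK's real `bulk_insert_records` or `process_batch_files`.

```python
import gzip
import json
from pathlib import Path
from typing import Any, Dict, Iterable, Iterator, List
from uuid import uuid4

Record = Dict[str, Any]


def chunked(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield successive lists of at most `size` records."""
    chunk: List[Record] = []
    for record in records:
        chunk.append(record)
        if len(chunk) >= size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def write_batch_files(
    records: Iterable[Record], root: str, prefix: str, batch_size: int = 10000
) -> List[str]:
    """Hypothetical shared helper: serialise records to gzipped JSONL files.

    Returns one file URI per written file. Because it depends on neither
    Stream nor Sink, both Taps and Targets could call it.
    """
    root_path = Path(root)
    root_path.mkdir(parents=True, exist_ok=True)
    uris: List[str] = []
    for chunk in chunked(records, batch_size):
        path = root_path / f"{prefix}-{uuid4()}.jsonl.gz"
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            for record in chunk:
                handle.write(json.dumps(record) + "\n")
        uris.append(path.resolve().as_uri())
    return uris


class SketchSink:
    """Stand-in for a Sink subclass (assumption: not the SDK's actual class)."""

    def __init__(self, stream_name: str, batch_root: str, batch_size: int = 10000):
        self.stream_name = stream_name
        self.batch_root = batch_root
        self.batch_size = batch_size
        self.loaded: List[str] = []

    def process_batch_files(self, files: List[str]) -> None:
        """The single bulk-load path a developer would implement.

        A real target would stage and load each file here; this sketch
        only remembers the URIs it was given.
        """
        self.loaded.extend(files)

    def bulk_insert_records(self, records: List[Record]) -> int:
        """Convert records to files, then reuse the batch-file path."""
        files = write_batch_files(
            records, self.batch_root, self.stream_name, self.batch_size
        )
        self.process_batch_files(files)
        return len(records)
```

With this shape, a target developer would implement only `process_batch_files`, and records arriving as RECORD messages would flow through the same load path as BATCH messages.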

stale[bot] commented 1 year ago

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

stale[bot] commented 1 month ago

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.