Closed: aaronsteers closed this issue 2 years ago.
@kgpayne - If this turns out to be difficult, it's totally okay to postpone to a future iteration. It'd be worth a moderate investment, but not worth delaying the batch PRs themselves, if that's helpful.
Presumably you'd seed the exercise by running something like `meltano run tap-stackoverflow-sample target-snowflake` - but if you run into any problems getting sample data loaded, that could be a potential blocker/slowdown here. (If not, no worries; we'll work out any kinks over time.)
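A rough sketch of what that seeding step might look like (it assumes both plugins are already added and configured in the Meltano project; the `time` wrapper is just to get a baseline number and isn't part of the proposal):

```bash
# Assumes tap-stackoverflow-sample and target-snowflake are already added and
# configured in the active Meltano project/environment.
# Timing the initial load gives a rough baseline for later comparisons.
time meltano run tap-stackoverflow-sample target-snowflake
```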
Closing as resolved. At least for now, this should be sufficient: https://github.com/meltano/sdk/discussions/906#discussioncomment-3955177
It would be awesome to create a benchmark for the Snowflake connectors, with and without BATCH, and specifically on datasets that would most benefit from `BATCH` as a high-throughput optimization. Per:
The datasets included are:
I think the specific data I'd love to see for this...
Q: How quickly can we sync any one of the provided sample streams from `tap-snowflake` to `target-snowflake`:

```bash
# 1. Direct pipe, without BATCH
tap-snowflake --config=tap-config.json | target-snowflake --config=target-config.json

# 2. Direct pipe, with BATCH enabled via batch-config.json
tap-snowflake --config=tap-config.json --config=batch-config.json | target-snowflake --config=target-config.json

# 3. Tap and target run separately, via an intermediate file
tap-snowflake --config=tap-config.json > runresults.singer.jsonl
cat runresults.singer.jsonl | target-snowflake --config=target-config.json
```
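For illustration, here's a rough harness for timing the three scenarios above end to end. The `batch-config.json` contents follow the SDK's `batch_config` setting (JSONL + gzip encoding with a storage root and prefix), but the specific paths, the prefix, and the `time` wrappers are placeholders rather than a prescribed setup:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed batch-config.json enabling BATCH messages; the storage root and
# prefix are placeholders and should point at scratch storage for the benchmark.
cat > batch-config.json <<'EOF'
{
  "batch_config": {
    "encoding": { "format": "jsonl", "compression": "gzip" },
    "storage": { "root": "file:///tmp/batches", "prefix": "benchmark-" }
  }
}
EOF

echo "Scenario 1: direct pipe, no BATCH"
time ( tap-snowflake --config=tap-config.json \
       | target-snowflake --config=target-config.json )

echo "Scenario 2: direct pipe, with BATCH"
time ( tap-snowflake --config=tap-config.json --config=batch-config.json \
       | target-snowflake --config=target-config.json )

echo "Scenario 3: tap and target timed separately via an intermediate file"
time ( tap-snowflake --config=tap-config.json > runresults.singer.jsonl )
time ( cat runresults.singer.jsonl | target-snowflake --config=target-config.json )
```

Comparing scenario 2 against scenario 1 isolates the effect of BATCH, and scenario 3 splits tap time from target time so it's clearer which side benefits most.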