Closed by Limess 4 years ago
This makes a lot of sense, and we'd like to add the same functionality from this PR and #32 to target-snowflake.
Adding min/max chunk sizes makes a lot of sense as well but, as you said, generating files dynamically is a bit tricky. I think defining static slices is still beneficial for now, and later we can think about how to combine it with potential new chunk size parameter(s) and how to detect slices dynamically.
Please review the comments I added to the PR. I'm looking forward to merging it.
This allows the user to configure the number of chunks that each file is split into before loading, which should improve parallel loading.
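To illustrate the idea, here is a minimal sketch of splitting a batch of rows into a configured number of chunks. It is not the code in this PR; the function name `split_into_slices` and the `slices` parameter are hypothetical, and rows are distributed round-robin so the chunks end up roughly equal in size.

```python
from typing import Iterable, List


def split_into_slices(rows: Iterable[str], slices: int) -> List[List[str]]:
    """Distribute rows round-robin across `slices` buckets (hypothetical helper)."""
    if slices < 1:
        raise ValueError("slices must be >= 1")
    buckets: List[List[str]] = [[] for _ in range(slices)]
    for index, row in enumerate(rows):
        # Row i goes to bucket i mod slices, so bucket sizes differ by at most one row.
        buckets[index % slices].append(row)
    return buckets
```

Each bucket would then be written and uploaded as its own file, so the load can read them in parallel.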
It may also be sensible to add a minimum and a maximum chunk size. The recommended minimum/maximum sizes are 1 MB/256 MB compressed; however, I'm not sure how best to implement this automatically in a quick and sensible way, especially after also adding compression.
Reference: https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-use-multiple-files.html
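One possible approach, sketched below, is to clamp the requested chunk count so that each chunk stays between the minimum and maximum size. This is only a sketch under assumptions: `choose_slice_count` is a hypothetical name, the compressed size is assumed to be estimated by the caller (which is the hard part mentioned above), and chunks are assumed to be split evenly. The 1 MB and 256 MB defaults are the figures quoted above.

```python
import math

MIN_CHUNK_BYTES = 1 * 1024 * 1024      # 1 MB compressed, as quoted above
MAX_CHUNK_BYTES = 256 * 1024 * 1024    # 256 MB compressed, as quoted above


def choose_slice_count(
    estimated_compressed_bytes: int,
    requested_slices: int,
    min_chunk: int = MIN_CHUNK_BYTES,
    max_chunk: int = MAX_CHUNK_BYTES,
) -> int:
    """Clamp the requested slice count to respect the chunk size bounds."""
    if requested_slices < 1:
        raise ValueError("requested_slices must be >= 1")
    if estimated_compressed_bytes <= 0:
        return 1
    # Fewest slices that keep every (evenly sized) slice at or below max_chunk.
    lower = max(1, math.ceil(estimated_compressed_bytes / max_chunk))
    # Most slices that keep every (evenly sized) slice at or above min_chunk.
    upper = max(1, estimated_compressed_bytes // min_chunk)
    # If the bounds conflict, the maximum chunk size wins.
    return min(max(requested_slices, lower), max(upper, lower))
```

For example, with an estimated 10 MB of compressed data and 20 requested chunks, this returns 10, so no chunk falls below 1 MB.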