tl;dr - I would like to add compression options to load-redshift and s3_node. Relevant AWS documentation: s3-node and RedshiftCopyActivity
We have a use case at my employer where we have to push some fairly large tables (about 500 GB uncompressed) from MySQL => Redshift. I created a custom step (based on extract-rds) that keeps the data compressed throughout the pipeline. However, this required some modifications to both s3-node and load-redshift, and I would like to contribute these options back to the mainline project. PR forthcoming.
Also - I'd be happy to contribute the custom step (I called it ExtractMysqlGzip, for lack of a better term). The only reason I did not create a PR for it is that the custom step is pretty hacky: it has to work around the limitations AWS imposes on S3DataNodes that have compression enabled.