Closed cliu587 closed 8 years ago
@cliu587 LGTM
Can you figure out what is going on with the build, though?
According to https://travis-ci.org/coursera/dataduct/branches, the develop branch build is broken, and the failures for this build are the same as those on develop. I will take a look at them this weekend.
@cliu587 @sb2nov In relation to the above ticket, was the issue resolved? I am trying to split the extract_rds TSV file into n parts. I have tried changing the hardcoded variable split in extract_rds and adding split = n to the config file as an additional parameter in the ETL section, but when I view the s3node0&1 output, both folders contain only one file. What is the correct way to split files for the create-load-redshift function?
When outputting the result of an RDS query to S3, it is often useful to split the output into equal-sized files. For example, loading into Redshift is much more efficient when the input consists of equal-sized files, as many as there are slices. To support this, we add a splits parameter to create-load-redshift that allows the output of the extract-rds step to be split.
PTAL @sb2nov
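To illustrate the idea of splitting an extract into roughly equal parts, here is a minimal Python sketch. It is not dataduct's implementation: the helper names split_rows and split_tsv_file, the part_NNNN.tsv naming, and the choice to balance by row count rather than byte size are all assumptions made for illustration. It also reads the whole file into memory, which is only reasonable for small extracts.

```python
import os


def split_rows(rows, n_parts):
    """Split rows into n_parts contiguous chunks whose sizes differ by at most one.

    Hypothetical helper for illustration; not part of dataduct.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(rows), n_parts)
    chunks = []
    start = 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional row each.
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def split_tsv_file(path, n_parts, out_dir):
    """Write the rows of a TSV file into n_parts files under out_dir.

    Assumes the file has no header row and fits in memory.
    Returns the list of output paths in order.
    """
    with open(path) as source:
        rows = source.readlines()
    out_paths = []
    for i, chunk in enumerate(split_rows(rows, n_parts)):
        out_path = os.path.join(out_dir, "part_{:04d}.tsv".format(i))
        with open(out_path, "w") as out:
            out.writelines(chunk)
        out_paths.append(out_path)
    return out_paths
```

For example, split_tsv_file("extract.tsv", 4, "parts") would write four files into an existing parts directory, one per Redshift slice on a hypothetical four-slice cluster. Balancing by row count only approximates equal file sizes when rows have similar widths.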