Open warhammerkid opened 8 years ago
Here's a Gist of the Packer script I'm using: https://gist.github.com/warhammerkid/35a49f29d15d87765349
Some other steps require dataduct to be installed as well. The best solution is to create a custom AMI, since that speeds up instance startup. Another option is to add a bootstrap step with all the commands needed to install it, for example:
bootstrap:
  ec2:
  - step_type: transform
    input_node: []
    command: sudo yum update -y;sudo yum install -y gcc gcc-c++ mysql56-devel MySQL-python27 postgresql94-devel graphviz python-devel s3cmd;sudo pip install dataduct;aws s3 cp s3://my_backet/config/dataduct.cfg ~/.dataduct/dataduct.cfg
    no_output: true
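If the one-line command gets hard to maintain, the same commands could be kept in a small script instead. This is only a rough sketch: the package list and S3 path are copied from the example above, while the `run` helper, the `DRY_RUN` and `RUN_BOOTSTRAP` variables, and the `mkdir -p` line are my own additions for illustration. Unlike the `;`-separated command, it stops at the first failing step.

```shell
#!/bin/bash
# Sketch of a bootstrap script using the same commands as the bootstrap step above.
set -euo pipefail

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

bootstrap_dataduct() {
  run sudo yum update -y
  run sudo yum install -y gcc gcc-c++ mysql56-devel MySQL-python27 \
    postgresql94-devel graphviz python-devel s3cmd
  run sudo pip install dataduct
  # Assumption (not in the original command): make sure the config directory exists.
  run mkdir -p "$HOME/.dataduct"
  run aws s3 cp s3://my_backet/config/dataduct.cfg "$HOME/.dataduct/dataduct.cfg"
}

# Only run when explicitly requested, so the file can be sourced safely.
if [ "${RUN_BOOTSTRAP:-0}" = "1" ]; then
  bootstrap_dataduct
fi
```

Running it with `DRY_RUN=1 RUN_BOOTSTRAP=1` prints the commands without executing them, which is a cheap way to check the list before baking it into an AMI or a bootstrap step.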
On a side note, only version 0.4.0 of dataduct is available on PyPI.
The create-load-redshift step requires that the EC2 instance has dataduct installed and configs synced from S3; however, there is no documentation anywhere detailing this necessity. For my purposes I have created a simple Packer script to build an AMI with the necessary dependencies. A tiny config file needs to be created and placed at .dataduct/dataduct.cfg so that sync_from_s3 will actually run. Then you can simply put something like the following in your config file:
It would be nice if this was all done automatically, but at a bare minimum it would help to have some documentation pointing people in the right direction.