kotfic closed this issue 8 years ago
To test, copy the following into `dev/vagrant.local.yml`:
```yaml
domain: "cluster.dev"
ansible:
  verbose: "v"
  plays:
    - playbook: "playbooks/hadoop-hdfs/site.yml"
nodes:
  head:
    memory: 8192
    cpus: 2
    roles:
      - namenodes
      - datanodes
  data-01:
    memory: 8192
    cpus: 2
    roles:
      - datanodes
```
Adjust `memory` and `cpus` as needed, then test with the AWS keys exported in your environment, e.g.:

```sh
export AWS_ACCESS_KEY_ID=FOO
export AWS_SECRET_ACCESS_KEY=BAR
```
`/opt/hadoop/2.7.1/etc/hadoop/core-site.xml` should then contain properties that define the access key and secret key.
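For reference, a minimal sketch of what those properties typically look like for the s3n filesystem (the exact property names depend on which Hadoop S3 connector is used; the values here are placeholders):

```xml
<!-- Illustrative only: standard s3n credential properties in core-site.xml -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>FOO</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>BAR</value>
</property>
```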
+1 I like this idea!
Just FYI, you can access S3 using `s3n://KEY_ID:SECRET_KEY@BUCKET_PATH`. But I agree that configuring it would be nice.
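For example, that URI form can be passed directly to the CLI (the bucket and path below are made up):

```sh
# Credentials embedded directly in the s3n URI
bin/hadoop fs -ls s3n://KEY_ID:SECRET_KEY@my-bucket/data/
```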
On the other hand, I never liked the idea of storing your AWS credentials in plain text on your own system, much less on an EC2 instance. Once there, you've given up the fight for physical security of your credentials. I don't think this behavior should be the default.
Might I suggest that we hold off on this idea until we can get an `aws-credentials` role put together that can actually parse the config file (#24)? Then the hdfs roles can pull that in as a dependency and implement the change you propose here. And for security, I'd recommend defaulting to a no-op if a user of the hdfs roles provides no AWS profile.
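A minimal sketch of that no-op default, assuming a hypothetical `aws_profile` variable and template name (neither exists in the repo yet):

```yaml
# Hypothetical hdfs role task: only render the S3 credential properties
# when the user explicitly supplies an AWS profile; otherwise do nothing.
- name: Render core-site.xml with S3 credentials
  template:
    src: core-site.xml.j2
    dest: /opt/hadoop/2.7.1/etc/hadoop/core-site.xml
  when: aws_profile is defined
```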
This sounds reasonable to me; in the meantime I will use the `s3n://KEY_ID:SECRET_KEY@BUCKET_PATH` format.
Add AWS creds to the Hadoop HDFS `core-site.xml` file if they are defined in the environment that runs Ansible. This is necessary for accessing objects stored on S3 with `bin/hadoop`.
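A sketch of how that could look, assuming the keys are read from the controlling environment with Ansible's `env` lookup and skipped entirely when they are not exported (variable and template names are illustrative):

```yaml
# Hypothetical tasks: pull the keys from the environment that runs ansible
# and only add the S3 credential properties when both are actually set.
- set_fact:
    aws_access_key_id: "{{ lookup('env', 'AWS_ACCESS_KEY_ID') }}"
    aws_secret_access_key: "{{ lookup('env', 'AWS_SECRET_ACCESS_KEY') }}"

- name: Add S3 credential properties to core-site.xml
  template:
    src: core-site.xml.j2
    dest: /opt/hadoop/2.7.1/etc/hadoop/core-site.xml
  when: aws_access_key_id != '' and aws_secret_access_key != ''
```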