bernadinm / tf_dcos_core

A Terraform module to install, upgrade, and modify nodes for DC/OS clusters.
Apache License 2.0

How to deal with Mesos attributes #12

Open jbfarez opened 7 years ago

jbfarez commented 7 years ago

Hi there,

First, thanks for this really nice project! I will use it to set up a new cluster, but I'm wondering how to dynamically set Mesos attributes on my slaves (e.g. fetch AWS metadata to add the Region as an attribute).

Has anyone already dealt with the combination of tf_dcos_core and Mesos attributes?

Thanks

bernadinm commented 7 years ago

Hello @jbfarez, thank you for creating this. After speaking with the team that helps with dcos_generate_config.sh, I learned that config.yaml will one day be able to set Mesos attributes. Also, DC/OS 1.11 will have a first-class flag called fault domains, which sets the region/availability zone as a standard flag. This means that tf_dcos_core will have that feature as well.

In the short term, I will look for an alternative solution to help users set other Mesos attributes using Terraform. I will keep this issue open until that solution exists and is available for use in another project/repo.

Thank you again for opening this!

jbfarez commented 7 years ago

Actually, the workaround I've found is to write the attributes to /var/lib/dcos/mesos-slave-common using setup.sh scripts (I made one script per node type: master, agent, public agent).

Here is an example:

#!/bin/sh

# Define vars
privateIP=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)
zone=$(curl http://169.254.169.254/latest/meta-data/placement/availability-zone)
instanceId=$(curl http://169.254.169.254/latest/meta-data/instance-id)
instanceType=$(curl http://169.254.169.254/latest/meta-data/instance-type)
instanceAmi=$(curl http://169.254.169.254/latest/meta-data/ami-id)

# Initial routine
sudo systemctl disable locksmithd
sudo systemctl stop locksmithd
sudo systemctl restart docker # Restart Docker to ensure it's ready; it doesn't seem to be on first use.

# Add MESOS_ATTRIBUTES
mesosConfigDir="/var/lib/dcos"
mesosConfigFile="$mesosConfigDir/mesos-slave-common"
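# POSIX test ([ ]) instead of [[ ]], since the script runs under /bin/sh
[ -d "$mesosConfigDir" ] && echo "DC/OS Mesos config directory exists" || sudo mkdir -p "$mesosConfigDir"
# Write through sudo tee: with "sudo cat << EOF > file" the redirection runs as the calling user, not root
sudo tee "$mesosConfigFile" > /dev/null << EOF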
MESOS_ATTRIBUTES=role:default;local-ip:$privateIP;zone:$zone;instance:$instanceId;instance-type:$instanceType;ami:$instanceAmi
EOF
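
Since the original question was about adding the Region as an attribute, here is a minimal sketch of how the script above could be extended. It assumes the standard AWS zone naming scheme, where the region is the availability zone minus its trailing letter (e.g. eu-west-1a gives eu-west-1). The helper names region_from_zone and build_mesos_attributes are hypothetical and not part of tf_dcos_core or DC/OS.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of tf_dcos_core.

# Derive a region from an availability zone by dropping the last character,
# e.g. eu-west-1a -> eu-west-1. Assumes the standard AWS zone naming scheme.
region_from_zone() {
    printf '%s\n' "${1%?}"
}

# Join the given key:value pairs with ';' into a single MESOS_ATTRIBUTES line.
build_mesos_attributes() {
    attrs=""
    for pair in "$@"; do
        if [ -z "$attrs" ]; then
            attrs="$pair"
        else
            attrs="$attrs;$pair"
        fi
    done
    printf 'MESOS_ATTRIBUTES=%s\n' "$attrs"
}
```

With these defined, something like `build_mesos_attributes "zone:$zone" "region:$(region_from_zone "$zone")" | sudo tee "$mesosConfigFile" > /dev/null` would write both the zone and the derived region to the same file as before.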

Hope this workaround helps.

PS: BTW @bernadinm, I've rewritten the TF manifests to support Auto Scaling groups. I'll make it public as a module in a few days. If you're interested in integrating it, you can ping me on the DC/OS community Slack (@jibek).