A full-lifecycle, immutable cloud infrastructure cluster management role, using Ansible.
The redeploy.yml playbook will replace each node in the cluster (via various redeploy schemes), and roll back if any failures occur. clusterverse is designed to manage the base-VM infrastructure that underpins cluster-based infrastructure, for example Couchbase, Kafka, Elasticsearch, or Cassandra.
Contributions are welcome and encouraged. Please see CONTRIBUTING.md for details.
Dependencies are managed via pipenv:
pipenv install
will create a Python virtual environment with the dependencies specified in the Pipfile. To activate the pipenv:
pipenv shell
Alternatively, single commands can be run inside the virtual environment without activating a shell by prefixing them with pipenv run.
cluster_vars[buildenv].aws_access_key:
cluster_vars[buildenv].aws_secret_key:
cluster_vars[buildenv].vpc_name: my-vpc-{{buildenv}}
The subnet name is given as a prefix; the availability zone (a, b, c) is appended to it:
cluster_vars[buildenv].vpc_subnet_name_prefix: my-subnet-{{region}}
cluster_vars[buildenv].key_name: my_key__id_rsa
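For orientation, a minimal sketch of how the keys above sit inside cluster_vars within cluster_vars.yml; the values are placeholders, and 'sandbox' stands in for whichever buildenv you use:
cluster_vars:
  sandbox:
    aws_access_key: "AKIAXXXXXXXXXXXXXXXX"        # ideally an ansible-vault encrypted value
    aws_secret_key: "xxxxxxxxxxxxxxxxxxxxxxxx"    # ideally an ansible-vault encrypted value
    vpc_name: "my-vpc-{{buildenv}}"
    vpc_subnet_name_prefix: "my-subnet-{{region}}"
    key_name: "my_key__id_rsa"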
For GCP, create a service account under IAM & Admin / Service Accounts, and download the JSON key file locally. Store its contents in the cluster_vars[buildenv].gcp_service_account_rawtext variable.
The Python dependencies needed for the GCP modules are installed as part of pipenv install (see above).
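A sketch only (the buildenv key, the vaulted value and the file path are placeholders) of storing the downloaded JSON in the variable:
cluster_vars:
  sandbox:
    # Paste the downloaded JSON as an ansible-vault encrypted block (preferred)...
    gcp_service_account_rawtext: !vault |
      $ANSIBLE_VAULT;1.2;AES256;sandbox
      383564...
    # ...or read it from a local file at runtime:
    # gcp_service_account_rawtext: "{{ lookup('file', '~/gcp-service-account.json') }}"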
DNS is optional. If unset, no DNS names will be created. If DNS is required, you will need a DNS zone delegated to one of the following:
Credentials to the DNS server will also be required. These are specified in the cluster_vars
variable described below.
Clusters are defined as code within Ansible yaml files that are imported at runtime. Because clusters are built from scratch on the localhost, the automatic Ansible group_vars
inclusion cannot work with anything except the special all.yml
group (actual groups
need to be in the inventory, which cannot exist until the cluster is built). The group_vars/all.yml
file is instead used to bootstrap merge_vars (described below).
Clusterverse is designed to be used to deploy the same clusters in multiple clouds and multiple environments, potentially using similar configurations. In order to avoid duplicating configuration (adhering to the DRY principle), a new action plugin has been developed (called merge_vars
) to use in place of the standard include_vars
, which allows users to define the variables hierarchically, and include (and potentially override) those defined before them. This plugin is similar to include_vars
, but when it finds dictionaries that have already been defined, it combines them instead of replacing them.
- merge_vars:
    ignore_missing_files: True
    from: "{{ merge_dict_vars_list }}" #defined in `group_vars/all.yml`
In the case of a fully hierarchical set of cluster definitions, where each directory level is a variable (e.g. cloud (aws or gcp), region (eu-west-1), buildenv (sandbox) and clusterid (test)), the folders may look like:
|-- aws
| |-- eu-west-1
| | |-- sandbox
| | | |-- test
| | | | `-- cluster_vars.yml
| | | `-- cluster_vars.yml
| | `-- cluster_vars.yml
| `-- cluster_vars.yml
|-- gcp
| |-- europe-west1
| | `-- sandbox
| | |-- test
| | | `-- cluster_vars.yml
| | `-- cluster_vars.yml
| `-- cluster_vars.yml
|-- app_vars.yml
`-- cluster_vars.yml
group_vars/all.yml
would contain merge_dict_vars_list
with the files and directories, listed from top to bottom in the order in which they should override their predecessor:
merge_dict_vars_list:
- "./cluster_defs/cluster_vars.yml"
- "./cluster_defs/app_vars.yml"
- "./cluster_defs/{{ cloud_type }}/"
- "./cluster_defs/{{ cloud_type }}/{{ region }}/"
- "./cluster_defs/{{ cloud_type }}/{{ region }}/{{ buildenv }}/"
- "./cluster_defs/{{ cloud_type }}/{{ region }}/{{ buildenv }}/{{ clusterid }}/"
It is also valid to define all the variables in a single sub-directory:
cluster_defs/
|-- test_aws_euw1
| |-- app_vars.yml
| +-- cluster_vars.yml
+-- test_gcp_euw1
|-- app_vars.yml
+-- cluster_vars.yml
In this case, merge_dict_vars_list would contain only the top-level directory (using clusterid as a variable); merge_vars does not recurse through directories.
merge_dict_vars_list:
- "./cluster_defs/{{ clusterid }}"
If merge_dict_vars_list is not defined, it is still possible to put the flat variables in /group_vars/{{clusterid}}, where they will be imported using the standard include_vars plugin. This offers no advantages over the flat merge_vars technique described above, which is the preferred approach.
Credentials can be encrypted inline in the playbooks using ansible-vault. A helper script (.vaultpass-client.py) is provided that returns a password, stored in an environment variable (VAULT_PASSWORD_BUILDENV), to ansible. Setting this variable is mandatory within clusterverse: whenever ansible-vault needs to decrypt sensitive data, the password held in the variable is used. This is particularly useful for running within Jenkins.
export VAULT_PASSWORD_BUILDENV=<'dev/stage/prod' password>
To encrypt a string, ensure .vaultpass-client.py is present and VAULT_PASSWORD_BUILDENV has been set, then run:
ansible-vault encrypt_string --vault-id=sandbox@.vaultpass-client.py --encrypt-vault-id=sandbox
The example here encrypts aws_secret_key. When running the command above, a prompt will appear such as:
ansible-vault encrypt_string --vault-id=sandbox@.vaultpass-client.py --encrypt-vault-id=sandbox
Reading plaintext input from stdin. (ctrl-d to end input)
Type (or paste) the string to encrypt, then press CTRL-D on your keyboard. Sometimes scrambled text such as ^D will appear after pressing the combination; press the same combination again and your encrypted hash will be displayed. Copy this as the value for your string within your cluster_vars.yml or app_vars.yml files. Example below:
aws_secret_key: !vault |-
          $ANSIBLE_VAULT;1.2;AES256;sandbox
          7669080460651349243347331538721104778691266429457726036813912140404310
The !vault |- prefix is compulsory in order for the hash to be successfully decrypted. To verify that a hash decrypts, either run a playbook with VAULT_PASSWORD_BUILDENV set and just debug: msg={{myvar}}, or decrypt it directly:
echo '$ANSIBLE_VAULT;1.2;AES256;sandbox
86338616...33630313034' | ansible-vault decrypt --vault-id=sandbox@.vaultpass-client.py
echo '$ANSIBLE_VAULT;1.2;AES256;sandbox
86338616...33630313034' | ansible-vault decrypt --ask-vault-pass
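The "run a playbook and debug the variable" option above can be as small as the following sketch; it assumes VAULT_PASSWORD_BUILDENV is exported, and the variable name and vaulted value are placeholders:
# check_vault.yml - run with:
#   ansible-playbook check_vault.yml --vault-id=sandbox@.vaultpass-client.py
- hosts: localhost
  connection: local
  gather_facts: false
  vars:
    myvar: !vault |
      $ANSIBLE_VAULT;1.2;AES256;sandbox
      7669080460...
  tasks:
    - debug: msg={{myvar}}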
clusterverse is an Ansible role, and as such must be imported into your <project>/roles directory. There is a full-featured example in the /EXAMPLE subdirectory.
To import the role into your project, create a requirements.yml
file containing:
roles:
  - name: clusterverse
    src: https://github.com/sky-uk/clusterverse
    version: master        ## branch, hash, or tag
If you use a cluster.yml
file similar to the example found in EXAMPLE/cluster.yml, clusterverse will be installed from Ansible Galaxy automatically on each run of the playbook.
To install it manually: ansible-galaxy install -r requirements.yml -p /<project>/roles/
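The automatic installation mentioned above is, in essence, the same command wrapped in a task; a minimal sketch follows (the real EXAMPLE/cluster.yml may structure this differently):
- hosts: localhost
  connection: local
  tasks:
    - name: Install the clusterverse role declared in requirements.yml
      command: ansible-galaxy install -r requirements.yml -p ./roles/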
For full invocation examples and command-line arguments, please see the example README.md.
The role is designed to run in two modes:

1. deploy
- The cluster.yml sub-role immutably deploys a cluster from the config defined above. If it is run again (with no changes to variables), it will do nothing. If the cluster variables are changed (e.g. a host is added), the cluster will reflect the new variables (e.g. a new host will be added to the cluster). Note: it will not remove nodes, nor, usually, will it reflect changes to disk volumes - these are limitations of the underlying cloud modules.

2. redeploy
- The redeploy.yml sub-role will completely redeploy the cluster; this is useful, for example, to upgrade the underlying operating system version.
- It supports canary deploys. The canary extra variable must be defined on the command line and set to one of: start, finish, filter, none or tidy.
- It provides callback hooks:
  - mainclusteryml: the name of the deployment playbook. It is called to deploy nodes for the new cluster, or to roll back a failed deployment. It should be set to the value of the primary deploy playbook yml (e.g. cluster.yml).
  - predeleterole: the name of a role that should be called prior to deleting VMs; it is used, for example, to eject nodes from a Couchbase cluster. It takes a list of hosts_to_remove VMs.
- It supports pluggable redeployment schemes, which differ in how the nodes are replaced:
  - A scheme that, for each node, runs predeleterole, then deletes and recreates the node via the main cluster playbook (the cluster_suffix remains the same). If canary=start, only the first node is redeployed; if canary=finish, only the remaining (non-first) nodes are redeployed; if canary=none, all nodes are redeployed. If canary=filter, you must also pass canary_filter_regex=regex, where regex is a pattern that matches the hostnames of the VMs that you want to target.
  - A scheme in which predeleterole is run on the previous node (the node being replaced) rather than on the new one. The canary and canary_filter_regex behaviour is the same as above.
  - A scheme in which the redeploy runs in two stages: the first stage runs with canary=start or canary=none; the second stage, run with canary=finish or canary=none, calls predeleterole with a list of the old VMs. canary=filter is not supported by this scheme, and an error message will be shown if it is passed.
  - A scheme that runs predeleterole on the node and then shuts it down. If canary=start, only the first node is shut down and replaced; if canary=finish, only the remaining (non-first) nodes are shut down and replaced; if canary=none, all nodes are shut down and replaced. If canary=filter, you must also pass canary_filter_regex=regex, where regex is a pattern that matches the hostnames of the VMs that you want to target.
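As an illustration of the redeploy variables described above, a sketch only (values and the role name are placeholders, and the same variables can equally be passed as individual -e flags on the command line):
# redeploy_extra_vars.yml - pass with: ansible-playbook redeploy.yml -e @redeploy_extra_vars.yml
buildenv: sandbox
clusterid: test
canary: start                         # one of: start, finish, filter, none, tidy
# canary_filter_regex: 'node00[12]'   # only required when canary=filter
mainclusteryml: cluster.yml           # the primary deploy playbook, also used for rollback
predeleterole: my_predelete_role      # hypothetical role that ejects nodes before deletion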