Summary of changes
splits our elasticsearch cluster across two availability zones (for better redundancy)
splits our elasticsearch cluster across "daytime" and "fulltime" nodes (for potential elasticity and scaling)
add a logsearch elasticsearch node attribute to be used for replica allocations
better support elasticsearch mode by auto-scaling group (master vs data vs client vs any)
disable master capabilities on the auto-scaling data nodes; only the frontend can now be the master
update rake deploy_aws_cloudformation_stack to require the service name instead of hard-coding it in configs
add aws-cloudformation-cluster-resize script for potentially scaling daytime nodes
bugfix: improve the fabric file to respect Environment/Service/Name tags
./bin/aws-cloudformation-cluster-resize
The script currently works for scaling up and down, but it isn't hooked up to a cron job or anything yet. I'll add more discussion notes on our Tuesday issue. It takes the stack name, the stack parameter to change, the target value for that parameter, the logsearch node attribute value, and the replica count.
For example, the following invocation...
$ ./bin/aws-cloudformation-cluster-resize ElasticsearchDaytimeGroup GroupDesiredCapacity 2 daytime_elastic 2
Will do the following...
look up the ElasticsearchDaytimeGroup stack (based on the environment and service info)
compare the stack's current GroupDesiredCapacity parameter to work out whether it'll be scaling up or down to reach 2 nodes
query the cluster to look up the running nodes (using the logsearch elasticsearch node attribute to figure out whether daytime_elastic nodes are coming or going)
use the second 2 to update the cluster to indicate there should be two replicas of every shard (one for each AZ and one on the daytime nodes)
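To make the first few steps concrete, here is a minimal Python sketch of the direction check and the node classification. It is not the script's actual code: the function names, the shape of the `nodes` dictionary, and the `logsearch` attribute key are assumptions made for illustration. Only the `daytime_elastic` value and the "active" / "will be terminated" labels come from the description and logs in this PR.

```python
def scaling_direction(current, target):
    """Compare the stack's current capacity with the requested capacity."""
    if target > current:
        return "up"
    if target < current:
        return "down"
    return "none"


def classify_nodes(nodes, attribute_value, direction, attribute_key="logsearch"):
    """Label every running node as 'active' or 'will be terminated'.

    `nodes` maps a node id to a dict with an "attributes" dict. That shape and
    the default attribute key are assumptions, not the real script's format.
    """
    labels = {}
    for node_id, info in nodes.items():
        tagged = info.get("attributes", {}).get(attribute_key) == attribute_value
        if tagged and direction == "down":
            # Nodes carrying the attribute go away when the group shrinks.
            labels[node_id] = "will be terminated"
        else:
            labels[node_id] = "active"
    return labels


# Hypothetical node data: one untagged node and one daytime node.
example_nodes = {
    "node-a": {"attributes": {}},
    "node-b": {"attributes": {"logsearch": "daytime_elastic"}},
}
direction = scaling_direction(current=2, target=0)
print(direction, classify_nodes(example_nodes, "daytime_elastic", direction))
```

When scaling up, the tagged nodes don't exist yet, so everything found at discovery time is simply reported as active.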
And the output and details look like...
2014-02-03T22:08:32 + validating cloudformation references...
2014-02-03T22:08:34 > RootStack: dev-logsearch-dpb587-test1 (2014-02-04T01:09:29.876Z)
2014-02-03T22:08:35 > Target: arn:aws:cloudformation:eu-west-1:123456789012:stack/dev-logsearch-dpb587-test1-ElasticsearchDaytimeGroup-0EA284E003BA/15f91154-1460-4d2c-82bd-ceefec5065a1
2014-02-03T22:08:38 > GroupDesiredCapacity: 0
2014-02-03T22:08:38 - validated cloudformation references
2014-02-03T22:08:38 = we will be scaling up
2014-02-03T22:08:39 + discovering nodes...
2014-02-03T22:08:39 > node 8xjZzm88S_O33n_OXqYqFw (10.236.89.176) is active
2014-02-03T22:08:39 > node UoskahN2T7uUtM993k_FCg (10.64.202.67) is active
2014-02-03T22:08:39 - discovered nodes
2014-02-03T22:08:39 + disabling allocations...
2014-02-03T22:08:39 - disabled allocations
2014-02-03T22:08:39 + updating replication requirements...
2014-02-03T22:08:40 - updated replication requirements
2014-02-03T22:08:40 + updating stack...
2014-02-03T22:08:45 > {"StackId":"arn:aws:cloudformation:eu-west-1:123456789012:stack/dev-logsearch-dpb587-test1-ElasticsearchDaytimeGroup-0EA284E003BA/15f91154-1460-4d2c-82bd-ceefec5065a1"}
2014-02-03T22:08:45 - updated stack
2014-02-03T22:08:45 + nodes are not yet ready...
2014-02-03T22:14:46 > node DZII4pCsRvq6BTvFSA65Mg (10.236.87.214) joined the cluster
2014-02-03T22:15:30 > node F48yg2soQJuX7TBCnh72fA (10.236.254.221) joined the cluster
2014-02-03T22:15:30 - nodes are ready
2014-02-03T22:15:30 + enabling allocations...
2014-02-03T22:15:33 - enabled allocations
2014-02-03T22:15:33 + cluster is not yet "green"...
2014-02-03T22:17:49 - cluster is 'green'
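For reference, the "disabling allocations", "updating replication requirements" and "cluster is green" steps map roughly onto three elasticsearch REST calls. The sketch below only builds the requests and sends nothing. The helper names are hypothetical, and the `cluster.routing.allocation.enable` setting is an assumption about the elasticsearch version; older releases use a different setting name, so the real script may well send something else.

```python
import json


def allocation_request(enabled):
    """Build a cluster settings update that turns shard allocation on or off."""
    value = "all" if enabled else "none"
    body = {"transient": {"cluster.routing.allocation.enable": value}}
    return ("PUT", "/_cluster/settings", json.dumps(body, sort_keys=True))


def replicas_request(count):
    """Build an index settings update that sets the replica count on all indices."""
    body = {"index": {"number_of_replicas": count}}
    return ("PUT", "/_settings", json.dumps(body, sort_keys=True))


def wait_for_green_request(timeout="10m"):
    """Build a health check that blocks until the cluster is green or times out."""
    return ("GET", "/_cluster/health?wait_for_status=green&timeout=" + timeout, None)


# The elasticsearch calls made around the stack update when scaling up.
for request in (allocation_request(False), replicas_request(2),
                allocation_request(True), wait_for_green_request()):
    print(request)
```

Returning plain tuples keeps the sketch independent of any particular HTTP client.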
When scaling down...
$ ./bin/aws-cloudformation-cluster-resize ElasticsearchDaytimeGroup GroupDesiredCapacity 0 daytime_elastic 1
2014-02-03T22:31:36 + validating cloudformation references...
2014-02-03T22:31:38 > RootStack: dev-logsearch-dpb587-test1 (2014-02-04T01:09:29.876Z)
2014-02-03T22:31:40 > Target: arn:aws:cloudformation:eu-west-1:123456789012:stack/dev-logsearch-dpb587-test1-ElasticsearchDaytimeGroup-0EA284E003BA/15f91154-1460-4d2c-82bd-ceefec5065a1
2014-02-03T22:31:43 > GroupDesiredCapacity: 2
2014-02-03T22:31:43 - validated cloudformation references
2014-02-03T22:31:43 = we will be scaling down
2014-02-03T22:31:44 + discovering nodes...
2014-02-03T22:31:44 > node 8xjZzm88S_O33n_OXqYqFw (10.236.89.176) is active
2014-02-03T22:31:44 > node UoskahN2T7uUtM993k_FCg (10.64.202.67) is active
2014-02-03T22:31:44 > node DZII4pCsRvq6BTvFSA65Mg (10.236.87.214) will be terminated
2014-02-03T22:31:44 > node F48yg2soQJuX7TBCnh72fA (10.236.254.221) will be terminated
2014-02-03T22:31:44 - discovered nodes
2014-02-03T22:31:44 + reviewing allocations...
2014-02-03T22:31:45 - reviewed allocations
2014-02-03T22:31:45 + disabling allocations...
2014-02-03T22:31:45 - disabled allocations
2014-02-03T22:31:45 + updating stack...
2014-02-03T22:31:48 > {"StackId":"arn:aws:cloudformation:eu-west-1:123456789012:stack/dev-logsearch-dpb587-test1-ElasticsearchDaytimeGroup-0EA284E003BA/15f91154-1460-4d2c-82bd-ceefec5065a1"}
2014-02-03T22:31:48 - updated stack
2014-02-03T22:31:48 + nodes are not yet ready...
2014-02-03T22:32:23 > node DZII4pCsRvq6BTvFSA65Mg (10.236.87.214) left the cluster
2014-02-03T22:32:34 > node F48yg2soQJuX7TBCnh72fA (10.236.254.221) left the cluster
2014-02-03T22:32:34 - nodes are ready
2014-02-03T22:32:34 + updating replication requirements...
2014-02-03T22:32:37 - updated replication requirements
2014-02-03T22:32:37 + enabling allocations...
2014-02-03T22:32:38 - enabled allocations
2014-02-03T22:32:38 + cluster is not yet 'green'...
2014-02-03T22:32:59 + cluster is 'green'
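Comparing the two logs, the steps run in a different order depending on direction: scaling up raises the replica count before the stack is updated, while scaling down only lowers it after the daytime nodes have left. A small Python sketch of that ordering, read off the logs above (the function and the step names are illustrative, not the script's own):

```python
COMMON_START = ["validate cloudformation references", "discover nodes"]
COMMON_END = ["enable allocations", "wait for green"]


def resize_steps(direction):
    """Return the ordered steps for a resize, as observed in the two logs."""
    if direction == "up":
        middle = [
            "disable allocations",
            "update replication requirements",
            "update stack",
            "wait for nodes to join",
        ]
    elif direction == "down":
        middle = [
            "review allocations",
            "disable allocations",
            "update stack",
            "wait for nodes to leave",
            "update replication requirements",
        ]
    else:
        raise ValueError("direction must be 'up' or 'down'")
    return COMMON_START + middle + COMMON_END


for direction in ("up", "down"):
    print(direction, "->", ", ".join(resize_steps(direction)))
```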