This issue confuses me. It seems like two issues:
In terms of those two ideas...
This can indeed be a bit confusing (which is probably compounded by my strong preference for handling everything with Auto Scaling - that's why I mention it everywhere, despite it not being mandatory for every discussed subject ;) - the following is based on information/deductions from Updating AWS CloudFormation Stacks:
I still managed to miss the point of your question, I think, which highlights my admittedly confusing specs in the issue description - a corrective proposal:
I was thinking about splitting up the CloudFormation templates into decomposed components. For example, put the different nodes into their own template with their own WaitCondition, Instance, Alarm.
```
node-common::parameters {
    AvailabilityZone = "us-east-1",
    ClusterName,
    KeyName,
    InstanceProfile = "",
    InstanceType,
    SecurityGroup = "default",
}
```
* node-es-ebs-default
* node-es-ephemeral-default
* node-broker-default
The node templates could be loaded via CloudFormation on their own (tweaked as an Auto Scaling Group/Node whatever), or they could be nested within a smarter, composite template equivalent to the current ppe-cluster template.
* secgrp-multi
* node-es-ebs-default
* node-es-ephemeral-default
* node-broker-default
* index-lag-alarm
* queue-size-alarm
* r53-zone
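To make the first option concrete, here's a rough sketch (using the AWS CLI; bucket, stack, and parameter names are hypothetical and merely mirror the node-common parameters above) of launching one decomposed node template on its own - the composite variant would instead reference the same sub-template URLs from `AWS::CloudFormation::Stack` resources in a parent template:

```bash
# Sketch only - template file, bucket, stack and parameter names are hypothetical.
# Upload the decomposed sub-template so it is reachable via an S3 URL ...
aws s3 cp node-es-ebs-default.template s3://my-template-bucket/node-es-ebs-default.template

# ... then launch it standalone, passing the node-common style parameters.
aws cloudformation create-stack \
  --stack-name es-ebs-node-test \
  --template-url https://s3.amazonaws.com/my-template-bucket/node-es-ebs-default.template \
  --parameters \
    ParameterKey=ClusterName,ParameterValue=ppe-cluster \
    ParameterKey=KeyName,ParameterValue=my-keypair \
    ParameterKey=InstanceType,ParameterValue=m1.large \
    ParameterKey=SecurityGroup,ParameterValue=default
```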
@sopel, what do you think about the approach?
Absolutely, I'd like to explore this path as well - it should make things more manageable and, ideally, more reusable (providing reusable components has been one goal of the - currently somewhat stalled - StackFormation project).
Operating the various templates manually is certainly an option too, but I think it would be preferable (if not required) to retain the 'build in one click' functionality. While this could also be done via a script, that would forgo some of the dependency management (and, as of today, parallel processing) capabilities of CloudFormation. Therefore I'd appreciate an attempt at composite template usage (though, as mentioned, I lack experience as to whether this is feasible/useful for complex composition scenarios).
To be clear about that, I'm certainly open to operating parts of the solution separately; as discussed, not everything is necessarily tied to a particular stack deployment - for example, one might want to deploy canonical sub-domains and/or Elastic IP addresses separately and only associate them with a specific deployment at runtime to allow hot environment switching. I'm mainly aiming for a concerted deployment of the strongly connected tiers.
Considered Done due to goal 1. (CloudFormation stack updates) being available as such.
I have a question about this... I was in the process of testing this and a lot of things didn't go as I expected. I...
1) created and waited for a new CloudFormation stack based off ci-r53,
2) started a logstash task from my local machine to continuously push logs,
3) verified logs were going through,
4) initiated an Update Stack (via the web console) and changed `ElasticsearchEbs0InstanceType` from the default `m1.large` to `c1.medium`.
Behavior confused me when... `/mnt/app-data` was missing, causing normal startup to fail with the missing symlink; I was expecting a nice, easy stack update... but it wasn't so. Sad.
@sopel, can you tell me any more, in case I'm missing some knowledge or understanding? Or, if you have any free time, try the process yourself and see if it matches up.
Actually, I'm foolish and have to recant much of that. I think I touched the `InstancePostScript` parameter, which affects the metadata, which I presume would cause restarts. Don't waste time on this; I'll try and test this again more cleanly tomorrow. Whatever the case, there are already some additional changes that this prompts.
@dpb587 - either way, your questions highlight that this isn't actually done yet, thanks for pointing that out (slight misunderstanding regarding you setting this to Done and me only doing partial tests rather than the obvious one you attempted now); while I expect some of the points you mention to be caused by touching the `InstancePostScript` indeed, others, like the instance IP addresses mismatching with `/app/.env`, are probably systemic and might require integration of on-instance update handling via `cfn-hup` hooks, which is one of the more complex aspects hinted at above.
Whether we should actually do the latter (vs. just documenting how to manually adjust things in case) depends on your findings tomorrow, which I'll await accordingly before touching this again ;)
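For reference, a minimal sketch of what such on-instance update handling could look like, written as provisioning shell - the stack/region variables, the `ElasticsearchEbs0Instance` resource name and the `reconfigure.sh` script are hypothetical placeholders:

```bash
# Sketch only: have cfn-hup poll for stack metadata changes and re-run
# configuration after an update. STACK_NAME/AWS_REGION are assumed to be
# provided by the provisioning environment.
mkdir -p /etc/cfn/hooks.d

cat > /etc/cfn/cfn-hup.conf <<EOF
[main]
stack=${STACK_NAME}
region=${AWS_REGION}
interval=5
EOF

cat > /etc/cfn/hooks.d/instance-reconfigure.conf <<EOF
[instance-reconfigure]
triggers=post.update
path=Resources.ElasticsearchEbs0Instance.Metadata
action=/app/bin/reconfigure.sh
runas=root
EOF

# restart cfn-hup so the new hook is picked up (init mechanism may differ per AMI)
/etc/init.d/cfn-hup restart
```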
Logstash DNS Caching Issue: https://logstash.jira.com/browse/LOGSTASH-760
@dpb587 - nice find, I wasn't aware of this JVM default, Jordan Sissel's comment is very much to the point:
> Can you try disabling the jvm's dns caching? The default is to cache forever, which is a horrible tragedy of a default.
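For reference, a sketch (not verified against this deployment) of two ops-side ways to cap the JVM's DNS cache - the first assumes the logstash service wrapper forwards `JAVA_OPTS` to the JVM, and the `java.security` path is a placeholder that varies by JVM packaging:

```bash
# Sketch only - assumes an Oracle/OpenJDK-style JVM.

# Option 1: Sun-specific system property, passed at JVM launch
# (only works if the service script forwards JAVA_OPTS to java).
export JAVA_OPTS="$JAVA_OPTS -Dsun.net.inetaddr.ttl=60"

# Option 2: the standard security property, set JVM-wide by appending to the
# JRE's java.security file (exact path depends on the JVM installation).
echo 'networkaddress.cache.ttl=60' >> /path/to/jre/lib/security/java.security
```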
I've been able to scale up es-ebs, memory-wise, and it works automatically. Frontend/backend realize the new endpoints within two minutes without manual intervention and log events continue flowing.
When scaling down elasticsearch nodes, memory-wise, they start having problems - for example, going from an `m1.large` to a `c1.medium`. It seems to be related to the hard-coded `ES_HEAP_SIZE` in `/app/.env`, which is written during provisioning based on the instance type mapping. By changing it manually and restarting the service, it comes back up without a problem. The following is the error log message:
```
Error occurred during initialization of VM
Could not reserve enough space for object heap
```
When scaling es-ephemeral, it always runs into issues due to `/mnt/app-data` being non-existent. That directory is created during the initial provisioning, but for a scaled/new instance it's no longer present.
When scaling the broker, it works automatically. As long as the local tunnel continues to re-attempt connection to the DNS name, it picks up the change within two minutes and the local shipper re-attempts failed messages.
So, the known problems are now:

* `ES_HEAP_SIZE` is written during provisioning so it's configurable. I propose that instead of writing that config, our logsearch Rakefile dynamically calculates and exports it (based on ~45% of installed memory) if it's not already defined (rough sketch below).
* `$APP_DATA_DIR` needs to be checked, creating the directory if it doesn't exist and accounting for both directories and symlinks.

@dpb587 - thanks for the detailed tests/summary, an interesting experience - your proposed refactorings/workarounds sound good to me!
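For illustration, a rough shell sketch of the two adjustments proposed above (the real change would presumably live in the logsearch Rakefile / startup scripts):

```bash
# Sketch only - the actual logic would live in the Rakefile / startup scripts.

# 1) Derive ES_HEAP_SIZE (~45% of installed memory, in MB) unless already set.
if [ -z "$ES_HEAP_SIZE" ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  export ES_HEAP_SIZE="$(( total_kb * 45 / 100 / 1024 ))m"
fi

# 2) Ensure $APP_DATA_DIR exists, handling both plain directories and symlinks
#    (e.g. a symlink pointing at /mnt/app-data on ephemeral storage).
if [ -L "$APP_DATA_DIR" ]; then
  mkdir -p "$(readlink -f "$APP_DATA_DIR")"
elif [ ! -d "$APP_DATA_DIR" ]; then
  mkdir -p "$APP_DATA_DIR"
fi
```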
@dpb587 - with regard to the `networkaddress.cache.ttl` issue and my comment 3924156, OpenJDK does indeed seem to default to caching for 30 seconds as long as a security manager is not set, at least as per the (fairly random) diff of changeset 5543 in OpenJDK 7:
```diff
+# The Java-level namelookup cache policy for successful lookups:
+#
+# any negative value: caching forever
+# any positive value: the number of seconds to cache an address for
+# zero: do not cache
+#
+# default value is forever (FOREVER). For security reasons, this
+# caching is made forever when a security manager is set. When a security
+# manager is not set, the default behavior in this implementation
+# is to cache for 30 seconds.
+#
+# NOTE: setting this to anything other than the default value can have
+#       serious security implications. Do not set it unless
+#       you are sure you are not exposed to DNS spoofing attack.
+#
+#networkaddress.cache.ttl=-1
```
The encountered and documented behavior would thus imply that a security manager is in place, which I doubt? Or are the former tests too disparate/inconclusive to deduce this?
I created a ci-r53 stack, shipping local logs to it once complete. I ran the following sequence of stack updates, waiting between each step to ensure everything came back online:
1) `ElasticsearchEphemeral0InstanceType`: `m1.large` → `c1.medium`
2) `Broker0InstanceType`: `c1.medium` → `m1.large`
3) `ElasticsearchEbs0InstanceType`: `m1.large` → `m1.xlarge`
4) `ElasticsearchEphemeral0InstanceType`: `c1.medium` → `m1.medium`
5) `Broker0InstanceType`: `m1.large` → `m1.small`
6) `ElasticsearchEbs0InstanceType`: `m1.xlarge` → `c1.medium`
All components recovered automatically and without errors. Depending on instance types (i.e. slower instances took slightly longer), it averaged about 150 seconds of "downtime" (i.e. real-time data not streaming) until everything was streaming through in a timely manner. It's great when things work happily.
@sopel, very interesting findings. I re-tested my earlier remarks by reverting the TTL commit on the broker of the stack I was just using, restarting the `app-logstash_redis` service, and initiating a DNS-changing update. It seemed to work. I can only suppose that I must not have pushed my change in 57df1f8 to S3 by the time I started that test stack (which has higher DNS TTLs than I was waiting for... but I'm pretty sure I waited ~15 minutes before assuming the failure). If I don't think of an alternative explanation by morning, I'll revert ac5a634 since I can't seem to reproduce it.
@dpb587 - that's great news, thanks for the thorough tests! This means we can now scale vertically on demand, which is a major achievement already :)
Let's see whether we can extend that to automatic horizontal scaling as well via #39 - due to the stack complexity there are probably some subtleties involved ...
This details #115 and depends on #116 (and probably #39) - it should be possible to update the cluster with a single command by means of updating the stack via the existing CFN template, e.g. to adjust instance sizes or replace unhealthy instances.
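For example, a single-command resize could look roughly like this (the stack name is hypothetical, the parameter names are the ones used in this thread, and every parameter that isn't being changed has to be carried over, e.g. via UsePreviousValue):

```bash
# Sketch only: resize the EBS-backed Elasticsearch node by updating the stack
# in place, reusing the existing template and all other parameter values.
aws cloudformation update-stack \
  --stack-name my-logsearch-cluster \
  --use-previous-template \
  --parameters \
    ParameterKey=ElasticsearchEbs0InstanceType,ParameterValue=m1.xlarge \
    ParameterKey=ElasticsearchEphemeral0InstanceType,UsePreviousValue=true \
    ParameterKey=Broker0InstanceType,UsePreviousValue=true
```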