cityindex-attic / logsearch

[unmaintained] A development environment for ELK
Apache License 2.0

Analyze/Implement Auto Scaling for remaining tiers #149

Closed sopel closed 10 years ago

sopel commented 11 years ago

This is an initial umbrella issue (i.e. it should probably be split into multiple issues once a strategy is in place) - Auto Scaling and EC2 Spot Instance usage has been seeded via #39 and #148, which addressed the most obvious, but also the simplest tier.

To recap, there are three main goals with Auto Scaling:

  1. improve availability via health checks ('keep at N'), i.e. simply ensure that the running instances (possibly just a single one) are healthy and automatically replace any that are not - in addition, this eases vertical scaling (which already works w/o Auto Scaling though)
  2. improve performance via horizontal scaling, i.e. scale up and down on schedule or on demand (be it automatically or by simply adjusting the Auto Scaling capacity manually)
  3. decrease cost via EC2 Spot Instance usage, which is only available within CloudFormation via Auto Scaling

All three apply to all our tiers one way or another, but in quite different ways and with different priorities, so each tier needs to be analyzed accordingly.
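To make the three goals above concrete, here is a minimal sketch of how they map onto a CloudFormation template, generated from Python for brevity. The resource names `TierLaunchConfig` and `TierGroup`, the AMI id, and the instance type are placeholders and not taken from this project; only the general shape is meant to carry over.

```python
import json


def build_template(min_size=1, max_size=1, spot_price=None):
    """Return a minimal CloudFormation template dict for one auto-scaled tier.

    Hypothetical helper for illustration only.
    """
    launch_props = {
        "ImageId": "ami-00000000",   # placeholder AMI id (assumption)
        "InstanceType": "m1.small",  # placeholder instance type (assumption)
    }
    if spot_price is not None:
        # Goal 3: a SpotPrice on the launch configuration makes the group
        # request Spot Instances up to that bid instead of On-Demand ones.
        launch_props["SpotPrice"] = str(spot_price)

    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "TierLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": launch_props,
            },
            "TierGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                    "LaunchConfigurationName": {"Ref": "TierLaunchConfig"},
                    # Goal 1: MinSize == MaxSize pins the group at N
                    # instances; unhealthy ones get replaced.
                    # Goal 2: MaxSize > MinSize leaves room to scale out
                    # (scaling policies/schedules are not shown here).
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                    "HealthCheckType": "EC2",
                },
            },
        },
    }


if __name__ == "__main__":
    # 'Keep at 1' on Spot Instances with an illustrative bid.
    print(json.dumps(build_template(1, 1, spot_price=0.05), indent=2))
```

Calling `build_template(1, 1)` covers the pure availability case, while something like `build_template(2, 6)` would leave headroom for horizontal scaling once policies are attached.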

sopel commented 11 years ago

Moved to Icebox due to low priority and conflicting schedules over the next 2 months.

sopel commented 11 years ago

:exclamation: While AWS has just added Redis support to ElastiCache (see #169), it supports neither Cache Node Auto Discovery nor adding/removing nodes to/from a cluster yet, see Adding or Removing Cache Nodes:

Note: At this time, you can only add or remove cache nodes from cache clusters running Memcached.

So this only addresses resiliency at this point, but doesn't help much with (auto) scaling, insofar as one needs to replace the entire cluster in that case (which is still possible via Creating a Redis Snapshot and then Seeding a New Cache Cluster With a Redis Snapshot though).

sopel commented 10 years ago

Unfrozen due to increasing demand and thus the desire to further improve resiliency and reduce cost.

sopel commented 10 years ago

This has been a topic of https://github.com/cityindex/logsearch-config/issues/56 - here are some challenges and paraphrased quotes from the discussion as a foundation for extracting further dedicated issues (please correct/amend as you see fit):

Challenges

  1. @dpb587 mentions the challenge of handling fixed IP addresses with Auto Scaling (currently used/required for inbound log shipping, BTF access and in-cluster communication)
  2. @sopel adds the similar issue of handling fixed EBS volumes (currently used/required for Redis/Elasticsearch persistence)
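One common way to approach both challenges is to have each freshly launched instance re-claim its fixed resources at boot. The sketch below only builds the corresponding AWS CLI invocations; it assumes the fixed addresses are Elastic IPs identified by an allocation id and that the EBS volume lives in the same Availability Zone as the instance. The helper name and all ids are hypothetical, and this is not necessarily the approach that was chosen for the project.

```python
def reattach_commands(instance_id, allocation_id=None, volume_id=None,
                      device="/dev/sdf"):
    """Build AWS CLI commands that re-attach fixed resources to an instance.

    Hypothetical helper: it returns argument lists and runs nothing itself.
    """
    commands = []
    if allocation_id:
        # Fixed IP: move the Elastic IP to the replacement instance.
        commands.append([
            "aws", "ec2", "associate-address",
            "--instance-id", instance_id,
            "--allocation-id", allocation_id,
        ])
    if volume_id:
        # Fixed EBS volume: only works if the previous instance has released
        # it and the volume is in the same Availability Zone (assumption).
        commands.append([
            "aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", instance_id,
            "--device", device,
        ])
    return commands


if __name__ == "__main__":
    # All ids below are made up for illustration.
    for cmd in reattach_commands("i-0123456789abcdef0",
                                 allocation_id="eipalloc-12345678",
                                 volume_id="vol-12345678"):
        print(" ".join(cmd))
```

Because a given Elastic IP or EBS volume can be held by only one instance at a time, this pattern fits 'keep at N' groups with one fixed resource set per instance better than freely scaling groups.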

Discussion

sopel commented 10 years ago

We can continue the generic discussion here, but I've extracted the various tiers to separate issues for separation of concerns:

sopel commented 10 years ago

Closed as Incomplete in favor of the extracted issues (see preceding comment).