Closed Ardiea closed 3 months ago
Our first attempt at autoscaling for edx-platform instances was based on metrics collected from the load balancer. This worked in the sense that the rules triggered and new instances were launched, but the ASG got into a state where uptime was flapping because of fluctuations in the number of instances and their overall readiness.
@shaidar when you get a minute can you add any context about the failure mode that we ran into with MITx Online when the ASG scale-up policy triggered?
Closing as no longer relevant.
Revisit autoscaling for edx. Previously, when autoscaling was configured, edx would go into a death spiral of scaling up and down over and over. We still need autoscaling, but it needs to be smarter than the previous implementation.
Sar has load testing code somewhere that we can use to give this a nudge and hopefully trigger the autoscaling.
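As a rough illustration of what "smarter" could mean here, the sketch below simulates two common ways to damp an up/down cycle: a wider gap between the scale-up and scale-down thresholds (hysteresis), and a cooldown after each scaling action. Everything in it is an assumption for illustration: the `ScalingPolicy` class, the threshold values, and the "load per instance" metric are hypothetical and are not taken from the previous implementation or from any AWS API. It also does not model instance readiness, which was part of the original failure.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy: a hysteresis band plus a cooldown between actions."""
    scale_up_at: float = 70.0    # add an instance when load per instance exceeds this
    scale_down_at: float = 30.0  # remove an instance only when load falls below this
    cooldown_ticks: int = 3      # minimum ticks between two scaling actions
    min_instances: int = 2
    max_instances: int = 10


def simulate(total_load, policy, start_instances=2):
    """Return the instance count after each tick of the total_load series."""
    instances = start_instances
    # Start with the cooldown already elapsed so the first tick may act.
    ticks_since_action = policy.cooldown_ticks
    history = []
    for load in total_load:
        per_instance = load / instances
        if ticks_since_action >= policy.cooldown_ticks:
            if per_instance > policy.scale_up_at and instances < policy.max_instances:
                instances += 1
                ticks_since_action = 0
            elif per_instance < policy.scale_down_at and instances > policy.min_instances:
                instances -= 1
                ticks_since_action = 0
        ticks_since_action += 1
        history.append(instances)
    return history


def count_changes(history):
    """Count how many times the instance count changed between ticks."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)


if __name__ == "__main__":
    steady = [150.0] * 10  # constant total load, arbitrary units

    # Narrow band, no cooldown: each scale-up immediately satisfies scale-down.
    narrow = ScalingPolicy(scale_up_at=70.0, scale_down_at=60.0, cooldown_ticks=0)
    print("narrow band:", simulate(steady, narrow))

    # Wide band with cooldown: settles after one scale-up.
    print("wide band:  ", simulate(steady, ScalingPolicy()))
```

With a constant load, the narrow-band policy alternates between two and three instances on every tick, while the wide-band policy scales up once and stays there. Whether the real flapping had the same cause is something the load test should confirm before settling on thresholds.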