CloudWatch Alarms and Scaling Policies exposed to end user #35

Open mootpt opened 6 years ago

mootpt commented 6 years ago

I began creating some Alarms, scaling policies, and what have you to ensure CircleCI workers scaled based on the threshold end users provided, but later stumbled across:

It wasn't clear whether CircleCI Enterprise creates any Alarms under the hood or simply publishes metrics in a particular namespace for monitoring health. It doesn't explicitly state that alarms are created, so I assume it's just the metrics (e.g. ContainersAvailable).

All that said, it might be worth adding some Alarms and scaling policies to the repo with a simple conditional for turning them on and off. I would also suggest exposing the threshold for those Alarms as a variable to the end user. The same could be exposed for the Nomad cluster as well.

Something like:

resource "aws_cloudwatch_metric_alarm" "workers_out" {
  count               = "${var.enable_cw ? 1 : 0}"
  alarm_name          = "workers-scaling-out-alarm"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = "2"
  metric_name         = "ContainersAvailable"
  namespace           = "CircleCIEnterprise"
  period              = "300"
  statistic           = "Average"
  threshold           = "${var.worker_so_threshold}"

  dimensions {
    QueueName = "${}"
  }

  alarm_description = "This metric monitors the number of containers available on the workers"
  alarm_actions     = ["${aws_autoscaling_policy.workers_out.arn}"]
}

resource "aws_autoscaling_policy" "workers_out" {
  count                  = "${var.enable_cw ? 1 : 0}"
  name                   = "workers-scaling-policy"
  scaling_adjustment     = 1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = "${}"
}
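
The resources above reference var.enable_cw and var.worker_so_threshold without declaring them. A minimal sketch of what those declarations might look like follows, assuming the Terraform 0.11-style syntax used in the snippet. The descriptions and the "false" default are illustrative assumptions, not something taken from the repo.

```hcl
# Hypothetical declarations for the two variables used by the alarm and policy.

# Toggle for the CloudWatch alarm and scaling policy. In 0.11-style configs a
# boolean is commonly passed as the string "true" or "false".
variable "enable_cw" {
  description = "Set to \"true\" to create the CloudWatch alarm and scaling policy"
  default     = "false"
}

# Threshold for the scale-out alarm. No default is given here, so the end user
# has to pick a value that fits their workload.
variable "worker_so_threshold" {
  description = "Scale out when the average ContainersAvailable falls below this value"
}
```

With the toggle left at "false", the count expressions evaluate to 0 and neither resource is created.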
Twang130 commented 6 years ago

