hashicorp / nomad-autoscaler

Nomad Autoscaler brings autoscaling to your Nomad workloads.
Mozilla Public License 2.0

Schedule-based autoscaling #448

Open lgfa29 opened 3 years ago

lgfa29 commented 3 years ago

Autoscaling is usually a reaction to some observed change in workload. But in some scenarios, the workload changes with a well-defined and predictable periodicity. For these types of load, being able to schedule changes preemptively would be very useful.

The schedule-based autoscaling feature will allow operators to control a time window during which policies or individual checks are enabled or disabled. Policies are still evaluated at the interval defined by the evaluation_interval attribute, but when an evaluation falls outside this time window, the policy or check will have no effect.

This will be done using a new block called enabled_schedule that can be placed inside a policy or check block. The new block takes a start cron expression that defines when the enabled time window starts. To define the end of the window, either an end cron expression or a duration string formatted as a Go duration can be passed.

The following examples define the same time window for when this policy is enabled: Mondays through Fridays from midnight to 11:59PM.

enabled_schedule {
  start = "0 0 * * 1"
  end   = "59 23 * * 5"
}

enabled_schedule {
  start    = "0 0 * * 1-5"
  duration = "24h"
}


This approach allows operators to use the strategy that best fits their use case. A policy inside a job file would look as follows:

job "bank" {
  ...
  group "stocks" {
    ...
    scaling {
      ...
      policy {
        # The policy is evaluated every 5s.
        evaluation_interval = "5s"

        # But it will only have an effect on weekdays.
        enabled_schedule {
          start = "0 0 * * 1"
          end   = "59 23 * * 5"
        }

        # Scale to 10 instances from 9:00AM to 4:30PM.
        check "market_hours" {
          enabled_schedule {
            start = "0 9 * * *"
            end   = "30 16 * * *"
          }

          strategy "fixed-value" {
            count = 10
          }
        }

        # Scale to 3 instances from 4:30PM to 9:00AM.
        check "off_market_hours" {
          enabled_schedule {
            start = "30 16 * * *"
            end   = "0 9 * * *"
          }

          strategy "fixed-value" {
            count = 3
          }
        }

        # Scale to 5 instances at midnight for 4h to handle 
        # account reconciliation workload.
        # This check schedule overlaps with the previous check,
        # but it will take precedence since it adds more 
        # instances, and so it's the safest choice.
        check "account_reconcilliation" {
          enabled_schedule {
            start    = "0 0 * * *"
            duration = "4h"
          }

          strategy "fixed-value" {
            count = 5
          }
        }

        # Make sure we meet our weekday SLA.
        check "response_time" {
          source = "prometheus"
          query  = "p99(transaction_time_ms)"

          enabled_schedule {
            start = "0 0 * * 1"
            end   = "59 23 * * 5"
          }

          strategy "target-value" {
            target = 700
          }
        }
      }
    }
  }
}
neomantra commented 3 years ago

You had asked for feedback in #61. I've been using the community cron plugin for a day for a use case similar to your example above. It works well for me so far (after some finagling, since I operate nomad-autoscaler as a Docker image).

While that plugin's cron notation is very compact (period_business = "* * 9-17 * * mon-fri * -> 5"), I prefer to think in terms of the start and end/duration you present here. For example, it is hard to use that one-line notation to express 9:00:00 to 16:30:00.
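
Under the proposed syntax that window falls out directly (this just mirrors the market_hours check from the example above):

enabled_schedule {
  start = "0 9 * * *"
  end   = "30 16 * * *"
}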

Adding clock scheduling as a first-class enabled_schedule feature is much more powerful. As you show, you can then apply different kinds of policies rather than just modifying a count.

One difference in our use case, relative to your example, is that we start and stop the services daily. That would be more difficult to express if we only had start/end intervals (we would need to configure multiple intervals), but start+duration fixes that. Outside that interval, I want the count to be 0, which I think would happen if I configure the group's count = 0?
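
A hypothetical sketch of what I mean, assuming the proposal works as written (the group and check names are placeholders, and I'm assuming the group really does stay at count = 0 while no check is enabled):

group "daily_service" {
  # Baseline of 0 instances outside the scheduled window.
  count = 0

  scaling {
    min = 0
    max = 1

    policy {
      check "business_hours" {
        # Bring the service up at 9:00 on weekdays for 8 hours.
        enabled_schedule {
          start    = "0 9 * * 1-5"
          duration = "8h"
        }

        strategy "fixed-value" {
          count = 1
        }
      }
    }
  }
}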

I note that hashicorp/cronexpr has this warning: "As of now, the behavior of the code is undetermined if a malformed cron expression is supplied." So if this were to become a production plugin, one would expect either that repo to fix that issue or the plugin to validate the cron entries.

When documenting this, be sure to indicate how timezones work in the cron expressions: is it UTC or local time? System cron works on local time. Maybe requiring UTC expressions would simplify working across regions and datacenters.
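
For instance, if the expressions are UTC-only, a 9:00 to 16:30 US Eastern window has to be written with the offset baked in, and it drifts with daylight saving time:

enabled_schedule {
  # 9:00 AM Eastern is 14:00 UTC in winter but 13:00 UTC in summer,
  # so a UTC-only expression can only be exact for part of the year.
  start = "0 14 * * 1-5"
  end   = "30 21 * * 1-5"
}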

lgfa29 commented 3 years ago

Thank you for the feedback @neomantra, this is very helpful. And very good point about the timezones; in general, I would say they should always be UTC-based.

One difference in our use case, relative to your example, is that we start/stop the services daily. That would be more difficult to express if we only had start/end intervals (we would need to configure multiple intervals)

I think this would be covered by these two blocks from the example?

# Scale to 10 instances from 9:00AM to 4:30PM.
check "market_hours" {
  enabled_schedule {
    start = "0 9 * * *"
    end   = "30 16 * * *"
  }

  strategy "fixed-value" {
    count = 10
  }
}

# Scale to 3 instances from 4:30PM to 9:00AM.
check "off_market_hours" {
  enabled_schedule {
    start = "30 16 * * *"
    end   = "0 9 * * *"
  }

  strategy "fixed-value" {
    count = 3
  }
}

Except that you would have count = 0 as you mentioned (though scaling to 0 may cause Nomad to consider the job dead and cause some issues, but I would need to double-check).

neomantra commented 3 years ago

@lgfa29 I see now, I'm not sure what I was thinking then.

How would conflict resolution work? An obvious case would be overlapping start/ends, with one check using fixed-value: count=3 and the other fixed-value: count=5... but it gets more complex with the general strategy stanza. Would the Nomad scheduler just complain and leave the operator to resolve it somehow?

It would be hard to confirm the impact in a pre-flight check beyond the naive "no overlapping start/stop schedules"; maybe do that and provide some override for advanced uses.

lgfa29 commented 3 years ago

How would conflict resolution work?

Conflict resolution would be handled the same way it's currently done, with the safest check being picked. From our docs:

The checks are executed at the same time during a policy evaluation and the results can conflict with each other. In a scenario like this, the autoscaler iterates the results and chooses the safest result, which is the one that retains the most capacity of the resource.
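
Applied to the overlapping checks from the example above, that works out roughly like this:

# During the 00:00 to 04:00 overlap both checks are enabled, and the
# autoscaler keeps the safest (highest capacity) result, so the
# account_reconciliation count of 5 wins over the off_market_hours count of 3.
check "off_market_hours" {
  strategy "fixed-value" {
    count = 3
  }
}

check "account_reconciliation" {
  strategy "fixed-value" {
    count = 5
  }
}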

patademahesh commented 2 years ago

Is this feature implemented or do we have to use the community plugin for this?

schematis commented 2 years ago

Would love to see this as native functionality as well. We currently scale down to 0 at the end of each business day and then back up in the morning, but the autoscaler gets in the way by scaling back up to the target level every time the autoscaling group tries to scale down on its schedule.

lgfa29 commented 1 year ago

Hi @patademahesh 👋

No, this feature has not been implemented yet. I have not used the plugin, so I can't provide any guidance there, but you should give it a try and see if it meets your needs. We initially tried to implement this as a plugin, but we found some use cases that would not be possible to handle that way, so this will require some changes in the Autoscaler core.

@schematis that's the main use case we envision for this, but we haven't had the chance to properly roadmap it yet. If you haven't already, don't forget to add a 👍 to the original message so we can properly gauge interest 🙂

caiodelgadonew commented 9 months ago

Hello @lgfa29, maybe a good addition to this issue would be the ability to restart a job at a specific time.

Eg.:

In trading we need an app (raw_exec) to be restarted precisely at, for example, 09:58 AM. Right now we're doing it with an in-house cron-like scheduler. Using the autoscaling feature described in this issue would instead require scaling down to 0 at 9:57 and scaling back up to 1 at 9:58, which would leave a minute without the app running, which is not good.

Something like:

# Restart app at 9:58 every business day 
schedules {
  restart = "58 9 * * 1-5" # "At 09:58 on every day-of-week from Monday through Friday."
}
lgfa29 commented 9 months ago

Hi @caiodelgadonew 👋

I think job restarts fall outside the scope of the autoscaler. I would also be worried about using the autoscaler, an eventually consistent system, to perform time-sensitive tasks. Even with this proposal we can't really guarantee that your policy will be executed exactly at 9:57, since policies are evaluated on an interval that may not be aligned with the specific time you need.

For scheduled one-shot operations a periodic job may be more appropriate.
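
For example, something along these lines (a sketch only; the job name, script path, and timezone are placeholders):

job "restart-trading-app" {
  type = "batch"

  periodic {
    cron             = "58 9 * * 1-5" # 09:58, Monday through Friday
    time_zone        = "America/New_York"
    prohibit_overlap = true
  }

  group "restart" {
    task "restart" {
      driver = "raw_exec"

      config {
        # Placeholder for whatever performs the in-place restart.
        command = "/usr/local/bin/restart-app.sh"
      }
    }
  }
}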

caiodelgadonew commented 9 months ago

I agree with you, this may not be the topic for here, but something missing in the UI is the ability to "restart all job allocs". It can be done via the CLI, but in the UI we need to go through all the allocs and restart them one by one. I'll check later if there's something for that in the API.

lgfa29 commented 9 months ago

There's no API for that yet because async coordinated alloc restarts are very tricky to implement; that's why the nomad job restart command implements this logic client-side (from your terminal). The PR covers this in more detail: https://github.com/hashicorp/nomad/pull/16278.

If all you need is a simple loop of restarts, you can call /v1/job/:job_id/allocations to list the allocations and then call /v1/client/allocation/:alloc_id/restart on each of them.
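
As a rough sketch (assuming NOMAD_ADDR points at your cluster, jq is installed, and "example" stands in for your job ID):

# List the job's allocations, then restart each running one in place.
for alloc_id in $(curl -s "$NOMAD_ADDR/v1/job/example/allocations" |
    jq -r '.[] | select(.ClientStatus == "running") | .ID'); do
  curl -s -X PUT "$NOMAD_ADDR/v1/client/allocation/$alloc_id/restart"
done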