cespare / cron

A Go implementation of the cron scheduling format
MIT License

Support Jenkins-style H (hash) schedules #1

Closed cespare closed 1 year ago

cespare commented 1 year ago

We should support Jenkins-style cron expressions where H may be given instead of a minute, hour, DoM, month, or DoW. H ("hash") means "pick a random value". But it's randomized according to a fixed seed provided by the user, so that a particular kind of job can consistently use the same value. (For instance the seed might be a job name or a database ID.)

Here is the full Jenkins documentation for its own cron syntax:

```
This field follows the syntax of cron (with minor differences). Specifically, each line consists of 5 fields separated by TAB or whitespace:

MINUTE HOUR DOM MONTH DOW

MINUTE  Minutes within the hour (0–59)
HOUR    The hour of the day (0–23)
DOM     The day of the month (1–31)
MONTH   The month (1–12)
DOW     The day of the week (0–7) where 0 and 7 are Sunday.

To specify multiple values for one field, the following operators are available. In the order of precedence,

  * specifies all valid values
  M-N specifies a range of values
  M-N/X or */X steps by intervals of X through the specified range or whole valid range
  A,B,...,Z enumerates multiple values

To allow periodically scheduled tasks to produce even load on the system, the symbol H (for “hash”) should be used wherever possible. For example, using 0 0 * * * for a dozen daily jobs will cause a large spike at midnight. In contrast, using H H * * * would still execute each job once a day, but not all at the same time, better using limited resources.

The H symbol can be used with a range. For example, H H(0-7) * * * means some time between 12:00 AM (midnight) to 7:59 AM. You can also use step intervals with H, with or without ranges.

The H symbol can be thought of as a random value over a range, but it actually is a hash of the job name, not a random function, so that the value remains stable for any given project.

Beware that for the day of month field, short cycles such as */3 or H/3 will not work consistently near the end of most months, due to variable month lengths. For example, */3 will run on the 1st, 4th, …31st days of a long month, then again the next day of the next month. Hashes are always chosen in the 1-28 range, so H/3 will produce a gap between runs of between 3 and 6 days at the end of a month. (Longer cycles will also have inconsistent lengths but the effect may be relatively less noticeable.)

Empty lines and lines that start with # will be ignored as comments.

In addition, @yearly, @annually, @monthly, @weekly, @daily, @midnight, and @hourly are supported as convenient aliases. These use the hash system for automatic balancing. For example, @hourly is the same as H * * * * and could mean at any time during the hour. @midnight actually means some time between 12:00 AM and 2:59 AM.

Examples:

# every fifteen minutes (perhaps at :07, :22, :37, :52)
H/15 * * * *
# every ten minutes in the first half of every hour (three times, perhaps at :04, :14, :24)
H(0-29)/10 * * * *
# once every two hours at 45 minutes past the hour starting at 9:45 AM and finishing at 3:45 PM every weekday.
45 9-16/2 * * 1-5
# once in every two hours slot between 9 AM and 5 PM every weekday (perhaps at 10:38 AM, 12:38 PM, 2:38 PM, 4:38 PM)
H H(9-16)/2 * * 1-5
# once a day on the 1st and 15th of every month except December
H H 1,15 1-11 *
```
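To make the step examples in the Jenkins text concrete, here is a small sketch of one reading that reproduces them: the hash picks a starting offset smaller than the step, and the schedule then fires every step-th value up to the end of the range. `expandHashStep` is a hypothetical helper and the hash values 7 and 4 are hand-picked to match the quoted examples; this is not necessarily how Jenkins implements it, and it is not part of this package.

```go
package main

import "fmt"

// expandHashStep is a hypothetical helper. Given an inclusive range
// [lo, hi], a positive step, and a hash value, it picks a starting offset
// below step and returns every step-th value from there up to hi.
func expandHashStep(lo, hi, step int, hash uint64) []int {
	var out []int
	start := lo + int(hash%uint64(step))
	for v := start; v <= hi; v += step {
		out = append(out, v)
	}
	return out
}

func main() {
	// H/15 in the minute field, with an assumed hash value of 7.
	fmt.Println(expandHashStep(0, 59, 15, 7)) // [7 22 37 52]
	// H(0-29)/10 in the minute field, with an assumed hash value of 4.
	fmt.Println(expandHashStep(0, 29, 10, 4)) // [4 14 24]
}
```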

To be concrete, here is the proposed new API.

// ParseWithHash is like Parse but additionally supports the symbol H in place
// of the minute, hour, day of month, month, or day of week field. Each instance
// of H in the cron expression is replaced by a pseudorandom value (within the
// field's valid range) that is determined by the given seed.
//
// For example, the schedule
//
//  H H * * *
//
// is a schedule that fires once per day at a random hour and minute that is
// chosen when the schedule is parsed. Given the same input expression and seed,
// the same schedule is generated.
//
// The range for randomly generated day of month values is [1, 28].
//
// Additionally, ParseWithHash interprets the named schedules differently from
// Parse:
//
//   - "@monthly" means "H H H * *"
//   - "@weekly" means "H H * * H"
//   - "@daily" means "H H * * *"
//   - "@hourly" means "H * * * *"
//
// The idea of the H symbol is borrowed from Jenkins, though the details are a
// bit different.
func ParseWithHash(expr string, seed uint64) (*Schedule, error)
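As a rough illustration of the behavior the doc comment describes, the sketch below replaces each bare H field with a value drawn from a PRNG seeded by the caller. It is only a sketch under assumptions: `resolveHash` and `hashRanges` are hypothetical names, `math/rand` is just one possible source of seeded values, the day-of-week range of 0-6 is assumed, and named schedules such as "@daily" are not handled. The actual implementation may differ.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// fieldRange is the inclusive range from which an H value is drawn.
type fieldRange struct{ lo, hi int }

// hashRanges lists the ranges for minute, hour, day of month, month, and
// day of week. Day of month stops at 28, as in the proposal; the
// day-of-week range of 0-6 is an assumption.
var hashRanges = [5]fieldRange{{0, 59}, {0, 23}, {1, 28}, {1, 12}, {0, 6}}

// resolveHash is a hypothetical helper: it replaces each bare H field in a
// five-field expression with a value drawn from a PRNG seeded with seed, so
// the same expression and seed always give the same result.
func resolveHash(expr string, seed uint64) (string, error) {
	fields := strings.Fields(expr)
	if len(fields) != 5 {
		return "", fmt.Errorf("expected 5 fields, got %d", len(fields))
	}
	rng := rand.New(rand.NewSource(int64(seed)))
	for i, f := range fields {
		if f != "H" {
			continue // leave ordinary cron fields untouched
		}
		r := hashRanges[i]
		fields[i] = fmt.Sprint(r.lo + rng.Intn(r.hi-r.lo+1))
	}
	return strings.Join(fields, " "), nil
}

func main() {
	for _, seed := range []uint64{1, 2} {
		resolved, err := resolveHash("H H * * *", seed)
		if err != nil {
			panic(err)
		}
		fmt.Printf("seed %d: %s\n", seed, resolved)
	}
}
```

The resolved string is an ordinary five-field expression, so in this sketch it could be handed to the existing Parse unchanged.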

Assorted notes:

PleasingFungus commented 1 year ago

This SGTM. The changes to my proposal - make Parse reject H schedules with a useful message and make ParseWithHash take a uint64 instead of a []byte - are both improvements. (My choice of []byte was the part I was most uncertain about in the original proposal.)
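With a uint64 seed, a caller that identifies jobs by name (or any other []byte key) would hash the key itself before calling the proposed ParseWithHash. A minimal sketch, assuming the 64-bit FNV-1a hash from the standard library is acceptable for this; `seedFromName` is a hypothetical helper, not part of the package:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// seedFromName is a hypothetical helper that maps a job name to a uint64
// seed using the 64-bit FNV-1a hash from the standard library.
func seedFromName(name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(name)) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

func main() {
	// The same name always yields the same seed, so a job keeps its slot.
	fmt.Println(seedFromName("nightly-backup") == seedFromName("nightly-backup"))
	// Different names will almost always yield different seeds.
	fmt.Println(seedFromName("nightly-backup") == seedFromName("weekly-report"))
}
```

A caller with a numeric database ID could skip the hash and pass the ID as the seed directly.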