agronholm / apscheduler

Task scheduling library for Python
MIT License

Cron trigger support for step values greater than the maximum #509

Open mdz opened 3 years ago

mdz commented 3 years ago

Is your feature request related to a problem? Please describe.

I recently migrated a workload from Vixie cron to APScheduler and discovered that the cron trigger supports only a subset of the Vixie crontab format.

Specifically, we had a couple of jobs configured with */60 * * * * to run every 60 minutes. I assume this is equivalent to 0 * * * *. cron accepted this, but APScheduler rejects it: https://github.com/agronholm/apscheduler/blob/d10f20215d8c78e9e2d32d634f276bb89f86ca38/apscheduler/triggers/cron/expressions.py#L34

That particular example isn't that useful, but it turns out that things like */90 * * * * are also allowed.
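Here is a minimal sketch of why `*/60` in the minute field would behave like `0`. It assumes the step is applied by counting up from the field's minimum within its fixed range, which is a common stateless reading of cron steps. The helper name `expand_step` is made up for illustration and is not part of cron or APScheduler. Under this reading `*/90` would also select only minute 0, so what a particular cron implementation really does with it is worth verifying separately.

```python
def expand_step(minimum, maximum, step):
    """Values selected by `*/step` in a field spanning minimum..maximum.

    Assumption: the step counts up from the field's minimum and never
    wraps past the maximum (a stateless interpretation).
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(minimum, maximum + 1, step))


print(expand_step(0, 59, 15))  # [0, 15, 30, 45]
print(expand_step(0, 59, 60))  # [0]
print(expand_step(0, 59, 90))  # [0]
```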

Describe the solution you'd like

It would be nice to support this in APScheduler for greater compatibility with cron.

rafalkrupinski commented 1 year ago

I don't know about Vixie, but AFAIK /N actually means values divisible by N. Since there are no minutes in any hour divisible by 90, there's no need for such a pattern.

mdz commented 1 year ago

*/90 runs the job every 90 minutes, which is useful.

rafalkrupinski commented 1 year ago

OK, it's not mentioned in the Vixie crontab man page, and both Wikipedia and ChatGPT say otherwise, but they are both often wrong.

Since APScheduler is stateful, it makes perfect sense to support it.

mdz commented 1 year ago

Indeed. I discovered this behavior when I migrated a production workload from Vixie cron to APScheduler, keeping the crontab entries intact. One of them failed because it relied on this (apparently undocumented) behavior.

agronholm commented 1 year ago

This is not trivial to implement, but if somebody comes up with a PR that includes updated tests and documentation, I'm willing to consider it.

rafalkrupinski commented 1 year ago

> This is not trivial to implement

How about something similar to AllTrigger from 3.x, substituting fields for combined triggers:

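from datetime import timedelta

# Body of a hypothetical trigger method; field.next_value() is the proposed per-field API.
min_time = last_fire_time + timedelta(seconds=1)  # the minimum resolution
for _ in range(max_iterations):
    next_times = [field.next_value(min_time) for field in self.fields]
    max_time = max(next_times)
    if min(next_times) == max_time:
        # Every field agrees on the same instant.
        return max_time
    # Otherwise retry from the latest candidate so the other fields catch up.
    min_time = max_time
return None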

Every field returns min_time if that time matches its expression, or otherwise the earliest later time that does. Since every field returns a complete datetime, it can jump an arbitrary number of units, and it will pull all the other fields along in the next iteration. Unless you set contradictory fields (e.g. month=2, day=30), it will always end up with a time in a finite number of iterations.

For the expression 1 ..., the first iteration will yield min_time, min_time, datetime($year, $month, 1); the second iteration will yield the right time.

For an expression with both day_of_month and day_of_week, it will iterate over days until it finds the right date. Similarly, for an expression with any field over its range (seconds=90) and the more significant field (here minute) set to any value other than *, it will iterate until it finds the right combination of field values.
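To make the convergence idea above concrete, here is a rough, self-contained sketch. Everything in it is an assumption for illustration: `SetField`, `next_fire_time` and the unit tables are hypothetical names, not APScheduler APIs, and only second/minute/hour/day fields with plain value sets are modelled (no months, weekdays, steps or time zones).

```python
from datetime import datetime, timedelta

# Step size per unit, and the finer components to zero out when truncating.
_STEP = {
    "second": timedelta(seconds=1),
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}
_RESET = {
    "second": {"microsecond": 0},
    "minute": {"second": 0, "microsecond": 0},
    "hour": {"minute": 0, "second": 0, "microsecond": 0},
    "day": {"hour": 0, "minute": 0, "second": 0, "microsecond": 0},
}


class SetField:
    """Hypothetical field: matches when one datetime component is in `allowed`."""

    def __init__(self, unit, allowed):
        self.unit = unit
        self.allowed = frozenset(allowed)

    def next_value(self, start):
        # Return `start` unchanged if it already matches this field.
        if getattr(start, self.unit) in self.allowed:
            return start
        # Otherwise walk forward one unit at a time from the truncated start.
        candidate = start.replace(**_RESET[self.unit])
        for _ in range(10_000):  # guard against values that can never match
            candidate += _STEP[self.unit]
            if getattr(candidate, self.unit) in self.allowed:
                return candidate
        raise ValueError(f"no match for {self.unit} in {sorted(self.allowed)}")


def next_fire_time(fields, last_fire_time, max_iterations=1000):
    """Iterate until every field proposes the same datetime, or give up."""
    min_time = last_fire_time + timedelta(seconds=1)  # the minimum resolution
    for _ in range(max_iterations):
        next_times = [field.next_value(min_time) for field in fields]
        max_time = max(next_times)
        if min(next_times) == max_time:
            return max_time
        min_time = max_time  # pull the lagging fields forward
    return None


# Roughly "at 02:30:00 on the 1st of the month".
fields = [
    SetField("second", {0}),
    SetField("minute", {30}),
    SetField("hour", {2}),
    SetField("day", {1}),
]
first = next_fire_time(fields, datetime(2024, 1, 15, 10, 0, 0))
print(first)                          # 2024-02-01 02:30:00
print(next_fire_time(fields, first))  # 2024-03-01 02:30:00
```

In this toy version the first call converges on the fourth pass. The iteration cap is what turns a contradictory or slow-to-converge expression into a `None` result instead of an endless loop.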

~I guess for expressions with only * or step values and no DST, you could use LCM~