floraison / fugit

time tools (cron, parsing, durations, ...) for Ruby, rufus-scheduler, and flor
MIT License

Cron inconsistent behaviour with union vs intersection #98

Open trafium opened 3 months ago

trafium commented 3 months ago

Issue description

I am trying to figure out how Fugit handles the day-of-month vs day-of-week union/intersection problem and, if possible, to reintroduce the infamous bug/feature (https://crontab.guru/cron-bug.html) where the two fields are intersected only if the first character of either field is `*`.
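For reference, the rule described on that page can be sketched in a few lines of plain Ruby. `vixie_day_mode` is a hypothetical helper written only for illustration; it is not part of Fugit and it only looks at the raw field strings, the way the linked page describes Vixie cron doing:

```ruby
# Hypothetical helper (not Fugit API): returns how Vixie cron would combine
# the day-of-month and day-of-week fields, per crontab.guru/cron-bug.html.
def vixie_day_mode(day_of_month, day_of_week)
  # Vixie cron only checks whether a field *begins* with `*`,
  # not whether `*` appears anywhere in it.
  if day_of_month.start_with?('*') || day_of_week.start_with?('*')
    :intersection
  else
    :union
  end
end

p vixie_day_mode('*', 'SUN')     # => :intersection
p vixie_day_mode('*,10', 'SUN')  # => :intersection
p vixie_day_mode('10,*', 'SUN')  # => :union
p vixie_day_mode('*/3', 'SUN')   # => :intersection
p vixie_day_mode('10', '2,*')    # => :union
```
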

As far as I can see, Fugit does not have that bug, but it also does not fully adhere to any interpretation of the man 5 page I can come up with:

require 'fugit'

# Prints the next `number_of_occurrences` trigger times of `cron_expression` after `since`.
def print_occurrences(cron_expression, since, number_of_occurrences)
  puts "---"
  puts cron_expression
  cron = Fugit.parse_cron(cron_expression)
  last_since = since

  # Feed each occurrence back in as the starting point for the next one.
  occurrences = Array.new(number_of_occurrences) do
    last_since = cron.next_time(last_since)
  end

  puts occurrences.map { |e| e.utc.to_s }.join(", ")
end

print_occurrences "0 12 * * SUN UTC", "2024-01-01 00:00:00", 3
# => 2024-01-07 12:00:00 UTC, 2024-01-14 12:00:00 UTC, 2024-01-21 12:00:00 UTC
# ✅ Results in intersection, which is both expected and matches Vixie cron.

print_occurrences "0 12 *,* * SUN UTC", "2024-01-01 00:00:00", 3
# => 2024-01-07 12:00:00 UTC, 2024-01-14 12:00:00 UTC, 2024-01-21 12:00:00 UTC
# ✅ Results in intersection, which is both expected and matches Vixie cron.

print_occurrences "0 12 *,10 * SUN UTC", "2024-01-01 00:00:00", 3
# => 2024-01-07 12:00:00 UTC, 2024-01-14 12:00:00 UTC, 2024-01-21 12:00:00 UTC
# ✅ Results in intersection, which is both expected and matches Vixie cron.

print_occurrences "0 12 10,* * SUN UTC", "2024-01-01 00:00:00", 3
# => 2024-01-07 12:00:00 UTC, 2024-01-14 12:00:00 UTC, 2024-01-21 12:00:00 UTC
# ❌ Results in intersection, where Vixie cron would use union.
# My assumption(1) is that the man 5 page note is interpreted as "as long as one of the two fields includes `*` somewhere, it's an intersection".
# Sadly, I see no way to force union here.

print_occurrences "0 12 10 * *,* UTC", "2024-01-01 00:00:00", 3
# => 2024-01-10 12:00:00 UTC, 2024-02-10 12:00:00 UTC, 2024-03-10 12:00:00 UTC
# ✅ Results in intersection, which is both expected and matches Vixie cron

print_occurrences "0 12 */3 * SUN UTC", "2024-01-01 00:00:00", 3
# => 2024-01-01 12:00:00 UTC, 2024-01-04 12:00:00 UTC, 2024-01-07 12:00:00 UTC
# ❌ Results in union, which is unexpected given assumption(1) from the `0 12 10,* * SUN` case.
# My new assumption(2) is that step values on `*` (e.g. `*/2`) do not count towards intersection.
# This also differs from Vixie cron, but here I can force intersection via `&`.

print_occurrences "0 12 10 * *,2 UTC", "2024-01-01 00:00:00", 3
# => 2024-01-02 12:00:00 UTC, 2024-01-09 12:00:00 UTC, 2024-01-10 12:00:00 UTC
# ❌❌ Results in union with `2` in day-of-week, ignoring `*`. This is unexpected; at this point it seems that the day-of-week field contributes differently.
# I believe this to be a bug.

print_occurrences "0 12 10 * 2,* UTC", "2024-01-01 00:00:00", 3
# => 2024-01-02 12:00:00 UTC, 2024-01-09 12:00:00 UTC, 2024-01-10 12:00:00 UTC
# ❌❌ Results in incorrect union. Same situation as above, same conclusions.

print_occurrences "0 12 10 * */2 UTC", "2024-01-01 00:00:00", 3
# => 2024-01-02 12:00:00 UTC, 2024-01-04 12:00:00 UTC, 2024-01-06 12:00:00 UTC
# ❌ Results in union. Consistent with assumption(2), but given the different behaviour for lists containing `*` in the day-of-week field,
# it might be that the day-of-week behaviour is correct while the day-of-month behaviour is not.

Is my understanding of this inconsistency correct? Which parts of this are expected behaviour? Would you consider supporting a "force union" character, like `&` is for "force intersection"?
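To double-check the union/intersection labels in the listing above independently of Fugit, here is a minimal sketch that uses only Ruby's standard `Date`. `matching_dates` is a hypothetical helper, not Fugit API; it takes already-expanded day sets (so `*` becomes every day) and ignores the hour and minute fields:

```ruby
require 'date'

# Hypothetical helper (not Fugit API): returns the first `count` dates on or
# after `from` whose day-of-month / day-of-week match the given sets, combined
# by :union (either field matches) or :intersection (both fields match).
def matching_dates(from, count, days_of_month:, days_of_week:, mode:)
  found = []
  date = from
  while found.size < count
    dom_ok = days_of_month.include?(date.day)
    dow_ok = days_of_week.include?(date.wday) # 0 is Sunday, 2 is Tuesday
    matched = mode == :union ? (dom_ok || dow_ok) : (dom_ok && dow_ok)
    found << date if matched
    date += 1
  end
  found
end

since = Date.new(2024, 1, 1)

# Day-of-month 10 OR Tuesday: the dates reported for `0 12 10 * *,2`.
union = matching_dates(since, 3, days_of_month: [10], days_of_week: [2], mode: :union)
puts union.map(&:to_s).join(", ")
# prints: 2024-01-02, 2024-01-09, 2024-01-10

# Day-of-month 10 AND any weekday: the dates reported for `0 12 10 * *,*`.
both = matching_dates(since, 3, days_of_month: [10], days_of_week: (0..6).to_a, mode: :intersection)
puts both.map(&:to_s).join(", ")
# prints: 2024-01-10, 2024-02-10, 2024-03-10
```

The first call reproduces the dates listed for `0 12 10 * *,2`, which supports reading that result as a union in which the `*` in day-of-week is ignored.
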

Context

Darwin Dmitris-MacBook-Air.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:27 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T8103 arm64
ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [arm64-darwin23]
[:env_tz, nil]
(secs:1711380674.477833,utc~:"2024-03-25 15:31:14.4778330326080322",ltz~:"EET")
(etz:nil,tnz:"EET",tziv:"2.0.6",tzidv:"1.2024.1",rv:"3.2.3",rp:"arm64-darwin23",win:false,rorv:nil,astz:nil,eov:"1.2.11",eotnz:#<TZInfo::TimezoneProxy: Africa/Cairo>,eotnfz:"+0200",eotlzn:"Africa/Cairo",eotnfZ:"EET",debian:nil,centos:nil,osx:"zoneinfo/Europe/Tallinn")
[:fugit, "1.10.1"]
[:now, 2024-03-25 17:31:15.191495 +0200, :zone, "EET"]

jmettraux commented 3 months ago

Hello,

> Is my understanding of this inconsistency correct?

Yes.

> Which parts of this are expected behaviour?

All of them. I limited myself to my interpretation of man 5 crontab on OpenBSD, plus some extras.

> Would you consider having support for "force union" character, like & is for "force intersection"?

Yes, but my priority queue is: day job, family, #97 and then #80, which is somehow linked to this #98 here.

Thanks for the detailed report.