Behaviour change between 1.4.4 and 1.4.5

yob commented 1 year ago

Issue description

We're working on upgrading our system from fugit 1.3.2 to 1.6.0. While doing so, we noticed the hash+modulo extension changed behaviour in the 1.4.5 release:

## fugit 1.4.4

[2] pry(main)> time = Time.utc(2022,9,12,12,1,0)
=> 2022-09-12 12:01:00 UTC
[3] pry(main)> next_time = Fugit::Cron.do_parse("30 14 * * 4%4+3 Australia/Melbourne").next_time(time).utc
=> 2022-09-22 04:30:00 UTC

## fugit 1.4.5

[1] pry(main)> time = Time.utc(2022,9,12,12,1,0)
=> 2022-09-12 12:01:00 UTC
[2] pry(main)> next_time = Fugit::Cron.do_parse("30 14 * * 4%4+3 Australia/Melbourne").next_time(time).utc
=> 2022-10-06 03:30:00 UTC

Interestingly, the only change in that release was #47, which was reported by a colleague of mine - @ticky. We're only now getting around to upgrading and bringing in all the latest fixes.

Error and error backtrace (if any)

None

Expected behaviour

Based on the change it sounds like the intent was to support an additional edge case, so I was surprised to see a behaviour change.

I'm still wrapping my head around this syntax, so maybe 4%4+3 doesn't make much sense and changing the behaviour was inevitable. That's fine if so, we'll plan a careful transition with the impacted users. I wanted to check first though.

Context

Please replace the content of this section with the output of the following commands:


Linux 46f24ef4fb1b 5.19.0-1-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.19.6-1 (2022-09-01) x86_64 GNU/Linux
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-linux]
[:env_tz, nil]
(secs:1663075650.3111818,utc~:"2022-09-13 13:27:30.31118178367614746",ltz~:"UTC")
(etz:nil,tnz:"UTC",tziv:"2.0.5",tzidv:nil,rv:"2.7.6",rp:"x86_64-linux",win:false,rorv:nil,astz:nil,eov:"1.2.7",eotnz:#<TZInfo::DataTimezone: Etc/UTC>,eotnfz:"+0000",eotlzn:"Etc/UTC",eotnfZ:"UTC",debian:"Etc/UTC",centos:nil,osx:"Etc/UTC")
[:fugit, "1.6.0"]
[:now, 2022-09-13 13:27:31.654584293 +0000, :zone, "UTC"]```

jmettraux commented 1 year ago

Hello,

thanks for reporting that.

Double-checking for myself.

irb> `uname -a`
=> "OpenBSD somalia 7.1 GENERIC.MP#3 amd64\n"
irb> RUBY_VERSION
=> "2.7.6"
irb> Fugit::Cron.do_parse('30 14 * * 4%4+3 Australia/Melbourne').next_time(t).utc
=> 2022-10-06 03:30:00 UTC
irb> ENV['TZ'] = 'Australia/Melbourne'
=> "Australia/Melbourne"
irb> Fugit::Cron.do_parse('30 14 * * 4%4+3 Australia/Melbourne').next_time(t).to_t
=> 2022-10-06 14:30:00 +1100
irb> Fugit::Cron.do_parse('30 14 * * 4%4+3 Australia/Melbourne').next_time(t).utc
=> 2022-10-06 03:30:00 UTC

Timezone transition, according to https://www.timeanddate.com/time/zone/australia/melbourne is on Sunday, 2nd of October at 2am.

irb> Fugit.parse('2022-09-22 04:00:00 UTC').rweek
=> 195
irb> Fugit.parse('2022-09-22 04:00:00 UTC').rweek % 4
=> 3
irb> (Fugit.parse('2022-09-22 04:00:00 UTC').rweek + 3) % 4
=> 2
irb> Fugit.parse('2022-10-06 04:00:00 UTC').rweek % 4
=> 1
irb> (Fugit.parse('2022-10-06 04:00:00 UTC').rweek + 3) % 4
=> 0

OK, I need to get this calculation right. Stay tuned.

Meanwhile, wouldn't something like "30 14 * * thu#last Australia/Melbourne" be easier to grok, although not strictly equivalent? https://github.com/floraison/fugit#the-hash-extension

Kind regards.

yob commented 1 year ago

Thanks for taking a look.

Meanwhile, wouldn't something like "30 14 thu#last Australia/Melbourne" be easier to grok, although not strictly equivalent?

Yer, I assume that might work pretty well. In our case, for better or worse we're using fugit to parse cronlines for many thousands of customer controlled schedules so we don't have direct control over the format people choose.

The good news is only 0.4% of cronlines from our data set changed behaviour between 1.3.2 and 1.6.0, and nearly all of them use X%4+Y. They're %4 specifically too, cronlines with %1, %2 and %3 don't appear in the list of cronlines that changed.

The other common input that changed is a trailing comma, like 5,15,25,35,45,55, * * * *. These used to parse and now they raise an error, but we can clean these particular issue up in the dataset prior to upgrading so it's less challenging to handle.

jmettraux commented 1 year ago


def show(mod0, mod1)

  puts "%#{mod0}+#{mod1}"
  (mod0 + 1).times do |i|
    print "%3d:  " % [
      i ]
    print "%d %% %d --> %d" % [
      i, mod0, i % mod0 ]
    print "  |  %d %% %d == %d --> %5s" % [
      i, mod0, i % mod0, (i % mod0) == mod1 ]
    print "  |  (%d + %d) %% %d == 0 --> %5s" % [
      i, mod0, mod1, (i + mod1) % mod0 == 0 ]
    print "  |  %d %% %d == %d %% %d --> %5s" % [
      i, mod0, mod1, mod0, i % mod0 == mod1 % mod0 ]
    puts
  end
end

puts; show(2, 1)
puts; show(3, 2)
puts; show(2, 2)
puts; show(2, 4)
puts; show(4, 3)

Gives


%2+1
  0:  0 % 2 --> 0  |  0 % 2 == 0 --> false  |  (0 + 2) % 1 == 0 --> false  |  0 % 2 == 1 % 2 --> false
  1:  1 % 2 --> 1  |  1 % 2 == 1 -->  true  |  (1 + 2) % 1 == 0 -->  true  |  1 % 2 == 1 % 2 -->  true
  2:  2 % 2 --> 0  |  2 % 2 == 0 --> false  |  (2 + 2) % 1 == 0 --> false  |  2 % 2 == 1 % 2 --> false

%3+2
  0:  0 % 3 --> 0  |  0 % 3 == 0 --> false  |  (0 + 3) % 2 == 0 --> false  |  0 % 3 == 2 % 3 --> false
  1:  1 % 3 --> 1  |  1 % 3 == 1 --> false  |  (1 + 3) % 2 == 0 -->  true  |  1 % 3 == 2 % 3 --> false
  2:  2 % 3 --> 2  |  2 % 3 == 2 -->  true  |  (2 + 3) % 2 == 0 --> false  |  2 % 3 == 2 % 3 -->  true
  3:  3 % 3 --> 0  |  3 % 3 == 0 --> false  |  (3 + 3) % 2 == 0 --> false  |  3 % 3 == 2 % 3 --> false

%2+2
  0:  0 % 2 --> 0  |  0 % 2 == 0 --> false  |  (0 + 2) % 2 == 0 -->  true  |  0 % 2 == 2 % 2 -->  true
  1:  1 % 2 --> 1  |  1 % 2 == 1 --> false  |  (1 + 2) % 2 == 0 --> false  |  1 % 2 == 2 % 2 --> false
  2:  2 % 2 --> 0  |  2 % 2 == 0 --> false  |  (2 + 2) % 2 == 0 -->  true  |  2 % 2 == 2 % 2 -->  true

%2+4
  0:  0 % 2 --> 0  |  0 % 2 == 0 --> false  |  (0 + 2) % 4 == 0 -->  true  |  0 % 2 == 4 % 2 -->  true
  1:  1 % 2 --> 1  |  1 % 2 == 1 --> false  |  (1 + 2) % 4 == 0 --> false  |  1 % 2 == 4 % 2 --> false
  2:  2 % 2 --> 0  |  2 % 2 == 0 --> false  |  (2 + 2) % 4 == 0 -->  true  |  2 % 2 == 4 % 2 -->  true

%4+3
  0:  0 % 4 --> 0  |  0 % 4 == 0 --> false  |  (0 + 4) % 3 == 0 --> false  |  0 % 4 == 3 % 4 --> false
  1:  1 % 4 --> 1  |  1 % 4 == 1 --> false  |  (1 + 4) % 3 == 0 -->  true  |  1 % 4 == 3 % 4 --> false
  2:  2 % 4 --> 2  |  2 % 4 == 2 --> false  |  (2 + 4) % 3 == 0 --> false  |  2 % 4 == 3 % 4 --> false
  3:  3 % 4 --> 3  |  3 % 4 == 3 -->  true  |  (3 + 4) % 3 == 0 --> false  |  3 % 4 == 3 % 4 -->  true
  4:  4 % 4 --> 0  |  4 % 4 == 0 --> false  |  (4 + 4) % 3 == 0 --> false  |  4 % 4 == 3 % 4 --> false

gh-47 goes from

    def weekday_modulo_match?(nt, mod)
      nt.rweek % mod[0] == mod[1]
    end

to

    def weekday_modulo_match?(nt, mod)
      (nt.rweek + mod[1]) % mod[0] == 0
    end

I want the results for pre gh-47, but the safety of gh-47. I think I will go with

    def weekday_modulo_match?(nt, mod)
      nt.rweek % mod[0] == mod[1] % mod[0]
    end

Which is demonstrated above by the exploration code.

jmettraux commented 1 year ago

@yob unless you see a show stopper, I'll close and release 1.7.0 soon. Thanks!

yob commented 1 year ago

Brilliant, thanks for the fast turnaround ❤️

I re-ran the current HEAD of main against our dataset and we're down to only 9 cronlines (out of many thousands) that changed behavior between 1.3.2 and 1.7.0 (pre release). All 9 look like bug fixes or intentional behavior changes, and definitely aren't related to this issue. Good to release I think!

For completeness, here's the 9 that changed behaviour. All tests were done with TZ=UTC, and input time of Time.utc(2022,9,12,12,1,0)

Behaviour changes in 1.3.9, due to the "New York skip" bug fix, gh-43

1 8,12,15 * * 1-5 America/New_York
1 4,12,20 * * * US/Pacific

Behaviour changes in 1.4.3, due to "Fix entering DST issue", gh-53

0 0 3 * * Australia/Melbourne

Behaviour changes in 1.5.0, due to gh-56

0 0/4 * * *
0/30 * * * *
0/10 * * * 1-5 America/New_York
0 4/12 * * *
0 0/12 * * * -09:00
30 7/4 * * *

jmettraux commented 1 year ago

Hello,

thanks for the detailed report. I am sticking with the changes in gh-43, gh-53, and gh-56. In my opinion, they make fugit "righter".

I've just released fugit 1.7.0, see https://rubygems.org/gems/fugit/versions/1.7.0

If there is anything wrong, please tell me.

Thanks again!

yob commented 1 year ago

In my opinion, they make fugit "righter".

I agree.

I've just released fugit 1.7.0

Thanks!

jmettraux commented 1 year ago

Hello again,

I had a second look at those cron strings linked to gh-56 via gh-79.

This will be the new behaviour:

0 0/4 * * * => 0 */4 * * *
0/30 * * * * => 0-59/30 * * * *
0/10 * * * 1-5 America/New_York => 0-59/10 * * * 1-5 America/New_York
0 4/12 * * * => 0 4-23/12 * * *
0 0/12 * * * -09:00 => 0 0-23/12 * * * -09:00
30 7/4 * * * => 30 7-23/4 * * *

I will release fugit 1.7.1 with this change very soon.

Thanks again!

floraison / fugit