jackc / tod

Time of day and shift types for Ruby
MIT License
435 stars 56 forks source link

Slight inconsistency with overlaps? and exclusive end times #76

Closed jhwinters closed 3 years ago

jhwinters commented 3 years ago

I came across one apparent oddity in an edge case, illustrated by this code:

require 'tod'

all_day = Tod::Shift.new(Tod::TimeOfDay.new(9), Tod::TimeOfDay.new(17), true)

puts "In the middle"
instant = Tod::Shift.new(Tod::TimeOfDay.new(10), Tod::TimeOfDay.new(10), true)
puts all_day.overlaps?(instant)   # true
puts instant.overlaps?(all_day)   # true

puts "At the beginning"
instant = Tod::Shift.new(Tod::TimeOfDay.new(9), Tod::TimeOfDay.new(9), true)
puts all_day.overlaps?(instant)   # true
puts instant.overlaps?(all_day)   # false

puts "At the end"
instant = Tod::Shift.new(Tod::TimeOfDay.new(17), Tod::TimeOfDay.new(17), true)
puts all_day.overlaps?(instant)   # false
puts instant.overlaps?(all_day)   # false

All the calls produce the result which I would expect except the third one. Should that not produce false, if only for consistency?

jackc commented 3 years ago

Well, first of all I'm not sure if you mean the third overlaps? call or the third group of calls...

But beyond that I'm not even sure what the proper semantics of the instant Shift should even be. Since the end is exclusive it has zero length or size. Something with zero size can't really have a position.

If you want to test if a time of day is within a shift then you should use Shift::include? with a TimeOfDay argument instead of a zero length shift.

jhwinters commented 3 years ago

Thanks for the prompt response.

By the third call I mean exactly that - the third one - the one which produces "true" where it should produce "false", if only for consistency with the fourth one which is testing the same thing.

No, I'm not trying to test a time of day - I'm testing periods of the day, between two times, but unfortunately the people entering the periods can enter one of zero duration. The issue cropped up in actual testing. The code snippet above is not the original code but was written specifically to show the problem in the shortest possible space and in a reproducible form.

Testing for overlap of two time intervals is an interesting one to get your head around. (Dates are even more entertaining, because then it becomes absolutely vital to use exclusive end dates and so many people try to use inclusive ones, then layer frig upon frig upon frig to try to cope with that initial mistake.)

The best way I've found to think about it is to consider the opposite. When do two events not overlap? If one is before the other or the other is before the one. Thus given two Shifts - a and b - and using exclusive end times as in my example above, one can test for not overlapping with:

if (a.beginning >= b.ending) || (b.beginning >= a.ending)

and so the test for overlapping is the negative of that:

if !((a.beginning >= b.ending) || (b.beginning >= a.ending))

which by applying De Morgan's Law we can reduce to:

if (a.beginning < b.ending) && (b.beginning < a.ending)

and this has the advantage that it works correctly for all Shifts, even zero-length ones.

For Shifts using the default inclusive end times we need to start with:

if (a.beginning > b.ending) || (b.beginning > a.ending)

and thus end up with:

if (a.beginning <= b.ending) && (b.beginning <= a.ending)

The mathematician in me has to disagree with your assertion that something with zero size can't have a position. They do all the time. Similarly in this practical case, some events within our system have zero duration - for instance, "Deadline for submitting 3rd year reports".

As I said, it's very much an edge case and I can work around it if needs be.

Cheers,

jackc commented 3 years ago

The mathematician in me has to disagree with your assertion that something with zero size can't have a position. They do all the time.

Well, I'm no mathematician, so maybe that is so.

But perhaps it would be better said that I find it highly unintuitive to say that something with a 0 size can overlap something. Any overlap has size. e.g. [1,5) overlaps [3,7) for 2 units. But there would be 0 units of overlap between [1,5) and [2,2) How can something that has 0 overlap be considered to overlap?

Also precedent I've seen in other programming environments zero length ranges are special (or broken) in some manner.

For example in PostgreSQL:

jack@[local]:5432 jack=# select int4range(2,2), int4range(1,5), int4range(1,5) @> int4range(10,10);
 int4range │ int4range │ ?column?
───────────┼───────────┼──────────
 empty     │ [1,5)     │ t
(1 row)

Note the unexpected true for int4range(1,5) @> int4range(10,10).

But in Ruby itself zero length ranges appear to always be considered by Range#cover? as false.

[14] pry(main)> (1...10).cover?(2...3)
=> true
[15] pry(main)> (1...10).cover?(2...2)
=> false

As I said, it's very much an edge case and I can work around it if needs be.

I think a zero length Shift should currently be considered undefined behavior. I guess I'm not opposed to changing this behavior, but TBH my initial reaction would be that it should be disallowed rather than defining all the edge cases (I'm guessing there could be more oddities with 0 length shifts than this one).

jhwinters commented 3 years ago

Interestingly, I sub-classed Shift and did initially disallow zero-length instances in my sub-class. Then when I realised that for consistent processing they were needed I removed the check which prevented them and that's when I started seeing the issues.

Whilst it may be true that other implementations of similar things don't achieve consistent processing, in this particular instance it would be easy to make the processing consistent (it very nearly is already) so why not make it work cleanly?

You ask how [1,5) can be considered to overlap with [2,2). As we're dealing with times, let's try a real world example with times. My lesson runs from 1 until 5, and a bell will ring at 2 - will the bell ring during my lesson? Yes, it will. On the other hand, if the bell rings at 6 - [6,6) - then it will not ring during my lesson. [2,2) overlaps with [1,5) and [6,6) does not.

I suspect you're trying to overthink it by looking at other ranges. The processing which I suggested in my previous comment is short, clear, reliable and 100% consistent. It produces correct results regardless of whether Shifts are zero-length or not. Why not use it?

jackc commented 3 years ago

You ask how [1,5) can be considered to overlap with [2,2). As we're dealing with times, let's try a real world example with times. My lesson runs from 1 until 5, and a bell will ring at 2 - will the bell ring during my lesson? Yes, it will. On the other hand, if the bell rings at 6 - [6,6) - then it will not ring during my lesson. [2,2) overlaps with [1,5) and [6,6) does not.

I would see this as "how can a bell ring for zero amount of time"? You say it is true that the bell range at 2 for 0 amount of time. But wouldn't it also be true to say that the bell rang at 3 for 0 amount of time? Wouldn't it be true to say the bell rang for 0 amount of time at any or all times? To me a zero length range doesn't exist.

But this is getting pedantic.

If you want to change this behavior in a PR I'll merge it.

jhwinters commented 3 years ago

Hi there,

Just working on a shorter, simpler, faster version of Shift#overlaps?. I've added some extra tests for all the edge conditions and my new code passes all those tests, but fails two of the existing ones. On examining them though, I think those tests and the existing implementation are both wrong.

My implementation is just:

    def overlaps?(other)
      a, b = [self, other].map(&:range)
      aop = a.exclude_end? ? :> : :>=
      bop = b.exclude_end? ? :> : :>=
      a.last.send(aop, b.first) && b.last.send(bop, a.first)
    end

Which passes all the tests except a couple of the array of timings for testing shifts running into the next day, specifically:

[7,2,1,4]
[1,4,7,2]

The existing implementation of Shift#overlaps? reckons these overlap, but unless I've misunderstood how Shifts are meant to work I reckon they don't.

The first line gives us a Shift from 7am to 2am the following day (wrapping through midnight) and another from 1am to 4am. These don't overlap, unless you reckon that by calling Shift#overlaps? and passing in the second Shift it is implicitly moved to the next day.

The second line gives us a Shift from 1am to 4am and another from 7am to 2am the following day. These definitely don't overlap.

Am I missing something obvious?

rewritten commented 3 years ago

As a short-time user of the gem, and also former mathematician, I'd like to give my opinion on this debate.

TLDR: The behavior of Shift is somehow incorrect w.r.t overlaps

First of all, the postgres example is not exactly applicable, because it uses the @> operator (contains) while overlapping is &&.

So, the idea is that a shift is a contiguous subset of the "time modulo a day". Therefore, an empty shift (same start/end, excluding end) should always be "contained" in any other, but never "overlap" any other.

Also "overlapping" should be symmetric, a.overlaps?(b) if and only if b.overlaps(a).

Finally, the overlapping should take into account wrapping around, as in this case:

a = Tod::Shift.new("3:00".to_time.to_time_of_day, "4:00".to_time.to_time_of_day, true)
b = Tod::Shift.new("4:00".to_time.to_time_of_day, "3:00".to_time.to_time_of_day)
a.overlaps?(b) # false
b.overlaps?(a) # false

both calls should return true because 3:00 belongs to both shifts.

rewritten commented 3 years ago

I have to disagree about the fact that "1 AM -> 4 AM" and "7 AM -> 2 AM" do not overlap. They should definitely overlap, because there is no notion of "this day" or "another day", the shifts are (just like TimeOfDay instances) elements of the 24h-clock math of time around the day (which can be viewed as the integers modulo 86400)

jhwinters commented 3 years ago

Ah - thank you. Yes, that's a different way of thinking about it and by that reasoning those two do indeed overlap. I'm not sure that way of thinking quite matches the existing implementation though. I will work on it further.

So a Shift which passes through midnight should really be thought of as two Shifts in the same 24 hour period. A shift from 22 to 3 occupies both 22 to 24 and 00 to 03.

rewritten commented 3 years ago

The ideal implementation for overlaps? in my opinion would be

def overlaps?(other)
  [
    [self, other],
    [self.slide(TimeOfDay::NUM_SECONDS_IN_DAY), other],
    [self, other.slide(TimeOfDay::NUM_SECONDS_IN_DAY)]
  ].any? do |this, that|
    this.range.cover?(that.range.begin) || that.range.cover?(this.range.begin)
  end
end

maybe more explicit is better...


def overlaps?(other)
  range.cover?(other.range.begin) || 
  other.range.cover?(range.begin) ||
  slide(TimeOfDay::NUM_SECONDS_IN_DAY).range.cover?(other.range.begin) || 
  other.range.cover?(slide(TimeOfDay::NUM_SECONDS_IN_DAY).range.begin) ||
  range.cover?(other.slide(TimeOfDay::NUM_SECONDS_IN_DAY).range.begin) || 
  other.slide(TimeOfDay::NUM_SECONDS_IN_DAY).range.cover?(range.begin)
end
jhwinters commented 3 years ago

Ta - useful input. I just tried your implementation and it also fails the [7,2,1,4] test just like mine does. I will have to sit down and think about why.

rewritten commented 3 years ago

weird, it should be matched (returning true) by

self.range.cover?(other.slide(TimeOfDay::NUM_SECONDS_IN_DAY).range.begin)
# (25_200..93_600).cover?(90_000)
jhwinters commented 3 years ago

Well, yes - that was my impression too. I literally just cut and pasted the method into my copy of the code and then ran the tests. I will investigate.

jhwinters commented 3 years ago

This version passes all the tests. I've steered clear of using Range#cover? because it's the behaviour of that which seems to have caused the inconsistent behaviour in the first place. Not as short as I would have liked, but...

    def overlaps?(other)
      a, b = [self, other].map(&:range)
      #
      #  Although a Shift which passes through midnight is stored
      #  internally as lasting more than TimeOfDay::NUM_SECONDS_IN_DAY
      #  seconds from midnight, that's not how it is meant to be
      #  handled.  Rather, it consists of two chunks:
      #
      #    range.first => Midnight
      #    Midnight => range.last
      #
      #  The second one is *before* the first.  None of it is more than
      #  TimeOfDay::NUM_SECONDS_IN_DAY after midnight.  We thus need to shift
      #  each of our ranges to cover all overlapping possibilities.
      #
      ashifted =
        Range.new(
          a.first + TimeOfDay::NUM_SECONDS_IN_DAY,
          a.last + TimeOfDay::NUM_SECONDS_IN_DAY,
          a.exclude_end?
      )
      bshifted =
        Range.new(
          b.first + TimeOfDay::NUM_SECONDS_IN_DAY,
          b.last + TimeOfDay::NUM_SECONDS_IN_DAY,
          b.exclude_end?
      )
      #
      #  For exclusive ranges we need:
      #
      #  a.ending > b.beginning && b.ending > a.beginning
      #
      #  and for inclusive we need:
      #
      #  a.ending >= b.beginning && b.ending >= a.beginning
      #
      aop = a.exclude_end? ? :> : :>=
      bop = b.exclude_end? ? :> : :>=
      #
      (a.last.send(aop, b.first) && b.last.send(bop, a.first)) ||
      (ashifted.last.send(aop, b.first) && b.last.send(bop, ashifted.first)) ||
      (a.last.send(aop, bshifted.first) && bshifted.last.send(bop, a.first))
    end
jackc commented 3 years ago

I merged #77.