daypack-dev / timere

OCaml date time handling and reasoning suite
MIT License
68 stars 7 forks source link

feature used: comparing datetimes #28

Closed gasche closed 3 years ago

gasche commented 3 years ago

Note: a "feature used" issue is related to a "feature request', except that instead of reporting on a feature I desperately need, I am reporting on a feature I implemented (on the user side) in a small program I just finished converting to Timere (a seminar calendar generator). If you think the feature would be useful to more users, then please feel free to offer an implementation (better, I'm sure, and with a more robust API), but if you don't think it makes sense in the upstream project, please feel free to close the issue.

I have implemented a comparison function for Date_time.t, to be able to sort event in chronological (and anti-chronological) order.

let timere_compare_datetime (d1 : Timere.Date_time.t) d2 =
  let d1, d2 = Timere.Date_time.(to_timestamp_float_single d1, to_timestamp_float_single d2) in
  Float.compare d1 d2

I think that Span, Duration and Date_time could provide a compare : t -> t -> int function for user convenience -- some of them already provide equal I think? (Note: there is no total order between Timere.t values, so there I wouldn't try to provide one.)

Note: the implementation I used is suboptimal for ambiguous timestamps. Ambiguous timestamps might not have a total order in the general case, but there are still many cases where they are in fact totally ordered (all possible interpretations on one side are below all possible interpretations on the other), so we might want to do better than always failing on ambiguities. I'm not sure it's worth being perfectionist on this -- I definitely wasn't.

darrenldl commented 3 years ago

Okay yep I see the niceties it'd offer.

Span now does have a compare, and Duration is no longer around.

And yep indeed compare is well defined for Date_time.t for majority of cases - though what should happen if things should fail though?

In that direction of thinking, I'm tempted to have an alternative signature for Date_time.compare, namely t -> t -> int option, but I have no clue if this is better or worse than failing via exception.

gasche commented 3 years ago

Maybe you could have a compare_single function that fails with the "usual" exception if there is an ambiguity (sometimes it could still work if either of the datetimes are ambiguous), and a compare function with a more demanding type. (You could also have compare_min and compare_max but this is probably overkill.) I don't think that you can use local_result for the return type of comapre, so maybe option makes sense indeed.

Drup commented 3 years ago

You could also have compare_min and compare_max but this is probably overkill.

Actually, I think this is the better solution. It's always defined, the user pick one, and there are no surprises. I expect for almost all uses cases (except maybe making a Map, but then it doesn't matter), there will be an obvious good one.

darrenldl commented 3 years ago

Yep, as @Drup hinted (for the case of Map) the caveat would be if Timedesc.compare_min/max dt1 dt2 yields 0, it may not actually mean Timedesc.equal dt1 dt1.

OTOH, we can define compare using lexicographic order, since date time is meaningfully a triple of time zone * date * time, and we can fix an arbitrary ordering to time zones (say by names). This would yield closer relation to equal but less useful in practice.

Yeah idk - do we change the behaviour of Timedesc.equal such that it also only reasons at the level of "observable timestamps", or just make it explicit that Timedesc.compare works differently?

gasche commented 3 years ago

For me, if a datatype is meant to represent a point in time (or a set of points in time), then the equality and comparison operators should reason on points in time, not on the (fancy word: intentional) details of how this point of time was described. There is of course room for a variety of operators and their nuances, but the one I actually have a use for is "sort these times by (anti)chronological order please".

For correctness I think that it would make sense to adopt the same approach as for other operations: have a version that assumes no ambiguity issues and raises otherwise, and a function that returns a richer datatype (possibly option if the ambiguous space is too hard to describe). (One subtlety being that there may be no ambiguity issue even in situations where one or both of the dates are ambiguous.)

darrenldl commented 3 years ago

For me, if a datatype is meant to represent a point in time (or a set of points in time), then the equality and comparison operators should reason on points in time, not on the (fancy word: intentional) details of how this point of time was described. There is of course room for a variety of operators and their nuances, but the one I actually have a use for is "sort these times by (anti)chronological order please".

I think that is reasonable, just I don't want to give the impression that compare dt1 dt2 = 0 means dt1 may be replaced with dt2 in any context whatsoever. I am not sure what's a good way to encode this intention however.

darrenldl commented 3 years ago

How does having val compare_chrono : t -> t -> int option (and compare_chrono_max/min) and val compare_struct : t -> t -> int sound?

Former is the one more useful for ordering, but compare_chrono x y = 0 yields no meaning in terms of structural equality, latter is useful for creating maps, and compare_struct x y = 0 exactly when equal x y.

darrenldl commented 3 years ago

Added compare_chrono_min/max and compare_struct at a31a67f0d217065ee4067f03f357238934a7d9d1

Not adding compare_chrono : t -> t -> int option since it seems awkward to use in practice.