Closed MetRonnie closed 4 years ago
(The tests fail because of things like TimeZone inheriting from Duration. I'm checking the Duration tests pass locally as I go along)
Given that we think P365D == P1Y should be False, what about P365D > P1Y or P365D < P1Y? raise ValueError()?
What about P1Y == P1Y? I think that should be True... but my brain hurts
Why are dates so hard!
I think it's ok if these comparison methods come with the caveat that they operate under certain assumptions. Accurate comparisons require sensical entities to compare. I would be happy to extend that to __eq__, it's just that __hash__ gets in the way!
How about just changing the rule for equality and leaving the rest as it is (with a warning) since there is no correct solution:
P365D != P1Y
P1Y == P1Y
P365D < P1Y
P1Y <= P1Y
Why are dates so hard!
Did you remove the stone?
Isn't P365D < P1Y clear, but dependent on your calendar?
It is dependent on the calendar but also on what time point you add it to:
2020-01-01T00Z + P1Y = 2021-01-01T00Z
2020-01-01T00Z + P365D = 2020-12-31T00Z
2019-01-01T00Z + P365D = 2020-01-01T00Z
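The three sums above can be checked with the standard library. This is only a sketch using Python's datetime module rather than isodatetime; add_one_year and add_365_days are hypothetical helpers, and the nominal year is modelled by bumping the year field (which ignores the 29 February edge case).

```python
from datetime import datetime, timedelta, timezone


def add_one_year(point: datetime) -> datetime:
    """Nominal P1Y: same month/day/time, next year (no 29 Feb handling)."""
    return point.replace(year=point.year + 1)


def add_365_days(point: datetime) -> datetime:
    """Exact P365D: always 365 * 24 hours later."""
    return point + timedelta(days=365)


leap_start = datetime(2020, 1, 1, tzinfo=timezone.utc)    # 2020 has 366 days
common_start = datetime(2019, 1, 1, tzinfo=timezone.utc)  # 2019 has 365 days

for start in (leap_start, common_start):
    print(start.date(), "+ P1Y   =", add_one_year(start).date())
    print(start.date(), "+ P365D =", add_365_days(start).date())
```

Starting from 2019 the two results coincide; starting from 2020 they differ by a day, which is why the comparison has no calendar-independent answer.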
Actually I'm thinking it might be safe, and perhaps desirable even, to let P1D exactly equal PT24H (and if day is an exact unit that implies week is an exact unit).
Currently when you subtract a TimePoint that is in a different time zone from a TimePoint that is in UTC, the former is converted to UTC to do the calculation. So you won't get P1D != PT24H there. (And we don't treat DST any differently to just normal time zones).
I can't think of a case where P1D != PT24H, and I suspect most users will assume P1D == PT24H anyway. I think only if we implement leap seconds will it be an issue.
From ISO 8601:
The duration of a calendar day is 24 hours; except if modified by:
- the insertion or deletion of leap seconds, by decision of the International Earth Rotation Service (IERS), or
- the insertion or deletion of other time intervals, as may be prescribed by local authorities to alter the time scale of local time.
I don't think the latter applies for a fixed Calendar in isodatetime.
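As a rough analogy for the subtraction behaviour described above (convert to UTC first, then subtract), here is how the standard library's aware datetime objects behave. This is not isodatetime code; it is only a sketch showing that with fixed offsets and no leap seconds, a one-day difference is always exactly 24 hours.

```python
from datetime import datetime, timedelta, timezone

# A point in UTC and a point written with a fixed +05:00 offset.
utc_point = datetime(2020, 1, 2, 0, 0, tzinfo=timezone.utc)
plus_five = timezone(timedelta(hours=5))
# 2020-01-01T05:00+05:00 is the same instant as 2020-01-01T00:00Z.
offset_point = datetime(2020, 1, 1, 5, 0, tzinfo=plus_five)

# Aware datetimes with different tzinfo are normalised to UTC before
# subtraction, so the wall-clock offset does not leak into the result.
difference = utc_point - offset_point

print(difference == timedelta(days=1))    # day-based view
print(difference == timedelta(hours=24))  # hour-based view
```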
How about just changing the rule for equality and leaving the rest as it is (with a warning) since there is no correct solution
When you say warning, is that warnings.warn() or just leave a note of caution in the docstring? @oliver-sanders
Caution in the docstring I think. A formal warning is a bit over the top.
Problem with making TimePoint hashable: TimePoints representing the same point in time but differing in time zone and whether they're in calendar date mode, ordinal date mode or week date mode all evaluate as equal. But the methods can return different values for each of these.
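To make the problem concrete, here is a minimal sketch with a toy class. SketchPoint is hypothetical and far simpler than the real TimePoint: it holds only an hour and a whole-hour UTC offset, and its equality and hash are based on the UTC instant. A method cached on self then returns the wrong value for an equal-but-different instance.

```python
from functools import lru_cache


class SketchPoint:
    """Hypothetical toy time point: hour of day plus UTC offset in hours."""

    def __init__(self, hour, utc_offset):
        self.hour = hour
        self.utc_offset = utc_offset

    def _utc_hour(self):
        return (self.hour - self.utc_offset) % 24

    def __eq__(self, other):
        if not isinstance(other, SketchPoint):
            return NotImplemented
        # Loose equality: same instant, regardless of time zone.
        return self._utc_hour() == other._utc_hour()

    def __hash__(self):
        return hash(self._utc_hour())

    @lru_cache(maxsize=None)
    def get_hour(self):
        # Depends on the local representation, which the hash ignores.
        return self.hour


utc_noon = SketchPoint(12, 0)
same_instant = SketchPoint(17, 5)  # 17:00+05:00 is 12:00Z

first = utc_noon.get_hour()       # computed and cached under utc_noon
second = same_instant.get_hour()  # cache hit on the "equal" key
print(first, second, same_instant.hour)
```

The second call never runs the method body: the cache treats same_instant as the key it already has, so the local hour of the first instance comes back.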
[insert cathartic scream]
Oh god, and it seemed so simple on the surface.
If it's that bad we might want to consider another approach, perhaps we should implement new equality methods which use a more loosely defined equality e.g. TimePoint.equivalent_to or something like that. But make __eq__ strict as per the pythonic definition.
Good idea, bad idea, better idea?
The good news is that Cylc hides the isodatetime classes behind its cycling classes. Whilst this is a bad architectural decision it does at least mean that the code base isn't doing comparisons on raw isodatetime objects so the impact of this to Cylc is likely to be minimal.
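The equivalent_to idea floated above might look something like the following. This is only a sketch under simplifying assumptions: StrictPoint is a hypothetical toy class (hour plus whole-hour UTC offset), not the isodatetime TimePoint, and equivalent_to is just the name suggested in the comment.

```python
class StrictPoint:
    """Hypothetical toy time point with strict __eq__ and a loose comparison."""

    def __init__(self, hour, utc_offset):
        self.hour = hour
        self.utc_offset = utc_offset

    def _fields(self):
        return (self.hour, self.utc_offset)

    def __eq__(self, other):
        if not isinstance(other, StrictPoint):
            return NotImplemented
        # Strict: every stored property must match.
        return self._fields() == other._fields()

    def __hash__(self):
        # Consistent with __eq__: built from the same fields.
        return hash(self._fields())

    def equivalent_to(self, other):
        """Loose: True if both represent the same instant in UTC."""
        return ((self.hour - self.utc_offset) % 24
                == (other.hour - other.utc_offset) % 24)


utc_noon = StrictPoint(12, 0)
same_instant = StrictPoint(17, 5)

print(utc_noon == same_instant)              # strict comparison
print(utc_noon.equivalent_to(same_instant))  # loose comparison
```

With this split, hashing and caching key off the strict fields, while callers who only care about the instant opt in to the looser check explicitly.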
For methods that depend on "variable" properties that don't affect the hash of the instance, could we just create a wrapper method that takes the variable properties as args?
Thus, two instances of different time zones, but representing the same point when converted to UTC, would be equal but the method to get the hours, minutes and seconds wouldn't get cached erroneously?
Something like this:
from functools import lru_cache

class TimePoint:
    ...
    def get_hours_mins_secs(self):
        # Pass the variable property so it becomes part of the cache key
        return self._get_hours_mins_secs(self.time_zone)

    @lru_cache(1000)
    def _get_hours_mins_secs(self, _time_zone):
        return (self.hour_of_day, self.minute_of_hour, self.second_of_minute)
Without caching, the performance has gone down; the tests on Travis are taking twice as long now.
test_time_point.py, before:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2198    0.185    0.000    4.826    0.002 test_time_point.py:72(_do_test_dates)
    19782    0.053    0.000    2.154    0.000 data.py:1522(__sub__)
   153818    0.268    0.000    1.879    0.000 data.py:357(__init__)
   156016    1.105    0.000    1.630    0.000 data.py:2416(_type_checker)
    35168    0.101    0.000    1.285    0.000 data.py:1280(__add__)
    87920    0.079    0.000    1.216    0.000 data.py:646(copy)
    90118    0.075    0.000    1.168    0.000 data.py:642(__init__)
    43960    0.218    0.000    0.968    0.000 data.py:1361(copy)
     4396    0.024    0.000    0.755    0.000 data.py:1466(__gt__)
    37366    0.027    0.000    0.622    0.000 case.py:847(assertEqual)
    46158    0.510    0.000    0.562    0.000 data.py:2070(get_calendar_date_from_week_date)
    17584    0.154    0.000    0.505    0.000 data.py:1369(get_props)
    19782    0.021    0.000    0.477    0.000 data.py:966(get_ordinal_date)
     8792    0.005    0.000    0.467    0.000 case.py:840(_baseAssertEqual)
    19782    0.013    0.000    0.447    0.000 data.py:2135(get_ordinal_date_from_week_date)
    24178    0.078    0.000    0.424    0.000 data.py:508(__mul__)
     4396    0.011    0.000    0.405    0.000 data.py:1379(__eq__)
    26376    0.031    0.000    0.375    0.000 data.py:939(get_calendar_date)
    56930    0.271    0.000    0.347    0.000 data.py:1601(tick_over)
     8792    0.012    0.000    0.319    0.000 data.py:1111(set_time_zone)
     8792    0.009    0.000    0.303    0.000 data.py:505(__sub__)
Update: managed to get it back down
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2198    0.186    0.000    4.819    0.002 test_time_point.py:72(_do_test_dates)
    43960    0.101    0.000    2.729    0.000 data.py:1471(__add__)
    56942    0.216    0.000    2.066    0.000 data.py:1720(_tick_over)
    19782    0.045    0.000    1.896    0.000 data.py:1639(__sub__)
    56942    1.763    0.000    1.826    0.000 data.py:1805(_tick_over_day_of_month)
    59310    0.084    0.000    0.610    0.000 data.py:422(__init__)
     8792    0.039    0.000    0.548    0.000 data.py:1572(_cmp)
    41762    0.362    0.000    0.545    0.000 data.py:2523(_type_checker)
    46158    0.212    0.000    0.535    0.000 data.py:1544(_copy)
   101072    0.256    0.000    0.418    0.000 data.py:494(_copy)
    37366    0.026    0.000    0.372    0.000 case.py:847(assertEqual)
     4396    0.004    0.000    0.347    0.000 data.py:1633(__gt__)
    17584    0.173    0.000    0.347    0.000 data.py:1552(get_props)
     8792    0.011    0.000    0.259    0.000 data.py:1294(to_time_zone)
     8792    0.005    0.000    0.220    0.000 case.py:840(_baseAssertEqual)
  2237176    0.217    0.000    0.217    0.000 {built-in method builtins.getattr}
     4396    0.002    0.000    0.207    0.000 data.py:1624(__eq__)
    24178    0.069    0.000    0.204    0.000 data.py:618(__mul__)
  1654753    0.158    0.000    0.158    0.000 {built-in method builtins.setattr}
     8792    0.008    0.000    0.138    0.000 data.py:615(__sub__)
    10990    0.008    0.000    0.129    0.000 data.py:1160(get_ordinal_date)
but the overall test suite still takes about 35% longer on Travis. I guess a (small) part of that is the addition of more tests. Otherwise it might be Travis being funny; locally I get much the same speed for the non-runslow tests before and after.
(note I've renamed the milestone to 2.1 as this is a pretty significant change)
Let's see how Cylc Flow handles this 😈
Close #162 (plus #166 and #50).
As noted in the discussion below, hashing requires that any two instances which compare equal with the == operator also have the same hash. This is tricky for Durations with nominal units, as P1Y != P365D (explained in #167).
The behaviour settled on, which is not ideal, is to have P1Y tacitly equal to 365 days (or rather CALENDAR.ROUGH_DAYS_IN_YEAR) for the sake of the <, <=, > etc. operators (this behaviour is pretty much unchanged from before), but not actually explicitly equal when it comes to ==.