Closed Chronial closed 5 years ago
You can try to make a PR, yes, but DST is an evil beast that we (re)implement every now and then.
Please add tests, and try to make the existing ones still pass without modification.
Thx !
Thanks for the quick response. Looking at this some more, I think this would only be possible by introducing a dependency on a timezone library, since the default tzinfo API does not provide enough information.
Such a dependency is probably not a good idea for this project. I think the solution of cron should be possible with the default tzinfo API.
We already depend on pytz, is that not enough?
Whoops, no, only python_dateutil. I would prefer to add pytz.
With pytz, I should be able to make this work.
@Chronial I think python_dateutil might be sufficient, see https://dateutil.readthedocs.io/en/stable/tz.html
I need the actual list of DST transition points, but that should be available from python_dateutil, too. But I'm not sure if that makes sense with the current interface – I think it would be cleaner to pass the timezone name to croniter.
I will have to construct my own tzinfo anyway, since I need timezones for the smear period.
The implementation is done, but I'm not sure about how to do the API. The concept is this: Given a timezone name, I construct a converter (utilizing the dateutil db) that converts between local wall-time and utc time. This conversion smears DST transitions as described above.
Wrapped in these conversions, croniter can just operate naively without understanding of timezones, so it should support these two interfaces:
But that would not be backwards-compatible with the current interface. Alternatively, I could try to extract the timezone data from the tzinfo of the passed-in start time. But that would be a bit hacky, and might not always work, because the passed-in timezone might be anything and not come from dateutil.
What do you think?
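A minimal sketch of the wrapping idea, not the actual implementation: the naive scheduler only ever sees wall-clock datetimes, and a converter translates on the way in and on the way out. `FixedOffsetConverter`, `next_run_utc` and `next_daily_9am` are hypothetical names, and the converter uses a constant offset as a stand-in, so it does no DST smearing at all.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


class FixedOffsetConverter:
    """Hypothetical stand-in for the smearing converter (constant offset, no DST)."""

    def __init__(self, offset: timedelta) -> None:
        self.offset = offset

    def to_local(self, utc_dt: datetime) -> datetime:
        # Aware UTC datetime -> naive local wall time.
        return (utc_dt.astimezone(timezone.utc) + self.offset).replace(tzinfo=None)

    def to_utc(self, local_dt: datetime) -> datetime:
        # Naive local wall time -> aware UTC datetime.
        return (local_dt - self.offset).replace(tzinfo=timezone.utc)


def next_run_utc(
    converter: FixedOffsetConverter,
    naive_next: Callable[[datetime], datetime],
    after_utc: datetime,
) -> datetime:
    """Run a timezone-unaware scheduler between the two conversions."""
    local_start = converter.to_local(after_utc)
    local_next = naive_next(local_start)  # operates on naive wall time only
    return converter.to_utc(local_next)


def next_daily_9am(local_dt: datetime) -> datetime:
    """Toy naive scheduler: next 09:00 wall time strictly after local_dt."""
    candidate = local_dt.replace(hour=9, minute=0, second=0, microsecond=0)
    if candidate <= local_dt:
        candidate += timedelta(days=1)
    return candidate
```

With croniter installed, `naive_next` could presumably be something like `lambda dt: croniter(expr, dt).get_next(datetime)`; the real converter would replace the constant offset with the smeared mapping.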
That we must find a non-breaking way to do it :( croniter is starting to be heavily used.
I could just add a new timezone parameter that activates timezone-aware mode, with the current DST handling staying active if it's not passed. But that would leave duplicate DST code and be confusing to users. Also, the current DST handling is broken.
I implemented this for now in my project by wrapping croniter. I would gladly contribute to this project, but you will have to make an API decision.
Well, the API does not have to be broken. The timezone can be optional, and used only for naive datetimes.
If the datetime we receive is naive and no timezone is given, we consider it local. We then get the local tz, convert the input datetime to an aware datetime in the local timezone, and continue with that.
Obsolete DST code that is now better handled by using aware datetimes can be rewritten, ensuring that the current tests pass (or are changed to fixed tests ;)).
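The rule proposed here for naive datetimes could look roughly like this. `ensure_aware` is a hypothetical helper, not part of croniter, and the sketch relies only on the standard library.

```python
from datetime import datetime, timezone, tzinfo
from typing import Optional


def ensure_aware(dt: datetime, tz: Optional[tzinfo] = None) -> datetime:
    """Return an aware datetime following the rule described above."""
    if dt.tzinfo is not None and dt.utcoffset() is not None:
        # Already aware: keep the timezone carried by the datetime itself.
        return dt
    if tz is not None:
        # Naive input plus explicit timezone: attach it without shifting the
        # wall time. (pytz zones would need tz.localize(dt) instead.)
        return dt.replace(tzinfo=tz)
    # Naive input, no timezone given: treat it as system-local time.
    return dt.astimezone()
```

Note that `astimezone()` on a naive datetime attaches a fixed-offset timezone for that instant, so the result carries no information about future DST transitions.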
How should non-naive datetimes without a given timezone be handled?
Just to clarify, I mean this call: croniter(crontab, datetime.datetime(2000, 1, 1, tzinfo=some_timezone))
We should extract the timezone from the datetime instance.
Ah, that's the problem. This is not always possible. In order to smear DST, I need the DST transition points, but there is no standardized API for that. I can special-case pytz and dateutil timezones, but this won't work for any other timezones.
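To illustrate the limitation: the standard tzinfo interface can only answer "what is the UTC offset at this instant?", so without library-specific data the transitions would have to be found by probing. A rough sketch of such a probe follows. `find_offset_changes` and `ToyZone` are made up for this example, and the scan assumes that no two transitions are closer together than `step`.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def find_offset_changes(tz, start_utc, end_utc, step=timedelta(hours=1)):
    """Scan [start_utc, end_utc) and return (instant, old_offset, new_offset) tuples."""
    def offset_at(utc_dt):
        return utc_dt.astimezone(tz).utcoffset()

    changes = []
    t, prev = start_utc, offset_at(start_utc)
    while t < end_utc:
        nxt = t + step
        cur = offset_at(nxt)
        if cur != prev:
            # Bisect in whole seconds for the first instant with the new offset.
            lo, hi = 0, int(step.total_seconds())
            while hi - lo > 1:
                mid = (lo + hi) // 2
                if offset_at(t + timedelta(seconds=mid)) == prev:
                    lo = mid
                else:
                    hi = mid
            changes.append((t + timedelta(seconds=hi), prev, cur))
        t, prev = nxt, cur
    return changes


class ToyZone(tzinfo):
    """Hypothetical zone for the demo: UTC+1 until 2020-03-29 01:00 UTC, then UTC+2."""
    SWITCH_UTC = datetime(2020, 3, 29, 1, 0)
    BEFORE, AFTER = timedelta(hours=1), timedelta(hours=2)

    def fromutc(self, dt):
        # dt holds UTC values with tzinfo=self, as datetime.astimezone() passes it.
        is_after = dt.replace(tzinfo=None) >= self.SWITCH_UTC
        return dt + (self.AFTER if is_after else self.BEFORE)

    def utcoffset(self, dt):
        # dt holds local wall-clock values here.
        is_after = dt.replace(tzinfo=None) >= self.SWITCH_UTC + self.AFTER
        return self.AFTER if is_after else self.BEFORE

    def dst(self, dt):
        return self.utcoffset(dt) - self.BEFORE

    def tzname(self, dt):
        return "TOY"
```

Probing like this is slow over long ranges and can miss short-lived offset changes, which is one reason reading the transition list from a specific timezone library is more attractive.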
no activity for a long time.
@Chronial, is your work captured anywhere that's accessible to the public? I'm also quite curious about resolving some timezone/DST transition issues and I'd like to see how far you were able to go with this approach. You mentioned that you had a project that wrapped croniter, is that available?
Yes, that is on github. See https://github.com/djangsters/redis-tasks/blob/master/redis_tasks/smear_dst.py and the way it is used: https://github.com/djangsters/redis-tasks/blob/e5767ad5c391daba4593404f6ddae788ef9d0ba5/redis_tasks/scheduler.py#L29-L32
It's been a while since I wrote that, but I think that should implement exactly what I explained in the first post.
I think the best way to handle DST transitions is to "smear" them. That means:
I think the most important aspect of DST shift handling is that tasks run as often as expected. So a task that is scheduled for a fixed time of day should run every day, independent of whether a DST shift happened on that day. A task that is scheduled to run once per hour should always run exactly 24 times a day.
An alternative to my suggestion is what cron is doing:
The smearing method has two advantages:
[...] cron way, there is an hour in which no tasks will be started.
cron will smush all events that would have happened over an hour into a single second, so instead of happening one after another, they all happen at once.
If the maintainers agree with this approach, I will prepare a PR. This should fix #90 and #91.