peterhinch / micropython-async

Application of uasyncio to hardware interfaces. Tutorial and code.
MIT License
742 stars 168 forks source link

Sched: Rework to use Unix build to use UTC functions #126

Closed rhermanklink closed 2 months ago

rhermanklink commented 2 months ago

This Pull Request makes the Unix build use the functions gmtime and timegm, which do not perform DST and work on UTC. This would avoid any issues with scheduling when a DST transition happens and makes the functionality of the Unix build the same as the Hardware builds, which do not perform DST due to constraints from using a hardware RCT.

To keep localtime and mktime on the hardware builds the module defaults to those functions if gmtime is unavailable, which is the case for every build outside the Unix build. To achieve this, the module is changed to use two functions from_time and to_time, which get assigned to either localtime and mktime or gmtime and timegm depending on if gmtime is available or not.

The documentation is updated to reflect this (4.3, 4.5), along with an updated note for calendar behavior (4.2) to state that leap years are accounted. The examples given in the documentation do still use localtime and mktime as this is normal usage on hardware builds, however, this can be changed to use from_time and to_time if desired.

Along with this, a few minor changes were made to make the module work on CPython that maintain compatibility with MicroPython.

peterhinch commented 2 months ago

I'll need some time to properly review this, but I have some concerns regarding testing.

Have you tested this on a microcontroller?

I notice that you support targeting times on the 29th February. This has the potential to throw up problems in non-leap years and I made the decision to disallow this. Please explain your test methodology for this case (under Unix and microcontroller builds).

peterhinch commented 2 months ago

On further thought I have two observations.

  1. Microcontroller builds support gmtime. See docs. Further, on a microcontroller, gmtime and localtime poin to the same function: see code. I think I made a mistake in using localtime, inadvertently breaking the code on Unix. The simple fix is is globally, throughout the library, to replace localtime with gmtime. This will work on either platform.
  2. The decision not to support targeting 29th February was deliberate. If a user schedules an event for that date on every year, the intent on non-leap years is unclear. Should it run on 28th (intent being last day of the month) or only on leap years? The way to break the ambiguity is so simple I decided to avoid the problem. Workround (pseudocode):
    async def run_every_feb_28th():
    if is_leap_year():
        await asyncio.sleep(24*3600)  # Same time on the 29th
        schedule_leap_year_activity()
    else:  # Optionally
        schedule_non_leap_year_activity()

I propose that either you or I simply do the global search and replace.

rhermanklink commented 2 months ago

Regarding your observations

  1. I've replaced the usages of localtime and mktime with gmtime and timegm respectively, along with updating the documentation to suggest the usage of those functions. I tested this on the Unix build and on the esp32 and pi pico builds (using wokwi to emulate, as I do not have any microcontrollers in possession).
  2. I see, that makes sense. I'll revert the is_leap_year usage everywhere except in the implementation timegm, where this is needed. I will test if this works consistently across builds and then (if successful) push it.
rhermanklink commented 2 months ago

I've reverted the usage of is_leap_year back to _mdays as described in my previous comment. Now, scheduling with mday set to 29 with month as 2 in a leap year will throw a value error instead of allowing it, as it is bigger than 28. All the other parameters do still work on leap years as they will just regularly execute, just as was the case before the change to gmtime and timegm.

peterhinch commented 2 months ago

I'm afraid I'm struggling with this. Your timegm function replicates the functionality of time.mktime.

For background, the following works identically under CPython and MicroPython:

gmtime() reurns GM time localtime() returns GM time + 1Hr (local DST) mktime(gmtime()) - mktime(localtime()) returns 0 (mktime produces correct results for time now - epoch start). mktime(gmtime()) returns same value on CPython and MP: seconds since epoch (verifies MP is correct).

The following test, pasted into MP Unix build, confirms this:

from time import mktime, gmtime

def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def days_in_month(month, year=gmtime()[0]):
    if month == 2:
        return 29 if is_leap_year(year) else 28
    elif month in [4, 6, 9, 11]:
        return 30
    else:
        return 31

def timegm(time_tuple):
    """Calculate Unix timestamp from UTC."""

    yr, mo, md, h, m, s = time_tuple[:6]

    # calculate days since epoch
    days = 0
    for y in range(1970, yr):
        days += 366 if is_leap_year(y) else 365
    for mot in range(1, mo):
        days += days_in_month(mot, yr)
    days += md - 1

    hours = days * 24 + h
    minutes = hours * 60 + m
    seconds = minutes * 60 + s

    return seconds

print(mktime(gmtime()))
print(timegm(gmtime()))

Both results are identical: there seems no point in implementing this code.

My second difficulty is more wide ranging. The module was primarily designed for microcontrollers which have no concept of DST or of timezones. The docs make clear, when referencing Unix use, that local time is used and warns to beware of DST effects. Initially you convinced me that changing to UTC makes sense in avoiding DST ambiguities; but on reflection this would be severely problematic for someone living in California or Sydney. I think I adopted local time to sidestep this issue.

It is (obviously) possible to fix this - see my astronomy module. The question is whether the added complexity is justified when it brings nothing to the table for microcontroller users.

A further doubt is that the Python schedule module exists. This is designed for PC use.

peterhinch commented 2 months ago

I suspect we are both over-complicating this.

It seems that the OS provides much more support than I had appreciated. This applies to CPython and MicroPython. Consider the following test, which adds two months to the current date and returns the time difference in seconds. The timespan crosses the boundary of a DST change in the UK, so I expected to see different results under Unix compared to a hardware platform.

In practice the test passed under all these conditions (note that to replicate these results on a hardware platform, the RTC should be set to the correct time and date):

In consequence I think that the current published code should work on Unix worldwide and across DST changes, unless of course you can provide evidence to the contrary.

from time import mktime, gmtime, localtime
from sys import implementation
cpython = implementation.name == 'cpython'
if cpython:
    from time import struct_time

start = [2024, 9, 5, 11, 49, 2, 3, 249, 1]
sept = round(mktime(struct_time(start))) if cpython else mktime(start)

end = start[:]
end[1] += 2  # Same time + 2months Crosses DST boundary in the UK

november = round(mktime(struct_time(end))) if cpython else mktime(end)
print(november - sept)

if november - sept == 5270400:
    print('PASS')
else:
    print('FAIL')

[EDIT] To clarify my reasoning.

The published code, run on a microcontroller, cannot possibly have DST or TZ bugs as the underlying time source does not support these constructs. To demonstrate the possibility of bugs when running under Unix, we therefore need a script which (under specified circumstances) produces different results when running under Unix compared with those produced on a microcontroller.

Despite several attempts, I have failed to achieve this.

peterhinch commented 2 months ago

I have added this note to the docs.

Thanks for bringing this issue to my attention, but my conclusion is that the current code should work correctly. If you find evidence to the contrary, please raise a new issue/PR.

peterhinch commented 2 months ago

On further study, in a locale with DST, there are potential bugs when running under Unix.

If, while scheduling a sequence of events, DST causes a jump in system time, there may be unexpected consequences. These may include missed events.

A solution may be more involved than those discussed so far. Given that the primary target is microcontrollers it's possible that the outcome will be to document the constraints rather than posting a fix.

peterhinch commented 2 months ago

I have experimented with a solution but it proved problematic.

Given that the primary target platforms are microcontrollers, I have concluded that this issue is in the "document, won't fix" category. In my view use of this module on the Unix build is primarily to develop microcontroller code, and there are schedulers better suited to actual Unix applications.

I have updated SCHEDULE.md accordingly.