python / cpython

The Python programming language
https://www.python.org
Other
63.4k stars 30.36k forks source link

Improvement mktime error reporting #88579

Open 6ce66291-3b82-429b-bf6d-112c02f77f99 opened 3 years ago

6ce66291-3b82-429b-bf6d-112c02f77f99 commented 3 years ago
BPO 44413
Nosy @malemburg, @terryjreedy, @abalkin, @tiran, @pganssle, @corona10, @Fidget-Spinner, @akulakov, @wasade
PRs
  • python/cpython#30124
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['type-bug', '3.9', '3.10', '3.11'] title = 'Improvement mktime error reporting' updated_at = user = 'https://bugs.python.org/psijic' ``` bugs.python.org fields: ```python activity = actor = 'wasade' assignee = 'none' closed = False closed_date = None closer = None components = [] creation = creator = 'psijic' dependencies = [] files = [] hgrepos = [] issue_num = 44413 keywords = [] message_count = 15.0 messages = ['395758', '396102', '397600', '407346', '408615', '408627', '408642', '408644', '408650', '408658', '408659', '408660', '408977', '408978', '408979'] nosy_count = 10.0 nosy_names = ['lemburg', 'terry.reedy', 'belopolsky', 'christian.heimes', 'p-ganssle', 'corona10', 'kj', 'andrei.avk', 'psijic', 'wasade'] pr_nums = ['30124'] priority = 'normal' resolution = None stage = 'needs patch' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue44413' versions = ['Python 3.9', 'Python 3.10', 'Python 3.11'] ```

    6ce66291-3b82-429b-bf6d-112c02f77f99 commented 3 years ago
    date_str = "2019-06-06 10:02:00"
        datetime_obj = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
        datetime_obj_utc1 = datetime_obj.replace(tzinfo=pytz.timezone("America/New_York"))
        datetime_obj_utc2 = pytz.timezone("America/New_York").localize(datetime_obj)
        print(datetime_obj_utc1, datetime_obj_utc2)
        ts1 = time.mktime(datetime_obj_utc1.timetuple())
        ts2 = time.mktime(datetime_obj_utc2.timetuple())
    >>> Output:
        ts2 = int(time.mktime(datetime_obj_utc2.timetuple()))
    OverflowError: mktime argument out of range

    OS: Linux OpenSuse TW

    terryjreedy commented 3 years ago

    An exception is not a 'crash', as defined for this tracker. 3.8 only gets security fixes. A 3.9 or 3.10 test is needed and Windows does not have pytz.

    terryjreedy commented 3 years ago

    The posted code is incomplete, improperly indented, and a bit jumbled with respect to prompts. If the bug is real, it should be possible to reproduce without pytz. I do not have it on either Windows or macOS. Vyacheslav, can you post a better failing example?

    akulakov commented 2 years ago

    Cannot reproduce with 3.8.6 on MacOS. The code runs without any errors. As OP hasn't responded in 4 months I think we should close as "works for me".

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    I'd like to politely request this issue be reopened.

    We recently observed a similar, if not the same, OverflowError during continuous integration using Github Actions on ubuntu-latest (20.04). We can produce the error using pure Python without pytz.

    We have been unable to replicate the error using Github Actions on macos-latest, Centos 7.9, Ubuntu 18.04 or OSX 11.6.1.

    Specifically, the following code, which I believe is pure python, will trigger the overflow. We've observed this using Python 3.7 and 3.9.

    import time
    print(time.mktime((2017,5,26,15,30,16,4,146,1)))

    Exact error output from CI can be found at the link below:

    https://github.com/biocore/microsetta-private-api/runs/4536611219?check_suite_focus=true#step:4:117

    On a passing system, we receive "1495837816.0" as output. On a failing system, we observe "OverflowError: mktime argument out of range".

    More detail can be found on our issue with pytz and stub42 who helped guide a more definitive example of the bug showing it could be pure Python. That issue can be found here:

    https://github.com/stub42/pytz/issues/66

    Last, I apologize in advance if any detail is missing here or if this is not formatted as well as it should be. This is my first bug report with Python and am unfamiliar with the norms. Please let me know if further information may be helpful.

    akulakov commented 2 years ago

    Confirmed on Ubuntu buildbot:

    https://github.com/python/cpython/runs/4537544103?check_suite_focus=true

    terryjreedy commented 2 years ago

    Thank you both. Status: Failures on Open Suse TW (Vyacheslav) and Ubuntu 20-04 (Daniel McDonald, Github Actions, and Azure Pipelines). Success on Windows (me), macOS (Catalina-me and 11.6.1-DM), Centos 7.9 and Ubuntu 18-04 (both DM)

    I verified that the sample tuple is consistent and round-trips with time.localtime.

    >>> import time
    >>> time.mktime((2017,5,26,15,30,16,4,146,1))
    1495827016.0
    >>> time.localtime(time.mktime((2017,5,26,15,30,16,4,146,1)))
    time.struct_time(tm_year=2017, tm_mon=5, tm_mday=26, tm_hour=15, tm_min=30, tm_sec=16, tm_wday=4, tm_yday=146, tm_isdst=1)

    While OverflowError is documented as a legitimate response, it is not for a contemporary valid datetime such as the above. Someone with a failure machine could try to determine what tuple vales do and don't result in the error. Someone with both failure and success machines could add debug prints to mktime (or use C debugger) to see where the behavior of the C code diverges on the two machines. If no one active on the issue can do either, a request could be made for help on python-list.

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    Thank you, Terry. I'm currently exploring modifications to the test Andrei made within a fork of CPython using Github Actions (ubuntu-latest). These modifications include debug prints in the CPython mktime call, and some parameter exploration. I expect to have a summary of the observations this afternoon.

    tiran commented 2 years ago

    time.mktime() is a thin wrapper around the libc function mktime(3). I can confirm that glibc's mktime() on Debian 11 with glibc 2.31 is failing and returning error code -1. Fedora 35 with glibc 2.34, Alpine with musl libc 1.2.2, and RHEL 8 with glibc 2.28 are working fine.

    This suggests that it is not a bug in Python, but Debian-specific platform issue. Could you please open a bug with Debian and Ubuntu?

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    The use of tm_isdst=1 appears to trigger the overflow, and occurs when varying other aspects of the timetuple.

    Python's mktime wrapper can throw OverflowError in two places. The thrown error is the second location, following the call to glibc's mktime occurring in this specific if block:

    https://github.com/python/cpython/blob/f62420c3d3f5d87f2b57e54b2a98682bc835f7b6/Modules/timemodule.c#L1038-L1046

    Modification of Modules/timemodule.c as to printf tm struct members confirms that when tm_isdst=1, the result from mktime() is -1, and the tm_wday member is set to -1, the combination of which triggers the conditional.

    mktime within the Github Action ubuntu-latest (20.04) environment is sourced from glibc 2.31 as confirmed by ldd and nm on the compiled Python binary.

    As a proof of concept, C program was written that calls mktime independent of CPython (can be compiled with "gcc bug.c -o bug"). The code below succeeds on both OSX 11.6.1 and Centos 7.9:

    #include <time.h>
    #include <stdio.h>
    
    void do_test() {
      struct tm tm_works = { .tm_year=117,
                             .tm_mon=4,
                             .tm_mday=26,
                             .tm_hour=15,
                             .tm_min=30,
                             .tm_sec=16,
                             .tm_wday=4,
                             .tm_yday=145,
                             .tm_isdst=-1 };
    
      struct tm tm_fails = { .tm_year=117,
                             .tm_mon=4,
                             .tm_mday=26,
                             .tm_hour=15,
                             .tm_min=30,
                             .tm_sec=16,
                             .tm_wday=4,
                             .tm_yday=145,
                             .tm_isdst=1 };

    time_t works = mktime(&tm_works); time_t fails = mktime(&tm_fails);

      if(works == -1) {
          printf("Unexpected failure\n");
      } else {
          if(works == fails) {
              printf("Test passed\n");
          } else {
              printf("Test failed: works=%d; fails=%d\n", (int)works, (int)fails);
          }
      }
    }
    
    int main(int argc, char **argv) {
        do_test();
    }

    When compiled and run within in the Github Actions ubuntu-latest (20.04) environment, the erroneous behavior occurs:

    https://github.com/wasade/cpython/runs/4541212472?check_suite_focus=true#step:17:57

    The use of tm_isdst=1 is valid according to multiple sources, a few examples are linked below. The sources are consistent, and indicate a positive value means Daylight Savings Time is in effect.

    https://www.gnu.org/software/libc/manual/html_node/Broken_002ddown-Time.html https://www.cplusplus.com/reference/ctime/tm/

    At this time, I believe this remains unexpected behavior, however the source appears upstream of CPython as mktime is part of glibc. Unless suggested otherwise, I'll open an issue with on the glibc tracker.

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    Thank you, Christian, I apologize for missing your reply. That is great advice, and I will do so.

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    For reference, the bug reports with Debian and Ubuntu are at the following URLs:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001774 https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1954963

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    A definitive assessment was obtained by Aurelien with Debian:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001774

    In brief, the OverflowError is expected and correct.

    Two factors contribute to the difference in observed behavior. First, the production of the -1 from mktime which Python uses to raise OverflowError depends on whether timezone is or is not set (e.g. "Etc/UTC"). This was confirmed on the Github Actions 20.04 ubuntu instance independent of Python directly with the mktime() function from C. Second, mktime returning -1 to indicate overflow is dependent on glibc >= 2.29, and versions prior "do not report any error if the date is not representable." This second factor is supported by testing the lack of the -1 (and subsequent OverflowError) on Github Actions ubuntu 18.04, which uses an older version of glibc.

    The full thread with Debian is here:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001774

    With regards to CPython, a possible change to consider would be to test if tm_isdst \< 0 prior to the mktime call, and if that evaluates true to provide an error indicating use requires a timezone being set. I think that change would make Python's behavior consistent with glibc >= 2.29, and smooth over a potential platform dependency, while also providing an informative error message. However, I have limited knowledge of how timezones are implemented and am unsure if more than "Etc/UTC" would result in tm_isdst being \< 0.

    If that change makes sense, I'd be happy to submit a pull request to that effect. However, I recognize there may be considerations not accounted for here particularly as this is a central module, so I'll hold off issuing a pull request until hearing back from on this matter.

    terryjreedy commented 2 years ago

    Daniel, I believe your proposal is an enhancement request that could only be implemented in a future version. I changed headers accordingly. I leave it to Christian or someone else to evaluate the request itself.

    2a0162a6-043c-497f-ba6a-4606ad48834f commented 2 years ago

    Sounds good, thank you, Terry