python / cpython

The Python programming language
https://www.python.org/

zoneinfo may give incorrect dst() in Europe/Minsk in 1942 #85105

Open pganssle opened 4 years ago

pganssle commented 4 years ago
BPO 40933
Nosy @pganssle

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = 'https://github.com/pganssle'
closed_at = None
created_at =
labels = ['3.8', '3.9']
title = 'zoneinfo may give incorrect dst() in Europe/Minsk in 1942'
updated_at =
user = 'https://github.com/pganssle'
```

bugs.python.org fields:

```python
activity =
actor = 'p-ganssle'
assignee = 'p-ganssle'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'p-ganssle'
dependencies = []
files = []
hgrepos = []
issue_num = 40933
keywords = []
message_count = 1.0
messages = ['371141']
nosy_count = 1.0
nosy_names = ['p-ganssle']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue40933'
versions = ['Python 3.8', 'Python 3.9']
```

pganssle commented 4 years ago

Related to bpo-40930 and bpo-40931, it seems that in 1942 only, zoneinfo.ZoneInfo returns -01:00 for DST in Europe/Minsk:

    >>> from datetime import datetime, timedelta
    >>> from backports.zoneinfo import ZoneInfo
    >>> datetime(1942, 1, 1, tzinfo=ZoneInfo("Europe/Minsk")).dst() // timedelta(hours=1)

It looks like this occurs because they transitioned directly from MSK to CEST, jumping back 1 hour, then started switching between CEST and CET.

    $ zdump -V -c 1941,1944 'Europe/Minsk'
    Europe/Minsk  Fri Jun 27 20:59:59 1941 UT = Fri Jun 27 23:59:59 1941 MSK isdst=0 gmtoff=10800
    Europe/Minsk  Fri Jun 27 21:00:00 1941 UT = Fri Jun 27 23:00:00 1941 CEST isdst=1 gmtoff=7200
    Europe/Minsk  Mon Nov  2 00:59:59 1942 UT = Mon Nov  2 02:59:59 1942 CEST isdst=1 gmtoff=7200
    Europe/Minsk  Mon Nov  2 01:00:00 1942 UT = Mon Nov  2 02:00:00 1942 CET isdst=0 gmtoff=3600
    Europe/Minsk  Mon Mar 29 00:59:59 1943 UT = Mon Mar 29 01:59:59 1943 CET isdst=0 gmtoff=3600
    Europe/Minsk  Mon Mar 29 01:00:00 1943 UT = Mon Mar 29 03:00:00 1943 CEST isdst=1 gmtoff=7200
    Europe/Minsk  Mon Oct  4 00:59:59 1943 UT = Mon Oct  4 02:59:59 1943 CEST isdst=1 gmtoff=7200
    Europe/Minsk  Mon Oct  4 01:00:00 1943 UT = Mon Oct  4 02:00:00 1943 CET isdst=0 gmtoff=3600
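
To make the suspected cause concrete, here is a minimal sketch of a naive rule that takes the DST offset to be the gmtoff of a DST period minus the gmtoff of the standard period immediately before it. This is only an illustration of the arithmetic on the zdump rows above, not zoneinfo's actual implementation; `naive_dst_offsets` is a hypothetical helper name.

```python
import re
from datetime import timedelta

# Rows copied verbatim from the zdump output above.
ZDUMP = """\
Europe/Minsk  Fri Jun 27 20:59:59 1941 UT = Fri Jun 27 23:59:59 1941 MSK isdst=0 gmtoff=10800
Europe/Minsk  Fri Jun 27 21:00:00 1941 UT = Fri Jun 27 23:00:00 1941 CEST isdst=1 gmtoff=7200
Europe/Minsk  Mon Nov  2 00:59:59 1942 UT = Mon Nov  2 02:59:59 1942 CEST isdst=1 gmtoff=7200
Europe/Minsk  Mon Nov  2 01:00:00 1942 UT = Mon Nov  2 02:00:00 1942 CET isdst=0 gmtoff=3600
Europe/Minsk  Mon Mar 29 00:59:59 1943 UT = Mon Mar 29 01:59:59 1943 CET isdst=0 gmtoff=3600
Europe/Minsk  Mon Mar 29 01:00:00 1943 UT = Mon Mar 29 03:00:00 1943 CEST isdst=1 gmtoff=7200
Europe/Minsk  Mon Oct  4 00:59:59 1943 UT = Mon Oct  4 02:59:59 1943 CEST isdst=1 gmtoff=7200
Europe/Minsk  Mon Oct  4 01:00:00 1943 UT = Mon Oct  4 02:00:00 1943 CET isdst=0 gmtoff=3600
"""

# Matches the trailing "<abbr> isdst=<0|1> gmtoff=<seconds>" part of a row.
ROW = re.compile(r"(\w+) isdst=(\d) gmtoff=(-?\d+)$")


def naive_dst_offsets(zdump_text):
    """Return (std_abbr, dst_abbr, offset) for each standard -> DST transition.

    Hypothetical helper: the offset is simply the gmtoff difference across
    the transition, which is the naive rule being illustrated here.
    """
    rows = [ROW.search(line).groups() for line in zdump_text.splitlines()]
    found = []
    # zdump prints rows in pairs: the last instant before a transition and
    # the first instant after it.
    for before, after in zip(rows[0::2], rows[1::2]):
        if before[1] == "0" and after[1] == "1":
            diff = timedelta(seconds=int(after[2]) - int(before[2]))
            found.append((before[0], after[0], diff))
    return found


for std, dst, diff in naive_dst_offsets(ZDUMP):
    print(std, "->", dst, diff // timedelta(hours=1))
# MSK -> CEST -1
# CET -> CEST 1
```

Under this rule the 1941 MSK to CEST transition yields -1 hour, while the 1943 CET to CEST transition yields the expected +1 hour.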

This might get fixed automatically if we adopt the "plurality" heuristic in bpo-40930, though we might also consider a heuristic that puts greater weight on a transition if the names associated with it differ only by the substitution or insertion of a single letter (for example, CET and CEST).
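
As a rough sketch of what the name-based test could look like, the function below checks whether two abbreviations differ by exactly one substituted letter or one inserted letter. The name `differ_by_one_letter` is hypothetical and nothing like it exists in zoneinfo; how such a check would be weighted against other evidence is left open.

```python
def differ_by_one_letter(std: str, dst: str) -> bool:
    """Return True if the names differ by one substitution or one insertion.

    Hypothetical helper for illustration only.
    """
    if std == dst:
        return False
    if len(std) == len(dst):
        # Same length: exactly one position may differ (EST vs EDT).
        return sum(a != b for a, b in zip(std, dst)) == 1
    shorter, longer = sorted((std, dst), key=len)
    if len(longer) - len(shorter) != 1:
        return False
    # One extra letter: deleting it must give the shorter name (CET vs CEST).
    return any(
        longer[:i] + longer[i + 1:] == shorter for i in range(len(longer))
    )


print(differ_by_one_letter("CET", "CEST"))  # True
print(differ_by_one_letter("EST", "EDT"))   # True
print(differ_by_one_letter("MSK", "CEST"))  # False
```

With this check, the CET/CEST pairs in Europe/Minsk would count as related, while the direct MSK to CEST jump would not.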

I am somewhat puzzled as to why only 1942 is affected, since I would have thought that all the CEST offsets in that stretch would be considered the same ttinfo (and thus all would be assigned the same dstoff).
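
The expectation here can be illustrated with a small sketch that groups the periods from the zdump output by a ttinfo-like key. It assumes a ttinfo is identified by the tuple (gmtoff, isdst, abbreviation); `group_by_ttinfo` and the `PERIODS` list are hypothetical and only restate the zdump rows, they do not reproduce how zoneinfo stores its data.

```python
from collections import defaultdict

# (first UT instant of the period, abbr, isdst, gmtoff), taken from the
# zdump output in this issue.
PERIODS = [
    ("1941-06-27 21:00", "CEST", 1, 7200),
    ("1942-11-02 01:00", "CET", 0, 3600),
    ("1943-03-29 01:00", "CEST", 1, 7200),
    ("1943-10-04 01:00", "CET", 0, 3600),
]


def group_by_ttinfo(periods):
    """Group period start times by an assumed (gmtoff, isdst, abbr) key."""
    groups = defaultdict(list)
    for start, abbr, isdst, gmtoff in periods:
        groups[(gmtoff, isdst, abbr)].append(start)
    return dict(groups)


for key, starts in group_by_ttinfo(PERIODS).items():
    print(key, starts)
```

Under that assumption, the CEST period starting in 1941 and the one starting in 1943 fall under the same key, which is why a single dstoff per ttinfo would be expected to apply to both.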