python / cpython

zoneinfo gives incorrect dst() in Pacific/Rarotonga between 1978 and 1991 #85102

Open pganssle opened 4 years ago

pganssle commented 4 years ago
BPO 40930
Nosy @pganssle

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = 'https://github.com/pganssle'
closed_at = None
created_at =
labels = ['type-bug', 'library', '3.9', '3.10']
title = 'zoneinfo gives incorrect dst() in Pacific/Rarotonga between 1978 and 1991'
updated_at =
user = 'https://github.com/pganssle'
```

bugs.python.org fields:
```python
activity =
actor = 'p-ganssle'
assignee = 'p-ganssle'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'p-ganssle'
dependencies = []
files = []
hgrepos = []
issue_num = 40930
keywords = []
message_count = 1.0
messages = ['371136']
nosy_count = 1.0
nosy_names = ['p-ganssle']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue40930'
versions = ['Python 3.9', 'Python 3.10']
```

pganssle commented 4 years ago

While developing a shim for deprecating pytz, I discovered this issue with the Pacific/Rarotonga zone:

  >>> from datetime import datetime, timedelta
  >>> from backports.zoneinfo import ZoneInfo
  >>> datetime(1991, 2, 1, tzinfo=ZoneInfo("Pacific/Rarotonga")).dst() / timedelta(hours=1)
  1.0

This reports that the DST offset is 1 hour, but in fact it should be 30 minutes, because from 1978 to 1991, Pacific/Rarotonga alternated between -0930 and -10:

$ zdump -V -c 1990,1993 'Pacific/Rarotonga'
Pacific/Rarotonga  Sun Mar  4 09:29:59 1990 UT = Sat Mar  3 23:59:59 1990 -0930 isdst=1 gmtoff=-34200
Pacific/Rarotonga  Sun Mar  4 09:30:00 1990 UT = Sat Mar  3 23:30:00 1990 -10 isdst=0 gmtoff=-36000
Pacific/Rarotonga  Sun Oct 28 09:59:59 1990 UT = Sat Oct 27 23:59:59 1990 -10 isdst=0 gmtoff=-36000
Pacific/Rarotonga  Sun Oct 28 10:00:00 1990 UT = Sun Oct 28 00:30:00 1990 -0930 isdst=1 gmtoff=-34200
Pacific/Rarotonga  Sun Mar  3 09:29:59 1991 UT = Sat Mar  2 23:59:59 1991 -0930 isdst=1 gmtoff=-34200
Pacific/Rarotonga  Sun Mar  3 09:30:00 1991 UT = Sat Mar  2 23:30:00 1991 -10 isdst=0 gmtoff=-36000
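
As a quick check of the arithmetic, the expected DST offset can be derived from the `gmtoff` values in the zdump output above. This is only a sketch: the constant names are made up for illustration and the numbers are copied by hand from the output.

```python
from datetime import timedelta

# gmtoff values copied from the zdump output above (seconds relative to UTC).
# The constant names are hypothetical, chosen for this illustration only.
STD_GMTOFF = -36000  # -10, isdst=0
DST_GMTOFF = -34200  # -0930, isdst=1

# The DST offset is the difference between the DST and standard UTC offsets.
expected_dst = timedelta(seconds=DST_GMTOFF - STD_GMTOFF)

print(expected_dst / timedelta(hours=1))  # 0.5
```

So for this period `dst()` would be expected to return 30 minutes, not the 1 hour reported.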

I believe the error comes from the fact that, until November 1978, they were on -1030 time; they then transitioned directly to -0930 (a 1-hour jump, flagged isdst=1), and only afterwards started alternating between -0930 and -10:

$ zdump -V -c 1977,1980 'Pacific/Rarotonga'
Pacific/Rarotonga  Sun Nov 12 10:29:59 1978 UT = Sat Nov 11 23:59:59 1978 -1030 isdst=0 gmtoff=-37800
Pacific/Rarotonga  Sun Nov 12 10:30:00 1978 UT = Sun Nov 12 01:00:00 1978 -0930 isdst=1 gmtoff=-34200
Pacific/Rarotonga  Sun Mar  4 09:29:59 1979 UT = Sat Mar  3 23:59:59 1979 -0930 isdst=1 gmtoff=-34200
Pacific/Rarotonga  Sun Mar  4 09:30:00 1979 UT = Sat Mar  3 23:30:00 1979 -10 isdst=0 gmtoff=-36000
Pacific/Rarotonga  Sun Oct 28 09:59:59 1979 UT = Sat Oct 27 23:59:59 1979 -10 isdst=0 gmtoff=-36000
Pacific/Rarotonga  Sun Oct 28 10:00:00 1979 UT = Sun Oct 28 00:30:00 1979 -0930 isdst=1 gmtoff=-34200
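
To illustrate the suspected cause, here is a minimal sketch of a "first STD → DST transition" rule applied to the offsets in the zdump output above. It is not the zoneinfo implementation: the function name `first_transition_dst` and the list-of-pairs input format are assumptions made for this example.

```python
from datetime import timedelta

# (gmtoff in seconds, isdst) in chronological order, copied by hand from the
# zdump output above. This input format is hypothetical, for illustration.
OFFSETS = [
    (-37800, 0),  # -1030, before Nov 1978
    (-34200, 1),  # -0930, from Nov 1978
    (-36000, 0),  # -10, from Mar 1979
    (-34200, 1),  # -0930, from Oct 1979
]


def first_transition_dst(offsets):
    """Return the offset change at the first STD -> DST transition."""
    for (prev_off, prev_dst), (cur_off, cur_dst) in zip(offsets, offsets[1:]):
        if not prev_dst and cur_dst:
            return timedelta(seconds=cur_off - prev_off)
    return timedelta(0)  # no STD -> DST transition found


print(first_transition_dst(OFFSETS))  # 1:00:00
```

Under this rule the one-off -1030 to -0930 change in 1978 determines the result, which would match the 1-hour value reported above.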

This is not amazingly important, but it would be a good idea to make the result correct.

Right now I think the heuristic looks for the first example of an STD → DST transition and uses the offset change at that transition as the DST offset. It might be a good idea to change the heuristic to look at *all* examples of such transitions and choose the plurality value, and if there's no plurality value and one of the values is 1H, choose that one.
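
A rough sketch of what such a plurality rule could look like is below. It is not a patch against zoneinfo: the function name `plurality_dst` and the input format are hypothetical, the offsets are only the transitions visible in the zdump excerpts above, and falling back to the first-seen value when a tie does not include 1H is an assumption of this sketch.

```python
from collections import Counter
from datetime import timedelta

ONE_HOUR = timedelta(hours=1)

# (gmtoff in seconds, isdst) in chronological order, limited to the
# transitions shown in the zdump excerpts above (1978-1979 and 1990-1991).
OFFSETS = [
    (-37800, 0),  # -1030
    (-34200, 1),  # -0930 (Nov 1978)
    (-36000, 0),  # -10   (Mar 1979)
    (-34200, 1),  # -0930 (Oct 1979)
    (-36000, 0),  # -10   (Mar 1990)
    (-34200, 1),  # -0930 (Oct 1990)
    (-36000, 0),  # -10   (Mar 1991)
]


def plurality_dst(offsets):
    """Pick the most common offset change across all STD -> DST transitions."""
    counts = Counter(
        timedelta(seconds=cur_off - prev_off)
        for (prev_off, prev_dst), (cur_off, cur_dst) in zip(offsets, offsets[1:])
        if not prev_dst and cur_dst
    )
    if not counts:
        return timedelta(0)
    top = max(counts.values())
    winners = [delta for delta, n in counts.items() if n == top]
    if len(winners) == 1:
        return winners[0]  # clear plurality
    if ONE_HOUR in winners:
        return ONE_HOUR  # tie: prefer the conventional 1-hour value
    return winners[0]  # assumption: otherwise keep the first value seen


print(plurality_dst(OFFSETS))  # 0:30:00
```

With these inputs the single 1-hour change in 1978 is outvoted by the two 30-minute changes, so the sketch returns 30 minutes.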