python / cpython

The Python programming language
https://www.python.org

datetime lacks concrete tzinfo implementation for UTC #49344

Closed brettcannon closed 14 years ago

brettcannon commented 15 years ago
BPO 5094
Nosy @tim-one, @doerwalter, @brettcannon, @mdickinson, @abalkin, @pitrou, @devdanzin, @ezio-melotti, @merwok, @bitdancer, @durban, @4kir4
Files
  • next-patch.txt: against 2.7
  • issue5094.diff
  • issue5094a.diff
  • localtime.py: aware local time implementation
  • datetimeex.py: Python prototype for 3-argument timezone
  • issue5094b.diff
  • issue5094c.diff
  • issue5094d.diff
  • issue5094d1.diff: Fixed utcnow() documentation. No code change from issue5094d.diff.
  • issue5094e.diff
  • issue5094f.diff: 'UTC±HH:MM' and more unit tests
  • issue5094g.diff
  • issue5094h.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields: ```python assignee = 'https://github.com/abalkin' closed_at = created_at = labels = ['extension-modules', 'type-feature'] title = 'datetime lacks concrete tzinfo implementation for UTC' updated_at = user = 'https://github.com/brettcannon' ``` bugs.python.org fields: ```python activity = actor = 'eric.araujo' assignee = 'belopolsky' closed = True closed_date = closer = 'belopolsky' components = ['Extension Modules'] creation = creator = 'brett.cannon' dependencies = [] files = ['17455', '17525', '17533', '17541', '17544', '17569', '17570', '17585', '17586', '17631', '17648', '17657', '17680'] hgrepos = [] issue_num = 5094 keywords = ['patch'] message_count = 91.0 messages = ['80735', '80740', '80745', '80807', '80857', '80866', '81043', '81086', '81119', '81641', '81649', '82818', '106403', '106411', '106412', '106413', '106414', '106415', '106422', '106445', '106476', '106478', '106483', '106484', '106485', '106487', '106490', '106493', '106494', '106496', '106498', '106518', '106565', '106911', '106914', '106920', '106923', '106971', '106973', '106974', '106976', '106977', '106980', '106997', '106998', '107006', '107008', '107059', '107060', '107072', '107092', '107105', '107107', '107158', '107186', '107189', '107190', '107212', '107279', '107290', '107291', '107545', '107546', '107547', '107548', '107552', '107554', '107569', '107608', '107628', '107668', '107670', '107676', '107683', '107733', '107737', '107738', '107742', '107743', '107786', '107787', '107819', '107874', '107886', '107889', '107890', '107891', '107892', '107893', '107894', '107901'] nosy_count = 17.0 nosy_names = ['tim.peters', 'doerwalter', 'brett.cannon', 'mark.dickinson', 'belopolsky', 'ggenellina', 'pitrou', 'techtonik', 'ajaksu2', 'kawai', 'ezio.melotti', 'eric.araujo', 'r.david.murray', 'rafe', 'daniel.urban', 'l0nwlf', 'akira'] pr_nums = [] priority = 'high' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'enhancement' url = 
'https://bugs.python.org/issue5094' versions = ['Python 3.2'] ```

    brettcannon commented 15 years ago

    When you call datetime.datetime.utcnow() you get back a naive datetime object. But why? You asked for UTC as the timezone based on what method call you made. And UTC is a very concrete timezone that never changes.

    It would be nice to have a concrete UTC tzinfo class that utcnow() uses so that at least those datetime instances are non-naive.

    If people have no issues with making this happen I will write the code for the concrete UTC tzinfo instance and make the appropriate changes to utcnow().
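
    A minimal sketch of what such a concrete class could look like, modelled on the UTC example in tzinfo-examples.py. The class name UTC and the constant ZERO are illustrative choices, not the API that was eventually committed.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)


class UTC(tzinfo):
    """Concrete tzinfo for UTC: zero offset, no DST (illustrative only)."""

    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO


# An aware "now" in UTC: the tzinfo is attached to the result.
aware_now = datetime.now(UTC())
print(aware_now.tzinfo is not None)  # True
print(aware_now.utcoffset())         # 0:00:00
```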

    90baf024-6604-450d-8341-d796fe6858f3 commented 15 years ago

    Brett, it might be worth updating tzinfo-examples.py to use your concrete UTC then:

    http://svn.python.org/view/python/trunk/Doc/includes/tzinfo-examples.py?rev=62214&view=markup

    brettcannon commented 15 years ago

    On Wed, Jan 28, 2009 at 18:17, Daniel Diniz <report@bugs.python.org> wrote:

    > Brett, it might be worth updating tzinfo-examples.py to use your concrete UTC then:

    I will if people are generally okay with the idea of adding this class.

    cd9c73b9-729f-435e-b6d3-dbcd2e3e1c6b commented 15 years ago

    I want a UTC tzinfo, too.

    1fd7a44c-f7f2-43ed-9c9f-bafa512b8598 commented 15 years ago

    Seems perfectly reasonable to me.

    pitrou commented 15 years ago

    Please do. The current situation where the doc tells you not to use "naive" datetime objects but Python gives you no way to do so is awful.

    vstinner commented 15 years ago

    The UTC class has to be converted to C. Can someone write a patch for datetimemodule.c (and the doc plus a unit test ;-))?

    brettcannon commented 15 years ago

    On Tue, Feb 3, 2009 at 03:28, STINNER Victor <report@bugs.python.org> wrote:

    > The UTC class has to be converted to C.

    Yes, the example code is just an example. =)

    > Can someone write a patch for datetimemodule.c (and the doc plus a unit test ;-))?

    I might have some people lined up to take this on.

    3aeecd78-b2cd-4760-a6cc-dae21f06c5b2 commented 15 years ago

    I'm going to attempt to implement this feature.

    doerwalter commented 15 years ago

    The patch doesn't include any changes to the documentation.

    3aeecd78-b2cd-4760-a6cc-dae21f06c5b2 commented 15 years ago

    I thought I had uploaded this last night, apologies.

    brettcannon commented 15 years ago

    I am currently doing a review of the patch over at http://codereview.appspot.com/22042 .

    brettcannon commented 14 years ago

    Attaching a new patch by Rafe against Python 2.7. Unfortunately with 2.7 striving for an RC next, this should only target Python 3.2 and not 2.7.

    abalkin commented 14 years ago

    I have two questions about the proposed implementation:

    1. Why not follow pytz's lead and expose an instance of UTC rather than the UTC class itself?

    2. Is there a real need to add a boolean argument to utcnow()? I think datetime.now(UTC()), or utc = UTC() followed by datetime.now(utc), is a more obvious way to produce a TZ-aware datetime.

    If a singleton instance utc is exposed instead of the UTC class, I would suggest changing its repr to 'datetime.utc'.

    On the patch itself, datetime_utcnow() is missing an error check for PyObject_IsTrue() return value:

    >>> class X:
    ...    def __nonzero__(self): raise RuntimeError
    ... 
    >>> datetime.utcnow(tz_aware=X())
    datetime.datetime(2010, 5, 25, 2, 12, 14, 739720, tzinfo=<datetime.UTC object at 0x1015aab80>)
    XXX undetected error

    3aeecd78-b2cd-4760-a6cc-dae21f06c5b2 commented 14 years ago

    Alexander, about 1, that's a pretty good question. I had originally wanted to do something like that but Brett Cannon at the time thought it was not the right approach. I don't recall the details. Maybe Brett can recall. I think we had that conversation in person and it was a long time ago :(

    I had originally thought of doing the class, and then having constants associated with it:

    UTC.UTC0

    Eventually if we support multiple timezones:

    UTC.UTC1 UTC.UTC2 UTC.UTC_1 UTC.UTC_2

    Well... maybe not given how impossible the naming would be.

    I think we also talked about redefining __new__ so that it would always return the same global instance.

    On 2, we had discussions about how to pass parameters in to utcnow that we DID record. When I suggested it, Brett said:

    "...using a boolean flag over an argument is much less error-prone for a developer from passing in the wrong timezone object; passing in something other than an instance of UTC would just be stupid so we should make sure the developer isn't stupid. =)"

    brettcannon commented 14 years ago

    We didn't do a singleton because I don't like singletons. =) Plus they muck with isinstance/issubclass if you don't expose the class. Basically there is no need to have it be a singleton, so why bother?

    And Rafe is right: since utcnow() already exists, why not take advantage of the method? Yes, you could manually call now() with a UTC object, but people are going to notice the utcnow() method and want to use it, so we should make it easy to use the new UTC object on utcnow(). Plus it has the added benefit of providing a transition plan to make utcnow() always return a timezone-aware datetime object.

    abalkin commented 14 years ago

    On Mon, May 24, 2010 at 11:06 PM, Rafe Kaplan <report@bugs.python.org> wrote:

    > On 2, we had discussions about how to pass parameters in to utcnow that we DID record. When I suggested it, Brett said:
    >
    > "...using a boolean flag over an argument is much less error-prone for a developer from passing in the wrong timezone object; passing in something other than an instance of UTC would just be stupid so we should make sure the developer isn't stupid. =)"

    Well, I respectfully disagree. This advice seems to place the convenience of the writer of the code over that of the reader. Imagine encountering an expression datetime.utcnow(True) allowed by your current patch and trying to figure out what it means. This can be improved by making tz_aware a keyword-only argument, but in that case a separate datetime.tz_aware_utcnow() function seems like a better choice.

    Note that I am not suggesting passing anything to utcnow(). I would either leave it unchanged or make it always return aware datetime instances. (Note that with singleton UTC timezone naive datetime instances can be deprecated with no performance penalty.)

    abalkin commented 14 years ago

    > We didn't do a singleton because I don't like singletons. =) Plus they muck with isinstance/issubclass if you don't expose the class.

    I am not sure what you mean by "muck with." Why would anyone want to subclass UTC?

    > Basically there is no need to have it be a singleton, so why bother?

    There are several advantages of having all datetime instances with tzinfo=UTC() referring to the same instance:

    1. Comparison (and I believe subtraction) of aware datetime instances bypasses calculation of utcoffset if their tzinfo attributes refer to the same object.

    2. With the current patch,

    >>> set(UTC() for i in range(3))
    set([<datetime.UTC object at 0x1015aac80>, <datetime.UTC object at 0x1015aad00>, <datetime.UTC object at 0x101a0e040>])

    I don't think this is intended. Basically UTC() instances cannot be meaningfully compared or used as dictionary or set keys. You can fix it by providing custom __eq__ and __hash__, but this problem simply goes away if a singleton is exposed instead.

    3. now(utc) is slightly more readable than now(UTC())

    4. Singleton utc is familiar to pytz users.
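
    The set problem and the custom __eq__/__hash__ fix mentioned above can be sketched as follows. This is an illustration only: the UTC class here is a pure Python stand-in for the C type in the patch, and comparing by type alone is one possible design, assumed for the example.

```python
from datetime import timedelta, tzinfo


class UTC(tzinfo):
    """Stand-in for the patch's UTC type (hypothetical pure Python version)."""

    def utcoffset(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return timedelta(0)

    # Every UTC instance is interchangeable, so equality depends on type only.
    def __eq__(self, other):
        if isinstance(other, UTC):
            return True
        return NotImplemented

    # Keep hashing consistent with __eq__: all instances share one hash.
    def __hash__(self):
        return hash(UTC)


distinct = set(UTC() for _ in range(3))
print(len(distinct))  # 1
```

    Without the two special methods the set would hold three separate objects, as in the session quoted above.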

    pitrou commented 14 years ago

    > Note that I am not suggesting passing anything to utcnow(). I would either leave it unchanged or make it always return aware datetime instances.

    The latter would break compatibility, though (especially given how comparison between "naive" and "aware" datetimes raises an error...).

    I also agree with Brett that a singleton looks rather unnecessary (it also looks quite C++/Java-esque to me).

    On the subject of the patch:

    abalkin commented 14 years ago

    On Tue, May 25, 2010 at 5:45 AM, Antoine Pitrou <report@bugs.python.org> wrote:

    > I also agree with Brett that a singleton looks rather unnecessary (it also looks quite C++/Java-esque to me).

    I still don't understand your aversion to singletons and you did not address any of the advantages that I listed in my previous comment. I don't think singletons are foreign to Python: after all we write None rather than NoneType() .

    We can reach a middle ground by interning UTC instances behind the scenes so that UTC() is UTC() will always be true. This will address most of the issues that I raised and utc = datetime.UTC() is simple enough to write as long as you don't have to worry about sharing utc instance between modules.
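
    One way the interning idea could be sketched in pure Python is to cache the instance in __new__. The _instance attribute is a hypothetical name, and the patch under discussion would have done the equivalent in C.

```python
from datetime import timedelta, tzinfo


class UTC(tzinfo):
    """UTC tzinfo whose constructor always hands back the same object."""

    _instance = None  # cache for the shared instance (hypothetical name)

    def __new__(cls):
        # Create the object once; later calls reuse it.
        # Note: a subclass would inherit the cached UTC instance as written.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def utcoffset(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return timedelta(0)


print(UTC() is UTC())         # True
print(len({UTC(), UTC()}))    # 1
```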

    abalkin commented 14 years ago

    On Mon, May 24, 2010 at 11:06 PM, Rafe Kaplan <report@bugs.python.org> wrote:

    > I had originally thought of doing the class, and then having constants associated with it:
    >
    > UTC.UTC0
    >
    > Eventually if we support multiple timezones:
    >
    > UTC.UTC1 ..
    >
    > Well... maybe not given how impossible the naming would be.

    I actually like your original idea. It seems wasteful to create a concrete timezone class in a C module and only use it for a single timezone. The FixedOffset class in tzinfo-examples.py is only slightly more complicated than the UTC class and, as explained in the comment above it, "FixedOffset(0, "UTC") is a different way to build a UTC tzinfo object." FixedOffset objects can then be used to produce aware datetime instances from strptime. (See bpo-6641.) I would only define a utc = FixedOffset(0, "UTC") instance and make the name argument to FixedOffset optional, defaulting to UTC(+/-)hhmm.

    brettcannon commented 14 years ago

    The singleton dislike from Antoine and me is that they are generally just not liked in the stdlib. None/True/False are special cases because they are syntax, so having None is None ever not work would just be weird. Otherwise singletons are unnecessary in Python. Just look through the stdlib and you will find very few singletons as they are generally considered bad. Having to write a custom __eq__ or __hash__ is just part of being explicit. And trying to make a factory function that always returns the same instance is not a solution either. I understand pytz might use them, but this is the stdlib, so we need to go with what we consider best practice for Python since it will lead to much more use than pytz gets.

    Now if a simple FixedOffsetTimeZone class was added and we just pre-populated the datetime module with a utc attribute that contained an instance of that class set to the proper values for UTC, that I could support without controversy. That would get you your "singleton" by reliably using the same instance without having to try to hack in singleton support.
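
    A rough sketch of that arrangement, assuming a pure Python class in the spirit of FixedOffset from tzinfo-examples.py. The offset in minutes, the optional name and the default "UTC±hhmm" label follow the suggestions made earlier in the thread; none of this is the committed API.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Fixed offset in minutes east of UTC, with no DST (illustrative)."""

    def __init__(self, offset_minutes, name=None):
        self._offset = timedelta(minutes=offset_minutes)
        if name is None:
            # Default name of the form UTC+hhmm / UTC-hhmm.
            sign = "-" if offset_minutes < 0 else "+"
            hours, minutes = divmod(abs(offset_minutes), 60)
            name = "UTC%s%02d%02d" % (sign, hours, minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return timedelta(0)


# One pre-built instance that every caller can share.
utc = FixedOffset(0, "UTC")
est = FixedOffset(-300)

print(est.tzname(None))                                      # UTC-0500
print(datetime(2010, 5, 25, 12, 0, tzinfo=utc).utcoffset())  # 0:00:00
```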

    abalkin commented 14 years ago

    .. Thanks for the explanation. I realize that I should not have used the s-word. :-) In fact I only wanted a module level constant utc = UTC() and did not care much about other UTC class instances and whether any are permitted or easy to create.

    Well, the datetime module is not exactly the place you want to start if you want to lead anyone to best Python practices. :-) (Just think of datetime subclassing from date!)

    Now if a simple FixedOffsetTimeZone class was added and we just pre-populated the datetime module with a utc attribute that contained an instance of that class set to the proper values for UTC, that I could support without controversy.

    This is exactly my preferred solution.

    3aeecd78-b2cd-4760-a6cc-dae21f06c5b2 commented 14 years ago

    "Note that I am not suggesting passing anything to utcnow(). I would either leave it unchanged or make it always return aware datetime instances."

    FYI, all other issues aside, having utcnow() (with no parameters) return a tzaware instance will introduce backward compatibility problems. As it is, users are not expecting utcnow to return a date-time with any tzinfo.
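
    The compatibility concern can be seen in a short example. It uses datetime.timezone.utc, which is the name that eventually shipped in Python 3.2 and was not yet settled at this point in the thread.

```python
from datetime import datetime, timezone

naive = datetime(2010, 5, 25, 12, 0)                       # what utcnow() style code holds
aware = datetime(2010, 5, 25, 12, 0, tzinfo=timezone.utc)  # what an aware utcnow() would return

# Ordering comparisons between naive and aware values are rejected.
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError as exc:
    mixed_comparison_ok = False
    print("TypeError:", exc)

print(mixed_comparison_ok)  # False
```

    Code that stores naive values from utcnow() and compares them with fresh results would start failing if the return type changed silently.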

    abalkin commented 14 years ago

    Roundup bug bites again. Reposting via web:

    On Tue, May 25, 2010 at 5:35 PM, Brett Cannon <report@bugs.python.org> wrote:

    > The singleton dislike from Antoine and me is that they are generally just not liked in the stdlib.

    Thanks for the explanation. I realize that I should not have used the s-word. :-) In fact I only wanted a module level constant utc = UTC() and did not care much about other UTC class instances and whether any are permitted or easy to create.

    > .. so we need to go with what we consider best practice for Python since it will lead to much more use than pytz gets.

    Well, the datetime module is not exactly the place you want to start if you want to lead anyone to best Python practices. :-) (Just think of datetime subclassing from date!)

    > Now if a simple FixedOffsetTimeZone class was added and we just pre-populated the datetime module with a utc attribute that contained an instance of that class set to the proper values for UTC, that I could support without controversy.

    This is exactly my preferred solution.

    abalkin commented 14 years ago

    Note that Brett has already mentioned backward compatibility issues, but suggested that "[adding tz_aware argument may provide] a transition plan to make utcnow() always return a timezone-aware datetime object." [msg106413]

    I would say, let's leave utcnow() alone. It is ugly enough without a boolean argument. I don't see how datetime.now(utc) can be too error prone. I think Brett's comment about a stupid developer was about passing tzinfo instead of bool to utcnow() and that, I agree, makes no sense.

    brettcannon commented 14 years ago

    If we don't modify utcnow (and similar UTC methods) to take a flag saying to use the utc instance, then the methods should at least get deprecated with a message saying that people should be using now(utc), etc.

    abalkin commented 14 years ago

    Brett: "[utcnow] should at least get deprecated with a message saying that people should be using now(utc)"

    Yes, I believe all utcxxx methods of datetime are a kludge due to the lack of concrete UTC tzinfo:

    utcfromtimestamp() -> fromtimestamp(utc)
    t.utctimetuple() -> t.replace(tzinfo=utc).timetuple()
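
    Spelled out with the timezone.utc instance that later shipped in Python 3.2 (an assumption relative to this point in the discussion, where the name was still open), the aware replacements look roughly like this:

```python
from datetime import datetime, timezone

utc = timezone.utc

# Aware counterpart of utcnow().
now = datetime.now(utc)

# Aware counterpart of utcfromtimestamp(0).
epoch = datetime.fromtimestamp(0, utc)
print(epoch.isoformat())  # 1970-01-01T00:00:00+00:00

# A naive value known to be in UTC, tagged as such before taking its timetuple.
t = datetime(2010, 5, 27, 12, 30)
tt = t.replace(tzinfo=utc).timetuple()
print(tt.tm_hour, tt.tm_min)  # 12 30
```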

    brettcannon commented 14 years ago

    OK, it looks like we are reaching consensus here on several points:

    1. Implement FixedOffsetTimezone
    2. Provide a utc attribute on the datetime module that is set to FixedOffsetTimezone(0, "UTC")
    3. Deprecate the various utc* methods with messages pointing out how to use the new utc instance instead of the method

    If this seems reasonable, then I see two questions to answer.

    First is how long to do the deprecations. I say remove in Python 3.5. Existing for three versions is six more years of usage from the time of 3.2's release to that of 3.6. Plus it is easy to be backwards-compatible by showing in the docs how to create one's own UTC class.

    The second is whether we should take this opportunity to fix datetime being a C extension module exclusively. I know PyPy has their own pure Python version of datetime that they plan to eventually contribute. We might as well use this as the chance to create Lib/datetime.py and have that conditionally import _datetimemodule.c (see the heapq module on how to handle this kind of situation). That way PyPy can eventually just drop their code into datetime.py. Biggest issue will be extension modules wanting to use the C extension API, but since this is new stuff it shouldn't affect them except for the module renaming.
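
    The heapq-style arrangement mentioned here defines the pure Python version first and then lets an optional C accelerator replace it. A minimal sketch, in which both the function and the accelerator module name are hypothetical:

```python
# Pure Python fallback, always defined first.
def utc_offset_seconds():
    """Return the UTC offset in seconds (trivial stand-in for real logic)."""
    return 0


# If a C accelerator is available, its definitions override the ones above.
# "_hypothetical_datetime_accel" is an assumed name and does not exist.
try:
    from _hypothetical_datetime_accel import utc_offset_seconds
except ImportError:
    pass

print(utc_offset_seconds())  # 0 when the accelerator is absent
```

    An alternative implementation such as PyPy would then only need the pure Python half.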

    abalkin commented 14 years ago

    I have no opinion on the first question. I would be fine with a simple "soft" deprecation where we simply add a note in documentation warning that these methods create naive datetime instances and it is preferable to use aware variants produced by meth(utc). On the other hand eventually removing these methods will make maintenance easier. Sorry I cannot offer more help with this decision.

    With respect to the second question, I would be against mixed C/Python implementation. I would also like to see C API to the new concrete tzinfo classes.

    brettcannon commented 14 years ago

    The stated long-term goal of the stdlib is to minimize the C extension modules to only those that have to be written in C (modules can still have performance enhancing extension back-ends). Since datetime does not meet that requirement it's not a matter of "if" but "when" datetime will get a pure Python version and use the extension code only for performance.

    If someone wants to implement the C code for a tzinfo concrete class that we are talking about, that's fine. But that will not prevent datetime from getting a pure Python version at some point.

    pitrou commented 14 years ago

    > The second is whether we should take this opportunity to fix datetime being a C extension module exclusively. I know PyPy has their own pure Python version of datetime that they plan to eventually contribute. We might as well use this as the chance to create Lib/datetime.py and have that conditionally import _datetimemodule.c (see the heapq module on how to handle this kind of situation).

    Or we could let PyPy contribute their own version when they want, after all. I think additions to the datetime API are good in themselves, and we shouldn't make them depend on some mythical refactoring of the code into a separate pure Python module ;)

    brettcannon commented 14 years ago

    PyPy has said over the years they plan to commit their version of datetime, they just need to get around to it. I just figured that we could use this opportunity to prepare for it. But if people want to do the C version first, that's fine as they will be the ones writing the patch. I just thought that if someone would rather write a patch to create datetime.py now along with a pure Python version of the proposed class they could.

    abalkin commented 14 years ago

    Here is my first attempt to implement fixed offset timezone type. The patch is based on Brett's next-patch.txt and while I changed the type name from datetime.UTC to datetime.timezone, I did not change the name of the related C structures. I would like to ask for comments on the following questions:

    1. What to call the new type? I like "timezone" because it is likely to be the only concrete tzinfo subclass in the datetime module, so we don't really need to call it fixedoffsetfromutctimezone.

    2. Do we want to add a dst indicator and altname attributes? I would say: no. I would rather treat DST as a different fixed offset timezone.

    3. I am not quite happy about having to specify offset in minutes. I think timezone(hours[, minutes]) may be clearer. Alternatively we may just take offset as a timedelta. Note bpo-5288. There is some interest in supporting sub-minute timezones.

    4. I have fixed a reference leak in utcnow, but I am still against giving it tz_aware argument.

    brettcannon commented 14 years ago

    My thoughts on Alexander's questions:

    1. Call it FixedTimezone or something (remember it has to be CapWords). Calling it simply Timezone does not convey the fact that DST is not supported and people might naively think it will. Its limited abilities should be portrayed in the name.

    2. Keep the class dead-simple. The primary motivator is to support UTC, maybe the %z directive for strptime. Otherwise everything else should be left out of the stdlib and let third-parties manage it as they will be the ones that need to manage the bazillion timezone instances they need. We don't need to dictate an interface to them.

    3. Taking a timedelta makes sense since the class represents the fixed time offset from UTC.

    As for the tz_aware argument to utcnow and friends, I am fine with letting go of them if we have a utc attribute on datetime and we simply document that to get a UTC-aware value do now(datetime.utc) and consider deprecating utcnow.

    abalkin commented 14 years ago

    On Wed, Jun 2, 2010 at 5:24 PM, Brett Cannon <report@bugs.python.org> wrote:

    > 1. Call it FixedTimezone or something (remember it has to be CapWords).

    I thought consistency within a module trumps PEP-8 naming standards. The datetime module (for better or worse) uses lowercase names for its types: date, time, datetime, tzinfo. Shouldn't the new type follow suit? (This will also avoid a source of typos: TimeZone vs. Timezone.)

    I don't like "fixed timezone" - it is not clear what is fixed: the offset, the geographical location or the historical set of rules. I think we should promote the notion that a timezone is just an offset. EST is -5 hours, EDT is -4. New York uses EST in winter and EDT in summer. A zoneinfo database (external to Python) is a mapping from place and time to timezone.

    brettcannon commented 14 years ago

    Forgot about datetime breaking the PEP-8 rules. You're right, consistency wins.

    As for fixedtimezone being odd, that's why my mind went with FixedOffsetTimezone to start, but that doesn't go with the naming of the module, and fixedoffsettimezone is just hard to read.

    As long as the documentation is VERY clear that the timezone class only supports a fixed offset from UTC and nothing else I can live with the "timezone".

    abalkin commented 14 years ago

    I am attaching the next installment of the datetime.timezone class implementation.

    Here I add a utc class attribute to timezone. I decided to place it in the class rather than the module namespace because this seems to be more in line with how the datetime module defines particular instances of its classes such as min, max and resolution. I also feel that writing timezone.utc makes it clearer that it is an instance of the timezone class, while datetime.UTC or simply UTC is more ambiguous.

    I also changed timezone constructor to interpret int or float offset as number of hours and accept arbitrary timedelta between timedelta(hours=-12) and timedelta(hours=12). The rationale is that most common timezones have offsets at whole hours and less common but existing timezones use 1/2 or 1/4 hour offsets and thus can be specified as a binary float without any issue.
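
    The point about binary floats can be checked with timedelta directly: halves and quarters of an hour are exactly representable. The helper below is a hypothetical wrapper around the datetime.timezone class as it ships today, which accepts only a timedelta offset; the int/float interpretation described in this comment is emulated by the wrapper.

```python
from datetime import timedelta, timezone


def fixed_timezone(offset, name=None):
    """Hypothetical helper: take hours as int/float, or a ready-made timedelta."""
    if not isinstance(offset, timedelta):
        offset = timedelta(hours=offset)
    return timezone(offset) if name is None else timezone(offset, name)


india = fixed_timezone(5.5)    # half-hour offset
nepal = fixed_timezone(5.75)   # quarter-hour offset

print(india.utcoffset(None))   # 5:30:00
print(nepal.utcoffset(None))   # 5:45:00
print(fixed_timezone(-5) == timezone(timedelta(hours=-5)))  # True
```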

    I've added tests and some preliminary documentation.

    brettcannon commented 14 years ago

    I don't think people would get confused as to what datetime.utc was, but as you pointed out, Alexander, the module seems to like class attributes so timezone.utc is fine.

    As for the float/int argument, I personally am wary of it. Since the timedelta constructor accepts hours as a keyword argument, I don't see the benefit of having to support both timedeltas and int/floats. And I can see someone messing up and putting in a float that is not perfectly representable and getting upset at odd behavior. I say keep it simple and just accept timedeltas for now. If there really is demand for accepting integers in the constructor then it can be added without backwards-compatibility issues. Better to keep the API small and expand later than make it too big to start and being burdened with extraneous API stuff.

    mdickinson commented 14 years ago

    > accept arbitrary timedelta between timedelta(hours=-12) and timedelta(hours=12)

    Aren't there valid timezones that are offset by more than 12 hours from UTC?

    abalkin commented 14 years ago

    On Thu, Jun 3, 2010 at 3:19 PM, Mark Dickinson <report@bugs.python.org> wrote:

    > Aren't there valid timezones that are offset by more than 12 hours from UTC?

    I am not sure. At this stage treat 12 as a placeholder for whatever the relevant standard says. I've seen suggestions that the range should be (-24, 24) excluding ends <http://www.zope.org/Members/fdrake/DateTimeWiki/BasicDesign> and [-14, 14] <http://msdn.microsoft.com/en-us/library/bb630289.aspx>.

    abalkin commented 14 years ago

    On Thu, Jun 3, 2010 at 3:41 PM, Alexander Belopolsky <report@bugs.python.org> wrote:

    > I am not sure. At this stage treat 12 as a placeholder for whatever the relevant standard says.

    Believe it or not, at least one standard, RFC 2822, allows any offset representable as HHMM: "the zone MUST be within the range -9959 through +9959" <http://tools.ietf.org/html/rfc2822.html>.

    I am inclined to simply remove any range checking and allow arbitrary timedelta as an offset.
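
    For comparison, the datetime.timezone class in current Python releases does check the range: the offset must lie strictly between -24 and +24 hours. That outcome was not yet decided at this point in the thread, so the example below only shows present-day behaviour.

```python
from datetime import timedelta, timezone

# UTC+14, the largest offset mentioned in the discussion, is accepted.
plus_14 = timezone(timedelta(hours=14))
print(plus_14.utcoffset(None))  # 14:00:00

# A full day is outside the allowed open interval and is rejected.
try:
    timezone(timedelta(hours=24))
    full_day_accepted = True
except ValueError:
    full_day_accepted = False

print(full_day_accepted)  # False
```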

    abalkin commented 14 years ago

    On Thu, Jun 3, 2010 at 3:15 PM, Brett Cannon <report@bugs.python.org> wrote:

    > As for the float/int argument, I personally am wary of it. Since the timedelta constructor accepts hours as a keyword argument, I don't see the benefit of having to support both timedeltas and int/floats.

    To my taste, timedelta(hours=-5) is just too verbose. It also forces you to import timedelta in a module that may otherwise not need it.

    > And I can see someone messing up and putting in a float that is not perfectly representable and getting upset at odd behavior.

    Since timedelta accepts floats for hours, this argument does not really hold. Also, with timedelta correctly rounding to the nearest microsecond, it is really hard to mess up with binary vs. decimal representation.

    > I say keep it simple and just accept timedeltas for now. If there really is demand for accepting integers in the constructor then it can be added without backwards-compatibility issues. Better to keep the API small and expand later than make it too big to start and being burdened with extraneous API stuff.

    This is a valid argument. Since it is easier to remove code than to add it, I'll keep int/float support in the patch while we are discussing the design, but I am ready to remove it.

    abalkin commented 14 years ago

    I am having second thoughts about the dst indicator. I wrote:

    > 2. Do we want to add a dst indicator and altname attributes? I would say: no. I would rather treat DST as a different fixed offset timezone.

    and Brett responded:

    > 2. Keep the class dead-simple. The primary motivator is to support UTC, maybe the %z directive for strptime. Otherwise everything else should be left out of the stdlib and let third-parties manage it as they will be the ones that need to manage the bazillion timezone instances they need. We don't need to dictate an interface to them.

    Now note, that with fixed offset timezone class, it is possible to produce aware local times as follows:

    from datetime import datetime, timezone, timedelta
    import time
    EPOCH = datetime(1970, 1, 1)
    def localtime(utctime=None):
        if utctime is None:
            tm = time.localtime()
        else:
            seconds = (utctime - EPOCH).total_seconds()
            tm = time.localtime(seconds)
    
        tz = (timezone(timedelta(seconds=-time.altzone), time.tzname[1])
              if tm.tm_isdst else
              timezone(timedelta(seconds=-time.timezone), time.tzname[0]))
        return datetime(*tm[:6], tzinfo=tz)

    (see also attached localtime.py)

    The problem with the above implementation is that t.timetuple().tm_isdst will always be 0 if t is produced by localtime().

    I don't think adding a fixed dst offset is much of a complication. We already need to override the tzinfo.dst method, and if we only allow timedeltas as the offset and dst arguments to the constructor, the constructor code will be extremely simple.

    brettcannon commented 14 years ago

    So you want a third argument that lets you flag if the timezone is DST or not? I still don't think that will be necessary. If people want to add that they can very easily subclass the timezone class and add support for it. This class should be focused on providing a UTC instance and anything needed for a %z directive in strptime, nothing more. Anything fancier can be handled by libraries like pytz as they need it. Once again, keep the API as simple as possible and add features as needed. I know how tempting it is to design upfront, but just trust me, Alexander, we will all get burned for it later.

    abalkin commented 14 years ago

    > So you want a third argument that lets you flag if the timezone is DST or not?

    The third argument is not a flag, it is a timedelta just like the offset. I am attaching a Python prototype for clarity. (See datetimeex.py.)

    Conceptually, a 3-argument timezone is very simple: tzinfo defines three abstract methods: utcoffset(..), tzname(..), and dst(..). The proposed concrete implementation lets the user provide constant values to be returned from each of these methods.
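
    The attached datetimeex.py is not reproduced in the thread, so the following is only a guess at the shape of such a prototype. The class name ConstantTimezone and its attribute names are hypothetical.

```python
from datetime import datetime, timedelta, tzinfo


class ConstantTimezone(tzinfo):
    """tzinfo returning fixed values for utcoffset, tzname and dst (sketch)."""

    def __init__(self, offset, name, dst=timedelta(0)):
        self._offset = offset  # total offset from UTC, as a timedelta
        self._name = name
        self._dst = dst        # DST component of the offset, as a timedelta

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return self._dst


EST = ConstantTimezone(timedelta(hours=-5), "EST")
EDT = ConstantTimezone(timedelta(hours=-4), "EDT", timedelta(hours=1))

summer = datetime(2010, 6, 3, 15, 0, tzinfo=EDT)
winter = datetime(2010, 1, 3, 15, 0, tzinfo=EST)

# timetuple() derives tm_isdst from dst(): non-zero gives 1, zero gives 0.
print(summer.timetuple().tm_isdst)  # 1
print(winter.timetuple().tm_isdst)  # 0
```

    With a constant dst value available, tm_isdst can distinguish the two zones, which the two-argument form cannot do.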

    brettcannon commented 14 years ago

    I'm still leery of supporting any form of DST. A proper DST implementation would need to have some conditional code to account for the datetime object passed into dst, and yet the version you have prototyped doesn't handle it. So a proper timezone supporting DST would still need to subclass any concrete class.

    I still say keep it as simple as possible and let users subclass as needed to add DST support. Subclassing __init__ and dst() is not difficult if you want to add proper DST support, especially if dst() is set to return timedelta(0) and utcoffset() always returns CONSTANT + self.dst().
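A rough sketch of the subclassing approach suggested here, where utcoffset() is a constant plus dst() and only dst() looks at the datetime passed in. The class name `Eastern2010` and its hard-coded transition times are assumptions made for illustration (US Eastern rules for 2010 only); a real implementation would need rules for every year.

```python
from datetime import datetime, timedelta, tzinfo


class Eastern2010(tzinfo):
    """Hypothetical DST-aware zone valid only for the year 2010."""

    STD_OFFSET = timedelta(hours=-5)
    # Naive local wall-clock times of the 2010 transitions (assumed rules).
    DST_START = datetime(2010, 3, 14, 2)
    DST_END = datetime(2010, 11, 7, 2)

    def dst(self, dt):
        if dt is None:
            return timedelta(0)
        naive = dt.replace(tzinfo=None)
        if self.DST_START <= naive < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        # CONSTANT + self.dst(), as described in the comment above.
        return self.STD_OFFSET + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


eastern = Eastern2010()
july = datetime(2010, 7, 1, 12, 0, tzinfo=eastern)
january = datetime(2010, 1, 1, 12, 0, tzinfo=eastern)
```

Here `july.utcoffset()` is -4 hours and `january.utcoffset()` is -5 hours. The sketch deliberately ignores the repeated hour at the end of DST.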

    And just to mention it, the instance attributes you had in your example, Alexander, were not "private". For any final code, make sure you make them private else you are asking for trouble from people starting to rely on those attributes.

    abalkin commented 14 years ago

    On Thu, Jun 3, 2010 at 3:19 PM, Mark Dickinson <report@bugs.python.org> wrote: ..

    Aren't there valid timezones that are offset by more than 12 hours from UTC?

    Indeed, Christmas Island uses UTC+14. (http://en.wikipedia.org/wiki/Kiritimati).

    The most western timezone seems to be UTC-12 used on two uninhabited islands. http://en.wikipedia.org/wiki/Time_zone#Time_zone_as_offsets_from_UTC

    The tzinfo specification requires a [-24, 24] hours range (exclusive of the endpoints):

    """ .. the value returned must be a timedelta object specifying a whole number of minutes in the range -1439 to 1439 inclusive (1440 = 24*60; the magnitude of the offset must be less than one day). """ -- http://docs.python.org/dev/py3k/library/datetime.html#datetime.tzinfo.utcoffset

    I am torn between two options with a slight preference for the first:

    1. Don't do any checking in the constructor and allow any timedelta to be used as an offset. This is the simplest to implement and the most future-proof. For example, it may be desirable to extend [-24, 24] to at least [-99, 99] to allow round-tripping of compliant RFC 3339 timestamps. (Note that I am not suggesting that real-life offsets of more than a day are possible, but once a standard allows impossible values, people tend to abuse them as special markers in their data.)

    2. Require [-24, 24] hours range. This is the letter of the current tzinfo.utcoffset() definition.

    Opinions?

    What do you think
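Option 2 above can be sketched as a small validator. The helper name `check_offset` is hypothetical; the strict inequalities follow the quoted tzinfo.utcoffset() wording that the magnitude must be less than one day.

```python
from datetime import timedelta, timezone

ONE_DAY = timedelta(hours=24)


def check_offset(offset):
    """Hypothetical validator for option 2: reject offsets of a day or more."""
    if not isinstance(offset, timedelta):
        raise TypeError("offset must be a timedelta")
    if not -ONE_DAY < offset < ONE_DAY:
        raise ValueError("offset must be strictly between -24 and 24 hours")
    return offset


# UTC+14 (Kiritimati) is more than 12 hours from UTC but still valid.
line_islands = timezone(check_offset(timedelta(hours=14)), "UTC+14")

try:
    check_offset(timedelta(hours=24))
    rejected = False
except ValueError:
    rejected = True
```

Under option 1 the validator would simply be omitted. For comparison, the `datetime.timezone` class in current CPython releases, as far as I know, raises `ValueError` for an offset of exactly 24 hours.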

    abalkin commented 14 years ago

    Merging in bpo-7584 nosy list.

    abalkin commented 14 years ago

    I'm still leery of supporting any form of DST. A proper DST implementation would need to have some conditional code to account for the datetime object passed into dst, and yet the version you have prototyped doesn't handle it.

    No, any tzinfo implementation where utcoffset(dt) depends on dt is broken because once utcoffset starts to vary with time you can no longer determine a point in time from the local time alone. (In theory, a utcoffset that continuously increases or decreases in time is an exception to this rule, but there is no practical use for those.)

    This limitation is admitted in datetime.tzinfo documentation:

    """ Note that there are unavoidable subtleties twice per year in a tzinfo subclass accounting for both standard and daylight time, at the DST transition points. ... Applications that can’t bear such ambiguities should avoid using hybrid tzinfo subclasses; there are no ambiguities when using UTC, or any other fixed-offset tzinfo subclass (such as a class representing only EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)). """

    (See three paragraphs above http://docs.python.org/dev/py3k/library/datetime.html#strftime-and-strptime-behavior)
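The ambiguity can be made concrete with two fixed-offset zones. In this sketch the zone names and the chosen date (the end of US daylight time on 2010-11-07) are assumptions picked for illustration; it shows two different UTC instants that produce the same local wall-clock reading.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two distinct instants, one hour apart, around the 2010 fall-back transition.
before = datetime(2010, 11, 7, 5, 30, tzinfo=timezone.utc)  # still daylight time
after = datetime(2010, 11, 7, 6, 30, tzinfo=timezone.utc)   # standard time

wall_before = before.astimezone(EDT).replace(tzinfo=None)
wall_after = after.astimezone(EST).replace(tzinfo=None)

# Both read 01:30 on the local clock, so the wall time alone
# cannot tell the two instants apart.
same_wall_time = wall_before == wall_after
```

With the offset attached, as in the fixed-offset EDT and EST objects, the two values remain distinguishable; a single hybrid tzinfo given only the naive 01:30 has no way to choose.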

    In pytz, http://pytz.sourceforge.net/#tzinfo-api, the tzinfo API is extended to add an is_dst flag to the utcoffset(), tzname(), and dst() methods, but since datetime objects do not carry this flag, it is impossible for the datetime module to pass this flag to timezone within datetime.datetime methods, and the datetime module does not know about this flag to begin with.

    To add insult to injury, the extended API still does not solve all the problems: http://pytz.sourceforge.net/#problems-with-localtime.

    So a proper timezone supporting DST would still need to subclass any concrete class.

    No, as I explained above, it is not possible to implement a "proper timezone." I believe most of the frustration with the current tzinfo API stems from the fact that it is not implementable. The correct interface to a timezone database should provide a mapping from (universal time, geographical location) to civil time there and then. A common name for the timezone in use and information about DST being in effect is useful for interoperability but not strictly required.

    This is what I implemented in my localtime() prototype in localtime.py (losing DST information) and datetimeex.py (interoperable with POSIX timetuple-based interfaces and the pytz extended API).

    Note that on systems supporting the extended tm structure (with tm_zone and tm_gmtoff fields), it is possible to implement a localtime() that takes advantage of the full historical timezone information available on the system.
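A minimal sketch of such a localtime(), assuming a Python version whose `time.struct_time` exposes `tm_gmtoff` and `tm_zone`. It is not the attached localtime.py; the `EPOCH` constant and the convention that `utctime` is a naive UTC datetime are assumptions carried over from the snippet earlier in this thread.

```python
import time
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1)  # naive UTC epoch (assumed convention)


def localtime(utctime=None):
    """Return an aware local datetime for a naive UTC datetime (or now)."""
    if utctime is None:
        seconds = time.time()
    else:
        seconds = (utctime - EPOCH).total_seconds()
    tm = time.localtime(seconds)
    # tm_gmtoff and tm_zone describe the offset in effect at that moment,
    # so historical rule changes known to the system are picked up.
    tz = timezone(timedelta(seconds=tm.tm_gmtoff), tm.tm_zone)
    return datetime(*tm[:6], tzinfo=tz)


utc_input = datetime(2010, 6, 3, 19, 19)
local = localtime(utc_input)
```

Converting `local` back with `local.astimezone(timezone.utc)` should recover the original UTC value regardless of the system's configured timezone.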