python / cpython

The Python programming language
https://www.python.org

add cross-platform support for %s strftime-format code #56959

Open bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e opened 13 years ago

bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago
BPO 12750
Nosy @tim-one, @abalkin, @vstinner, @rbtcollins, @bitdancer, @4kir4, @pganssle, @adamwill
Files
  • strftime.patch: patch for strftime("%s")
  • strftime2.patch: rounding problem fixed with math.floor
  • strftime3.patch: more tests and some documentation added.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = 'https://github.com/abalkin'
    closed_at = None
    created_at =
    labels = ['extension-modules', 'easy', 'type-feature', 'library']
    title = 'add cross-platform support for %s strftime-format code'
    updated_at =
    user = 'https://bugs.python.org/DanielOConnor'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'p-ganssle'
    assignee = 'belopolsky'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules', 'Library (Lib)']
    creation =
    creator = "Daniel.O'Connor"
    dependencies = []
    files = ['27231', '35785', '35816']
    hgrepos = []
    issue_num = 12750
    keywords = ['patch', 'easy']
    message_count = 29.0
    messages = ['142095', '142129', '142130', '142131', '142150', '142190', '142244', '142245', '142249', '142250',
                '170808', '221353', '221385', '221386', '221602', '221606', '221620', '221622', '221868', '221872',
                '221873', '221875', '221877', '222030', '222047', '225643', '247372', '270533', '313176']
    nosy_count = 13.0
    nosy_names = ['tim.peters', 'belopolsky', 'vstinner', 'rbcollins', 'r.david.murray', 'santoso.wijaya', 'akira',
                  'bignose', "Daniel.O'Connor", 'mumino', 'shanmbic', 'p-ganssle', 'adamwill']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12750'
    versions = ['Python 3.6']
    ```

    bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago

    It isn't possible to add a timezone to a naive datetime object which means that if you are getting them from some place you can't directly control there is no way to set the TZ.

    eg pywws' DataStore returns naive datetime's which are in UTC. There is no way to set this and hence strftime seems to think they are in local time.

    I can sort of see why you would disallow changing a TZ once set but it doesn't make sense to prevent this for naive DTs.

    Also, utcnow() returns a naive DT whereas it would seem to be more sensible to return it with a UTC TZ.

    bitdancer commented 13 years ago

    In what way does 'replace' not satisfy your need to set the tzinfo?

    As for utcnow, we can't change what it returns for backward compatibility reasons, but you can get a non-naive UTC datetime by doing datetime.now(timezone.utc). (I must admit, however, that at least this morning I can't wrap my head around how that works based on the docs :(.
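For readers on Python 3.2 or later, the difference between an aware and a naive "now" can be seen directly. This is a minimal sketch using only the standard library; the variable names are illustrative and not taken from the thread.

```python
from datetime import datetime, timezone

# Aware "now": tzinfo is attached at creation, so the instant is unambiguous.
aware_now = datetime.now(timezone.utc)

# Dropping tzinfo leaves the same wall-clock fields that utcnow()-style code
# carries around, but nothing on the object records that they are UTC.
naive_now = aware_now.replace(tzinfo=None)

print(aware_now.isoformat())  # carries a +00:00 suffix
print(naive_now.isoformat())  # no offset suffix
```

Any code that later formats or converts `naive_now` has to guess which timezone was meant, which is the root of the confusion discussed in this issue.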

    bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago

    On 15/08/2011, at 23:39, R. David Murray wrote:

    > R. David Murray <rdmurray@bitdance.com> added the comment:
    >
    > In what way does 'replace' not satisfy your need to set the tzinfo?

    Ahh that would work, although it is pretty clumsy since you have to specify everything else as well.

    In the end I used calendar.timegm (which I only found out about after this).

    > As for utcnow, we can't change what it returns for backward compatibility reasons, but you can get a non-naive utc datetime by doing

    That is a pity :(

    > datetime.now(timezone.utc). (I must admit, however, that at least this morning I can't wrap my head around how that works based on the docs :(.

    OK.. I am only using 2.7 so I can't try that :)

    > ---------- nosy: +r.david.murray
    >
    > Python tracker <report@bugs.python.org> <http://bugs.python.org/issue12750>


    bitdancer commented 13 years ago

    Ah. Well, pre-3.2 datetime itself did not generate *any* non-naive datetimes.

    Nor do you need to specify everything for replace. dt.replace(tzinfo=tz) should work just fine.

    bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago

    On 16/08/2011, at 1:06, R. David Murray wrote:

    > R. David Murray <rdmurray@bitdance.com> added the comment:
    >
    > Ah. Well, pre-3.2 datetime itself did not generate *any* non-naive datetimes.
    >
    > Nor do you need to specify everything for replace. dt.replace(tzinfo=tz) should work just fine.

    OK.

    I did try this and it seems broken though..

      In [19]: now = datetime.datetime.utcnow()

      In [21]: now.replace(tzinfo = pytz.utc)
      Out[21]: datetime.datetime(2011, 8, 15, 22, 54, 13, 173110, tzinfo=<UTC>)

      In [22]: datetime.datetime.strftime(now, "%s")
      Out[22]: '1313414653'

      In [23]: now
      Out[23]: datetime.datetime(2011, 8, 15, 22, 54, 13, 173110)

      [ur 8:22] ~ >date -ujr 1313414653
      Mon 15 Aug 2011 13:24:13 UTC

    i.e. it appears that replace() applies the TZ offset to a naive datetime object effectively assuming it is local time rather than un-timezoned (which is what the docs imply to me)

    > ---------- resolution: -> invalid stage: -> committed/rejected status: open -> closed




    bitdancer commented 13 years ago

    OK. At a minimum there is a doc issue here, so I'm reopening.

    abalkin commented 13 years ago

    > i.e. it appears that replace() applies the TZ offset to a naive datetime object effectively assuming it is local time rather than un-timezoned (which is what the docs imply to me)

    I don't understand your issue. The replace method does not assume anything, it just replaces whatever fields you specify with new values. You can replace tzinfo just like any other field, year, month, day, etc while preserving the other fields. I think this is fairly well documented. I think what you are looking for is the astimezone() method which, however may not work well on naive datetime instances simply because a naive instance may be ambiguous in presence of DST. However, if you start with an aware UTC datetime, you should be able to use astimezone() to convert to any local TZ.
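The distinction between replace() and astimezone() described above can be sketched as follows. This assumes Python 3 and, to stay self-contained, uses a fixed +09:30 offset named `adelaide` as a stand-in for a real Australia/Adelaide zone; that name and offset are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed offset; a real zone database entry would also handle DST.
adelaide = timezone(timedelta(hours=9, minutes=30), "ACST")

utc_dt = datetime(2011, 8, 17, 1, 53, 51, tzinfo=timezone.utc)

# replace() only swaps the tzinfo field: same wall clock, different instant.
relabelled = utc_dt.replace(tzinfo=adelaide)

# astimezone() keeps the instant and recomputes the wall-clock fields.
converted = utc_dt.astimezone(adelaide)

print(relabelled.isoformat())  # 2011-08-17T01:53:51+09:30
print(converted.isoformat())   # 2011-08-17T11:23:51+09:30
print(converted == utc_dt)     # True
print(relabelled == utc_dt)    # False
```

So replace(tzinfo=...) is the right tool for labelling a naive value whose zone is already known, and astimezone() is the right tool for converting an aware value to another zone.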

    bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago

    On 17/08/2011, at 10:30, Alexander Belopolsky wrote:

    > Alexander Belopolsky <alexander.belopolsky@gmail.com> added the comment:
    >
    > > i.e. it appears that replace() applies the TZ offset to a naive datetime
    > > object effectively assuming it is local time rather than un-timezoned
    > > (which is what the docs imply to me)
    >
    > I don't understand your issue. The replace method does not assume anything, it just replaces whatever fields you specify with new values. You can replace tzinfo just like any other field, year, month, day, etc while preserving the other fields. I think this is fairly well documented. I think what you are looking for is the astimezone() method which, however may not work well on naive datetime instances simply because a naive instance may be ambiguous in presence of DST. However, if you start with an aware UTC datetime, you should be able to use astimezone() to convert to any local TZ.

    Hmm I see, it would appear the problem lies with strftime().

      [ur 10:34] ~ >ipython-2.7
      Python 2.7.2 (default, Aug 6 2011, 23:46:16)
      Type "copyright", "credits" or "license" for more information.

      IPython 0.10.2 -- An enhanced Interactive Python.
      ?         -> Introduction and overview of IPython's features.
      %quickref -> Quick reference.
      help      -> Python's own help system.
      object?   -> Details about 'object'. ?object also works, ?? prints more.

      In [48]: now = datetime.datetime.utcnow()
      In [49]: nowtz = now.replace(tzinfo = pytz.utc)
      In [50]: nowadl = now.replace(tzinfo = pytz.timezone('Australia/Adelaide'))
      In [51]: now
      Out[51]: datetime.datetime(2011, 8, 17, 1, 53, 51, 451118)
      In [52]: nowtz
      Out[52]: datetime.datetime(2011, 8, 17, 1, 53, 51, 451118, tzinfo=<UTC>)
      In [53]: nowadl
      Out[53]: datetime.datetime(2011, 8, 17, 1, 53, 51, 451118, tzinfo=<DstTzInfo 'Australia/Adelaide' CST+9:30:00 STD>)
      In [54]: now.strftime("%F %r %s")
      Out[54]: '2011-08-17 01:53:51 AM 1313511831'
      In [55]: nowtz.strftime("%F %r %s")
      Out[55]: '2011-08-17 01:53:51 AM 1313511831'
      In [56]: nowadl.strftime("%F %r %s")
      Out[56]: '2011-08-17 01:53:51 AM 1313511831'

      Wed 17 Aug 2011 01:54:52 UTC
      [ur 11:24] ~ >date +%s
      1313546093
      [ur 11:24] ~ >date -ujr date +%s
      Wed 17 Aug 2011 01:54:59 UTC
      [ur 11:24] ~ >date -ujr 1313511831
      Tue 16 Aug 2011 16:23:51 UTC

    i.e. strftime disregards tzinfo and seems to treat the time as LT (I think).

    It certainly doesn't behave the way I'd expect after using strftime(3) et al :)

    abalkin commented 13 years ago

    > it would appear the problem lies with strftime()

    Yes, strftime('%s') ignores tzinfo at the moment. This is not a bug. Support for '%s' format code is incidental and not documented in Python.

    Nevertheless I think this is a good feature request. I am changing the title to make it more explicit.
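Until '%s' honours tzinfo, a portable way to get the epoch value from an aware datetime is to avoid strftime altogether. The sketch below assumes Python 3.3+ (for datetime.timestamp()) and reuses the wall-clock fields from the session above, labelled as UTC.

```python
import calendar
from datetime import datetime, timezone

aware = datetime(2011, 8, 17, 1, 53, 51, 451118, tzinfo=timezone.utc)

# timestamp() respects tzinfo; int() truncation is harmless for post-epoch values.
via_timestamp = int(aware.timestamp())

# calendar.timegm() interprets a time tuple as UTC and never touches the local zone.
via_timegm = calendar.timegm(aware.utctimetuple())

print(via_timestamp)  # 1313546031
print(via_timegm)     # 1313546031
```

Both results are 34200 seconds (9 h 30 min) later than the 1313511831 that strftime("%s") printed in the session above, which matches the Adelaide local offset that the platform call silently applied.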

    bf9d26e4-19d5-4b85-9475-b6cd1d5c5b5e commented 13 years ago

    On 17/08/2011, at 12:42, Alexander Belopolsky wrote:

    > Alexander Belopolsky <alexander.belopolsky@gmail.com> added the comment:
    >
    > > it would appear the problem lies with strftime()
    >
    > Yes, strftime('%s') ignores tzinfo at the moment. This is not a bug. Support for '%s' format code is incidental and not documented in Python.
    >
    > Nevertheless I think this is a good feature request. I am changing the title to make it more explicit.

    OK thanks!

    3e695de2-89f5-4c4a-ac04-3e80e7ab217c commented 12 years ago

    I made a patch for datetime.strftime('%s'). It takes tzinfo into consideration.

    >>> datetime.datetime(1970,1,1).strftime("%s")   
    '-7200'
    
    >>> datetime.datetime(1970,1,1, tzinfo=datetime.timezone.utc).strftime("%s")
    '0'
    
    datetime.date still behaves like a naive datetime.datetime:
    >>> datetime.date(1970,1,1).strftime("%s")
    '-7200'

    abalkin commented 10 years ago

    I would like to hear from others on this feature. One concern that I have is whether it is wise to truncate the fractional seconds part in '%s'. Also, if we support '%s' in strftime we should probably support it in strptime as well.

    7fe5d93b-2a2c-46a0-b5cd-5602c591856a commented 10 years ago

    *If* the support for %s strftime format code is added then it should keep backward compatibility on Linux, OSX: it should produce an integer string with the correct rounding.

    Currently, datetime.strftime delegates to the platform strftime(3) for format specifiers that are not described explicitly. The datetime docs say:

    > The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common. To see the full set of format codes supported on your platform, consult the strftime(3) documentation.

    %s is not defined by C or POSIX, but it is already defined on Linux and BSD, where datetime.now().strftime('%s') can print an integer timestamp:

    > %s is replaced by the number of seconds since the Epoch, UTC (see mktime(3)).

    Otherwise, an unsupported format code is *undefined behavior* (a crash or launching a missile would be valid behavior).

    Support for additional codes on some platforms is explicitly mentioned in the datetime docs; therefore, %s behavior shouldn't change where it is well-defined on a given platform, i.e., datetime.now().strftime('%s') should keep producing an integer string on Linux and BSD.

    '%d' produces the wrong rounding on my machine:

      >>> from datetime import datetime, timezone
      >>> dt = datetime(1969, 1, 1, 0,0,0, 600000, tzinfo=timezone.utc)
      >>> '%d' % dt.timestamp()
      '-31535999'
      >>> dt.astimezone().strftime('%s')
      '-31536000'

    math.floor could be used instead:

      >>> '%d' % math.floor(dt.timestamp())
      '-31536000'

    There is no issue with the round-trip via a float timestamp for datetime.min...datetime.max range on my machine. calendar.timegm could be used to avoid floats if desired:

      >>> import calendar
      >>> calendar.timegm(dt.astimezone(timezone.utc).timetuple())
      -31536000

    Note: dt.utctimetuple() is not used to avoid producing the wrong result silently if dt is a naive datetime object; an exception is raised instead.

    The result is equivalent to time.strftime('%s', dt.astimezone().timetuple()) (+/- date/time range issues).
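The rounding point above can be condensed into a small helper. This is a sketch, not the code from the attached patches: `epoch_seconds` is a hypothetical name, and rejecting naive input is an assumption made here to avoid silently guessing a timezone.

```python
import math
from datetime import datetime, timezone

def epoch_seconds(dt):
    """Hypothetical helper: whole seconds since the epoch, rounded toward -inf."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("expected an aware datetime")
    return math.floor(dt.timestamp())

dt = datetime(1969, 1, 1, 0, 0, 0, 600000, tzinfo=timezone.utc)

print(int(dt.timestamp()))    # -31535999 (truncates toward zero)
print(round(dt.timestamp()))  # -31535999 (rounds to nearest)
print(epoch_seconds(dt))      # -31536000 (floor, as in the Linux output above)
```

Truncation and floor agree for timestamps at or after the epoch, so the difference only shows up for pre-1970 values with a fractional part.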

    ---

    It is not clear what the returned value for %s strptime should be: naive or timezone-aware datetime object and what timezone e.g.,

    The result is an aware datetime object in UTC timezone.

    abalkin commented 10 years ago

    > It is not clear what the returned value for %s strptime should be:

    I would start conservatively and require %z to be used with %s. In this case, we can easily produce aware datetime objects.

    I suspect that in the absence of %z, the most useful option would be to return naive datetime in the local timezone, but that can be added later.

    7fe5d93b-2a2c-46a0-b5cd-5602c591856a commented 10 years ago

    > I suspect that in the absence of %z, the most useful option would be to return naive datetime in the local timezone, but that can be added later.

    Naive datetime in the local timezone may lose information that is contained in the input timestamp:

      >>> import os
      >>> import time
      >>> from datetime import datetime
      >>> import pytz
      >>> os.environ['TZ'] = ':America/New_York'
      >>> time.tzset()
      >>> naive_dt = datetime(2014, 11, 2, 1, 30)
      >>> naive_dt.timestamp()
      1414906200.0
      >>> naive_dt.strftime('%s')
      '1414906200'
      >>> pytz.timezone('America/New_York').localize(naive_dt, is_dst=False).timestamp()
      1414909800.0
      >>> pytz.timezone('America/New_York').localize(naive_dt, is_dst=True).timestamp()
      1414906200.0
      >>> pytz.timezone('America/New_York').localize(naive_dt, is_dst=None)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "~/.virtualenvs/py3.4/lib/python3.4/site-packages/pytz/tzinfo.py", line 349, in localize
          raise AmbiguousTimeError(dt)
      pytz.exceptions.AmbiguousTimeError: 2014-11-02 01:30:00

    The timestamp 1414906200 corresponds to 2014-11-02 01:30:00-04:00, but datetime(2014, 11, 2, 1, 30) alone is ambiguous -- it may correspond to either 1414906200 or 1414909800 if the local timezone is America/New_York.
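The same ambiguity can be reproduced without pytz or a TZ environment variable. The sketch below substitutes two fixed offsets, named `EDT` and `EST` here as illustrative assumptions, for the two sides of the America/New_York transition.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two sides of the DST transition.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

first = datetime.fromtimestamp(1414906200, EDT)   # before clocks go back
second = datetime.fromtimestamp(1414909800, EST)  # after clocks go back

print(first.isoformat())   # 2014-11-02T01:30:00-04:00
print(second.isoformat())  # 2014-11-02T01:30:00-05:00

# Stripping tzinfo makes two different instants indistinguishable.
print(first.replace(tzinfo=None) == second.replace(tzinfo=None))  # True
print(second - first)                                             # 1:00:00
```

Once the offset is dropped, nothing in the naive value says which of the two instants was meant, so a timestamp cannot be recovered reliably from it.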

    It would be nice if datetime.strptime() would allow the round-trip whatever the local timezone is:

      >>> ts = '1414906800'
      >>> datetime.strptime(ts, '%s').strftime('%s') == ts

    It is possible if strptime() returns a timezone-aware datetime object.
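A rough sketch of such a round trip, written with ordinary functions because strptime does not accept '%s'. `parse_epoch` and `format_epoch` are hypothetical stand-ins, and returning an aware UTC datetime is the assumption that makes the round trip independent of the local timezone.

```python
import calendar
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def parse_epoch(text):
    """Hypothetical stand-in for strptime(text, '%s'): returns an aware UTC datetime."""
    return EPOCH + timedelta(seconds=int(text))

def format_epoch(dt):
    """Hypothetical stand-in for strftime('%s') on an aware datetime."""
    return str(calendar.timegm(dt.utctimetuple()))

for ts in ("1414906200", "1414909800", "1414906800"):
    assert format_epoch(parse_epoch(ts)) == ts

print(parse_epoch("1414906200").isoformat())  # 2014-11-02T05:30:00+00:00
```

Both values from the ambiguous hour survive the round trip because the intermediate object carries its offset.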

    3e695de2-89f5-4c4a-ac04-3e80e7ab217c commented 10 years ago

    I added an improved patch that follows akira's explanation of strftime and the rounding problem.

    abalkin commented 10 years ago

    On second thought, I don't think accepting this should be contingent on any decision with respect to strptime.

    abalkin commented 10 years ago

    > rounding problem fixed with math.floor

    Can you explain why math.floor rather than builtin round is the correct function to use?

    7fe5d93b-2a2c-46a0-b5cd-5602c591856a commented 10 years ago

    > Can you explain why math.floor rather than builtin round is the correct function to use?

    To avoid breaking existing scripts that use .strftime('%s') on Linux, OSX, see msg221385:

      >>> from datetime import datetime, timezone
      >>> dt = datetime(1969, 1, 1, 0,0,0, 600000, tzinfo=timezone.utc)
      >>> '%d' % dt.timestamp()
      '-31535999'
      >>> round(dt.timestamp())
      -31535999
      >>> dt.astimezone().strftime('%s') # <-- existing behavior
      '-31536000'
      >>> '%d' % math.floor(dt.timestamp())
      '-31536000'
      >>> import calendar
      >>> calendar.timegm(dt.astimezone(timezone.utc).timetuple())
      -31536000

    abalkin commented 10 years ago

    Here is the simpler demonstration of the "floor" behavior on Linux:

    >>> from datetime import datetime
    >>> datetime.fromtimestamp(-0.1).strftime('%s')
    '-1'
    >>> datetime.fromtimestamp(-1.1).strftime('%s')
    '-2'
    >>> datetime.fromtimestamp(0.1).strftime('%s')
    '0'
    >>> datetime.fromtimestamp(1.1).strftime('%s')
    '1'

    abalkin commented 10 years ago

    Could you please add tests for non-fixed offset timezones? There are several defined in datetimetester.py already.

    abalkin commented 10 years ago

    The patch should update documentation.

    See https://docs.python.org/3.5/library/datetime.html#strftime-and-strptime-behavior

    abalkin commented 10 years ago

    + t = datetime(1969, 1, 1, 0,0,0, 600000, tzinfo=timezone.utc)

    Please add spaces after commas.

    3e695de2-89f5-4c4a-ac04-3e80e7ab217c commented 10 years ago

    more tests and some documentation added.

    7fe5d93b-2a2c-46a0-b5cd-5602c591856a commented 10 years ago

    > %s format code behaviour was undefined and incidental.

    strftime('%s') is not portable but it *is* supported on some platforms, i.e., it is *not* undefined and it is *not* incidental on these platforms. datetime.strftime *delegates* to the platform strftime(3) and some platforms do support the %s format code. See the quote from the datetime docs in msg221385.

    It would be preferable for datetime.strftime to reject format codes that it doesn't support explicitly (as datetime.strptime does), so that datetime.strftime would be portable, but that ship has sailed.

    This issue could be titled: add cross-platform support for %s strftime-format code (and fix its behavior (add support) for timezone-aware datetime objects).

    ---

    If the implementation uses floats to get an integer result, it should have tests for edge cases (datetime.min and datetime.max at least). I don't see such tests; please correct me if I'm wrong.
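One way to sidestep the float concern is to floor with integer timedelta arithmetic. The sketch below is an assumption about how such edge-case checks could look, not the tests from the patch; `exact_epoch_seconds` is a hypothetical helper name.

```python
import math
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_SECOND = timedelta(seconds=1)

def exact_epoch_seconds(dt):
    """Hypothetical helper: floor to whole seconds using integer arithmetic only."""
    return (dt - EPOCH) // ONE_SECOND

earliest = datetime.min.replace(tzinfo=timezone.utc)
latest = datetime.max.replace(tzinfo=timezone.utc)

for label, dt in (("min", earliest), ("max", latest)):
    # The float-based value is printed only for comparison; near datetime.max a
    # double may not hold microsecond precision, so the two columns may differ.
    print(label, exact_epoch_seconds(dt), math.floor(dt.timestamp()))
```

Floor division of one timedelta by another returns an int, so no precision is lost at either end of the supported range.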

    7fe5d93b-2a2c-46a0-b5cd-5602c591856a commented 10 years ago

    bpo-22246 discusses the reverse: strptime('12345', '%s')

    rbtcollins commented 9 years ago

    Moving this back to patch needed: the patch was reviewed by a committer and changes requested.

    abalkin commented 8 years ago

    Given that we have the .timestamp() method, I am not sure this would be a very useful feature, but maybe it is a way to eliminate an attractive nuisance.

    If anyone is still interested in getting this in - please check with python-ideas.

    cdfdacf7-cfdb-4905-b7b7-00cb58fbbe82 commented 6 years ago

    On the "attractive nuisance" angle: I just ran right into this problem, and reported https://bugs.python.org/issue32988 .

    As I suggested there, if Python doesn't try to fix this, I'd suggest it should at least *explicitly* document that using %s is unsupported and dangerous in more than one way (might not work on all platforms, does not do what it should for 'aware' datetimes on platforms where it *does* work). I think explicitly telling people NOT to use it would be better than just not mentioning it. At least for me, when I saw real code using it and that the docs just didn't mention it, my initial thought was "I guess it must be OK, and the docs just missed it out for some reason". If I'd gone to the docs and seen an explicit note that it's not supported and doesn't work right, that would've been much clearer and I wouldn't have had to figure that out for myself :)

    For Python 2, btw, the arrow library might be a suitable alternative to suggest: you can do something like this, assuming you have an aware datetime object called 'awaredate' you want to get the timestamp for:

    import arrow
    ts = arrow.get(awaredate).timestamp

    and it does the right thing.