python / cpython

The Python programming language
https://www.python.org/

datetime needs an "epoch" method #46988

Closed daf46b87-a9e9-4381-bf23-8c373c9135e6 closed 13 years ago

daf46b87-a9e9-4381-bf23-8c373c9135e6 commented 16 years ago
BPO 2736
Nosy @malemburg, @tim-one, @jribbens, @amauryfa, @mdickinson, @abalkin, @pitrou, @catlee, @vstinner, @bitdancer
Files
  • add-datetime-totimestamp-method.diff: Implementation of datetime.datetime.timetuple and tests.
  • add-datetime-totimestamp-method-docs.diff
  • datetime_totimestamp-3.patch
  • issue2736-doc.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = 'https://github.com/abalkin'
    closed_at =
    created_at =
    labels = ['type-feature', 'docs']
    title = 'datetime needs an "epoch" method'
    updated_at =
    user = 'https://github.com/tebeka'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'belopolsky'
    assignee = 'belopolsky'
    closed = True
    closed_date =
    closer = 'belopolsky'
    components = ['Documentation']
    creation =
    creator = 'tebeka'
    dependencies = []
    files = ['10251', '10256', '12329', '21565']
    hgrepos = []
    issue_num = 2736
    keywords = ['patch']
    message_count = 67.0
    messages = ['66045', '66140', '66532', '66539', '66601', '66610', '75723', '75899', '75900', '75902', '75903', '75904', '75912', '76003', '76324', '76327', '76329', '76331', '76332', '76340', '76344', '76345', '76351', '76352', '77650', '77651', '99545', '103875', '106229', '106230', '106249', '106251', '106252', '106254', '106255', '124197', '124203', '124204', '124225', '124230', '124231', '124237', '124245', '124248', '124252', '124255', '124256', '124257', '124259', '132695', '132697', '132818', '132977', '132994', '133008', '133009', '133011', '133037', '133039', '133053', '133056', '133058', '133072', '133207', '133245', '134395', '162533']
    nosy_count = 23.0
    nosy_names = ['lemburg', 'tim.peters', 'ping', 'jribbens', 'guettli', 'amaury.forgeotdarc', 'mark.dickinson', 'davidfraser', 'belopolsky', 'pitrou', 'andersjm', 'catlee', 'vstinner', 'tomster', 'werneck', 'hodgestar', 'Neil Muller', 'erik.stephens', 'steve.roberts', 'r.david.murray', 'vivanov', 'python-dev', 'Jay.Taylor']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue2736'
    versions = ['Python 3.3']
    ```

    daf46b87-a9e9-4381-bf23-8c373c9135e6 commented 16 years ago

    If you try to convert datetime objects to seconds since epoch and back it will not work since the microseconds get lost:

    >>> from datetime import datetime
    >>> from time import mktime
    >>> dt = datetime(2008, 5, 1, 13, 35, 41, 567777)
    >>> seconds = mktime(dt.timetuple())
    >>> datetime.fromtimestamp(seconds) == dt
    False

    Current fix is to do
    >>> seconds += (dt.microsecond / 1000000.0)
    >>> datetime.fromtimestamp(seconds) == dt
    True

    4bdb4492-5f48-4dc2-b86b-f6f343333818 commented 16 years ago

    That's expected, as mktime is just a thin wrapper over libc mktime() and it does not expect microseconds. Changing time.mktime doesn't seem to be an option, so the best alternative is to implement a method in the datetime type. Is there real demand for C code implementing this to justify it?

    2eadb908-4a42-43e8-b4f5-d43cea0e9806 commented 16 years ago

    Attached a patch which adds a .totimestamp(...) method to datetime.datetime and tests for it.

    The intention is that the dt.totimestamp(...) method is equivalent to: mktime(dt.timetuple()) + (dt.microsecond / 1000000.0)
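
For readers following along, the stated equivalence can be written as a plain function. This is only a minimal sketch, not the attached patch: the name `to_timestamp` is hypothetical, and a naive `dt` is assumed to be in local time, which is what `mktime()` expects.

```python
from datetime import datetime
from time import mktime


def to_timestamp(dt):
    """Return seconds since the epoch as a float, keeping microseconds.

    Assumes dt is a naive datetime expressed in local time.
    """
    # mktime() only sees whole seconds, so the fraction is added back by hand.
    return mktime(dt.timetuple()) + dt.microsecond / 1000000.0


dt = datetime(2008, 5, 1, 13, 35, 41, 567777)
# Compare the round trip through fromtimestamp() with the original value.
print(datetime.fromtimestamp(to_timestamp(dt)) == dt)
```

The result is a float, so the precision concerns raised later in the thread apply to it as well.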

    2eadb908-4a42-43e8-b4f5-d43cea0e9806 commented 16 years ago

    Patch adding documentation for datetime.totimestamp(...).

    daf46b87-a9e9-4381-bf23-8c373c9135e6 commented 16 years ago

    I think the name is not good, should be "toepoch" or something like that.

    9c8dc725-7460-4152-b710-dcece18f0a9a commented 16 years ago

    datetime has fromtimestamp already, so using totimestamp keeps naming consistency (see toordinal and fromordinal).

    vstinner commented 15 years ago

    See also bpo-1673409

    vstinner commented 15 years ago

    I like the method, but I have some comments about the new method:

    I wrote a similar patch before reading add-datetime-totimestamp-method.diff which does exactly the same... I attach my patch but both should be merged.

    vstinner commented 15 years ago

    Here is a merged patch of the three patches. Except the C implementation of datetime_totimestamp() (written by me), all code is written by hodgestar.

    abalkin commented 15 years ago

    I would like to voice my opposition to the totimestamp method.

    Representing time as a float is a really bad idea (originated at Microsoft as I have heard). In addition to the usual numeric problems when dealing with the floating point, the resolution of the floating point timestamp varies from year to year making it impossible to represent high resolution historical data.

    In my opinion, time.time() returning a float and datetime.fromtimestamp() taking a float are both design mistakes, and adding a totimestamp that produces a float will further promote a bad practice.

    I would not mind integer based to/from timestamp methods taking and producing seconds or even (second, microsecond) tuples, but I don't think changing fromtimestamp behavior is an option.
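
The resolution argument can be checked numerically. The sketch below is an illustration only: it assumes Python 3.9 or later for `math.ulp`, and the helper `roundtrips_microsecond` is a hypothetical name, not part of any patch in this issue.

```python
import math

# Spacing between adjacent doubles (in seconds) at two timestamp magnitudes.
# math.ulp() requires Python 3.9 or later.
for label, ts in [("late 2008", 1227550070.0), ("about year 5138", 1e11)]:
    print(label, math.ulp(ts))


def roundtrips_microsecond(seconds, microsecond):
    """True if the microsecond field survives a trip through a float."""
    as_float = seconds + microsecond / 1e6
    return round((as_float - seconds) * 1e6) == microsecond


print(roundtrips_microsecond(1227550070, 216762))
print(roundtrips_microsecond(100000000000, 1))
```

Near the 2008 timestamp the spacing is below one microsecond, while at the larger magnitude it is above one microsecond, so the available resolution depends on how far the value is from the epoch.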

    vstinner commented 15 years ago

    On Saturday 15 November 2008 02:15:30, Alexander Belopolsky wrote:

    I don't think changing fromtimestamp behavior is an option.

    It's too late to break the API (Python3 is in RC stage ;-)), but we can create new methods like:

    datetime.fromepoch(seconds, microseconds=0)    # (int/long, int)
    datetime.toepoch() -> (seconds, microseconds)  # (int/long, int)

    abalkin commented 15 years ago

    On Fri, Nov 14, 2008 at 8:37 PM, STINNER Victor <report@bugs.python.org> wrote:

    .. but we can create new methods like: datetime.fromepoch(seconds, microseconds=0) # (int/long, int)

    While 1970 is the most popular epoch, I've seen 1900, 2000 and even 2035 (!) being used as well. Similarly, nanoseconds are used in high resolution time sources at least as often as microseconds. This makes fromepoch() ambiguous and it is really unnecessary because it can be written as epoch + timedelta(0, seconds, microseconds).

    datetime.toepoch() -> (seconds, microseconds) # (int/long, int)

    I would much rather have divmod implemented as you suggested in bpo-2706 . Then toepoch is simply

    from datetime import timedelta

    def toepoch(d):
        # d is a timedelta measured from the chosen epoch
        x, y = divmod(d, timedelta(0, 1))
        return x, y.microseconds

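
Both directions of this suggestion can be put side by side. This is a minimal sketch under stated assumptions: the Unix epoch is chosen, naive datetimes are taken to be UTC, the names `from_epoch` and `to_epoch` are hypothetical, and it needs a Python version in which `divmod()` works on two timedeltas.

```python
from datetime import datetime, timedelta

# Assumption: the Unix epoch, with naive datetimes taken to be UTC.
EPOCH = datetime(1970, 1, 1)


def from_epoch(seconds, microseconds=0):
    """Build a naive UTC datetime from integer seconds and microseconds."""
    return EPOCH + timedelta(0, seconds, microseconds)


def to_epoch(dt):
    """Split a naive UTC datetime into (seconds, microseconds) since EPOCH."""
    # divmod() on two timedeltas gives an int quotient and a timedelta
    # remainder; the remainder is non-negative, so microseconds is 0..999999.
    seconds, remainder = divmod(dt - EPOCH, timedelta(seconds=1))
    return seconds, remainder.microseconds


stamp = to_epoch(datetime(2008, 11, 24, 18, 7, 50, 216762))
print(stamp)               # (1227550070, 216762)
print(from_epoch(*stamp))  # 2008-11-24 18:07:50.216762
```

Only integers are involved, so the pair round-trips exactly, including for instants before the epoch.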
    vstinner commented 15 years ago

    On Saturday 15 November 2008 04:17:50, Alexander Belopolsky wrote:

    it is really unnecessary because it can be written as epoch + timedelta(0, seconds, microseconds).

    I tried yesterday and it doesn't work!

    datetime.datetime(1970, 1, 1, 1, 0)
    >>> t1 = epoch + timedelta(seconds=-1660000000)
    >>> t2 = datetime.fromtimestamp(-1660000000)
    >>> t2
    datetime.datetime(1917, 5, 26, 1, 53, 20)
    >>> t1 - t2
    datetime.timedelta(0)
    >>> t2 = datetime.fromtimestamp(-1670000000)
    >>> t2
    datetime.datetime(1917, 1, 30, 7, 6, 40)
    >>> t1 = epoch + timedelta(seconds=-1670000000)
    >>> t1 - t2
    datetime.timedelta(0, 3600)

    We lost an hour during the 1st World War :-)

    Whereas my implementation using mktime() works:

    -1670000000.0

    a558875e-7079-4932-94fa-d28ec4c10cc1 commented 15 years ago

    Any thoughts to time zone/DST handling for naive datetime objects? E.g. suppose the datetime object was created by .utcnow or .utcfromtimestamp.

    For aware datetime objects, I think the time.mktime(dt.timetuple()) approach doesn't work; the tz info is lost in the conversion to time tuple.

    9a491559-3d98-4669-84da-7c6a0a5d2ae2 commented 15 years ago

    ----- "Alexander Belopolsky" <report@bugs.python.org> wrote:

    Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:

    I would like to voice my opposition to the totimestamp method.

    Representing time as a float is a really bad idea (originated at Microsoft as I have heard). In addition to the usual numeric problems when dealing with the floating point, the resolution of the floating point timestamp varies from year to year making it impossible to represent high resolution historical data.

    In my opinion, time.time() returning a float and datetime.fromtimestamp() taking a float are both design mistakes, and adding a totimestamp that produces a float will further promote a bad practice.

    The point for me is that having to interact with Microsoft systems that require times means that the conversions have to be done. Is it better to have everybody re-implement this, with their own bugs, or to have a standard implementation? I think it's clearly better to have it as a method on the object. Of course, we should put docs in describing the pitfalls of this approach...

    abalkin commented 15 years ago

    On Mon, Nov 24, 2008 at 9:04 AM, David Fraser <report@bugs.python.org> wrote: ...

    The point for me is that having to interact with Microsoft systems that require times means that the conversions have to be done.

    I did not see the "epoch" proposal as an interoperability with Microsoft systems feature. If this is the goal, a deeper analysis of the Microsoft standards is in order. For example, what is the valid range of the floating point timestamp? What is the range for which fromepoch (float to datetime) translation is valid? For example, if all floats are valid timestamps, then fromepoch can be limited to +/- 2**31 or to a smaller range where a float has enough precision to roundtrip microseconds.

    Is it better to have everybody re-implement this, with their own bugs, or to have a standard implementation?

    As far as I know, interoperability with Microsoft systems requires re-implementation of their bugs many of which are not documented. For example, OOXML requires that 1900 be treated as a leap year at least in some cases. When you write your own implementation, at least you have the source code to your own bugs.

    I think it's clearly better to have it as a method on the object. Of course, we should put docs in describing the pitfalls of this approach...

    Yes, having a well documented high resolution "time since epoch" to "local datetime" method in the datetime module is helpful if non-trivial timezones (such as the one Victor lives in) are supported. However, introducing floating point pitfalls into the already overcomplicated realm of calendar calculations would be a mistake.

    I believe the correct approach would be to extend fromtimestamp (and utcfromtimestamp) to accept a (seconds, microseconds) tuple as an alternative (and in addition) to the float timestamp. Then totimestamp can be implemented to return such a tuple, so that fromtimestamp(totimestamp(dt)) == dt for any datetime dt and totimestamp(fromtimestamp((s, us))) == (s, us) for any s and us within the datetime valid range (note that s will have to be a long integer to achieve that).

    In addition exposing the system gettimeofday in the time module to produce (s, us) tuples may be helpful to move away from float timestamps produced by time.time(), but with totimestamp as proposed above that would be equivalent to datetime.now().totimestamp().

    vstinner commented 15 years ago

    About the timestamp, there are many formats:

    (a) UNIX: 32 bits signed integer, number of seconds since the 1st january 1970.

    (b) UNIX64: 64 bits signed integer, number of seconds since the 1st january 1970

    (c) UNIX: 32 bits unsigned integer, number of seconds since the 1st january 1904

    (d) UUID60: 60 bits unsigned integer, number of 1/10 microseconds since the 15th october 1582

    (e) Win64: 64 bits unsigned integer, number of 1/10 microseconds since the 1st january 1601

    (f) MSDOS DateTime or TimeDate: bitfield with 16 bits for the date and 16 bits for the time. Time precision is 2 seconds, year is in range [1980; 2107]

    vstinner commented 15 years ago

    Timedelta formats:

    (a) Win64: 64 bits unsigned integer, number of 1/10 microsecond

    (b) 64 bits float, number of seconds

    Other file formats use multiple numbers to store a duration:

    [AVI video]

    [WAV audio]

    [Ogg Vorbis]

    abalkin commented 15 years ago

    That's an impressive summary, but what is your conclusion? I don't see any format that will benefit from a subsecond timedelta.totimestamp(). Your examples have either multisecond or submicrosecond resolution.

    On Mon, Nov 24, 2008 at 11:00 AM, STINNER Victor <report@bugs.python.org> wrote:

    STINNER Victor <victor.stinner@haypocalc.com> added the comment:

    Timedelta formats:

    (a) Win64: 64 bits unsigned integer, number of 1/10 microsecond

    • file format: Microsoft Word document (.doc), ASF video (.asf)

    (b) 64 bits float, number of seconds

    • file format: AMF metadata used in Flash video (.flv)

    Other file formats use multiple numbers to store a duration:

    [AVI video]

    • 3 integers (32 bits unsigned): length, rate, scale
    • seconds = length / (rate / scale)
    • (seconds = length * scale / rate)

    [WAV audio]

    • 2 integers (32 bits unsigned): us_per_frame, total_frame
    • seconds = total_frame * (1000000 / us_per_frame)

    [Ogg Vorbis]

    • 2 integers: sample_rate (32 bits unsigned), position (64 bits unsigned)
    • seconds = position / sample_rate

    vstinner commented 15 years ago

    Ooops, timestamp (c) is the *Mac* timestamp: seconds since the 1st january 1904.

    what is your conclusion?

    Hum, it's maybe not possible to choose between integer and float. Why not supporting both? Example:

    Attached file (timestamp.py) is a module to import/export timestamp in all listed timestamp formats. It's written in pure Python.

    >>> import timestamp
    >>> from datetime import datetime
    >>> now = datetime.now()
    >>> now
    datetime.datetime(2008, 11, 24, 18, 7, 50, 216762)
    
    >>> timestamp.exportUnix(now)
    1227550070
    >>> timestamp.exportUnix(now, True)
    1227550070.2167621
    >>> timestamp.exportMac(now)
    3310394870L
    >>> timestamp.exportWin64(now)
    128720236702167620L
    >>> timestamp.exportUUID(now)
    134468428702167620L
    
    >>> timestamp.importMac(3310394870)
    datetime.datetime(2008, 11, 24, 18, 7, 50)
    >>> timestamp.importUnix(1227550070)
    datetime.datetime(2008, 11, 24, 18, 7, 50)
    >>> timestamp.importUnix(1227550070.2167621)
    datetime.datetime(2008, 11, 24, 18, 7, 50, 216762)

    It supports int and float types for import and export.
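
The attached timestamp.py is not reproduced in this thread. As a rough illustration of the epoch arithmetic behind the export calls shown above, here is a hypothetical sketch: the function names are invented, naive datetimes are assumed to be UTC, and the epochs are the ones listed earlier in the thread.

```python
from datetime import datetime, timedelta

# Epochs of the formats listed earlier in the thread (naive, taken as UTC).
UNIX_EPOCH = datetime(1970, 1, 1)
MAC_EPOCH = datetime(1904, 1, 1)
WIN64_EPOCH = datetime(1601, 1, 1)
UUID_EPOCH = datetime(1582, 10, 15)

SECOND = timedelta(seconds=1)
MICROSECOND = timedelta(microseconds=1)


def export_seconds(dt, epoch):
    """Whole seconds between epoch and dt (floor division)."""
    return (dt - epoch) // SECOND


def export_ticks(dt, epoch):
    """Number of 1/10-microsecond ticks between epoch and dt."""
    return ((dt - epoch) // MICROSECOND) * 10


now = datetime(2008, 11, 24, 18, 7, 50, 216762)
print(export_seconds(now, UNIX_EPOCH))  # 1227550070
print(export_seconds(now, MAC_EPOCH))   # 3310394870
print(export_ticks(now, WIN64_EPOCH))   # 128720236702167620
print(export_ticks(now, UUID_EPOCH))    # 134468428702167620
```

With the same input, this integer arithmetic gives the same numbers as the session above.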

    abalkin commented 15 years ago

    On Mon, Nov 24, 2008 at 12:13 PM, STINNER Victor <report@bugs.python.org> wrote: ..

    Hum, it's maybe not possible to choose between integer and float. Why not supporting both? Example:

    • totimestamp()->int: truncate microseconds
    • totimestamp(microseconds=True)->float: with microseconds

    I would still prefer totimestamp()->(int, int) returning (sec, usec) tuple. The important benefit is that such totimestamp() will not lose information and will support more formats than either of your ->int or ->float variants. The ->int can then be spelt simply as totimestamp()[0] and on systems with numpy (which is likely for users that deal with floats a lot), totimestamp(microseconds=True) is simply dot([1, 1e-6], totimestamp()). (and s,us = totimestamp(); return s + us * 1e-6 is not that hard either.)

    vstinner commented 15 years ago

    > Hum, it's maybe not possible to choose between integer and float. Why not supporting both? Example:
    > - totimestamp()->int: truncate microseconds
    > - totimestamp(microseconds=True)->float: with microseconds

    I would still prefer totimestamp()->(int, int) returning (sec, usec) tuple. The important benefit is that such totimestamp() will not lose information

    Right, I prefer your solution ;-)

    abalkin commented 15 years ago

    On Mon, Nov 24, 2008 at 12:34 PM, STINNER Victor <report@bugs.python.org> wrote: ..

    > I would still prefer totimestamp()->(int, int) returning (sec, usec) tuple. The important benefit is that such totimestamp() will not lose information

    Right, I prefer your solution ;-)

    Great! What do you think about extending fromtimestamp(timestamp[, tz]) and utcfromtimestamp(timestamp) to accept a tuple for the timestamp?

    Also, are you motivated enough to bring this up on python-dev to get a community and BDFL blessings? I think this has a chance to be approved.

    9a491559-3d98-4669-84da-7c6a0a5d2ae2 commented 15 years ago

    ----- "STINNER Victor" <report@bugs.python.org> wrote:

    STINNER Victor <victor.stinner@haypocalc.com> added the comment:

    Timedelta formats:

    (a) Win64: 64 bits unsigned integer, number of 1/10 microsecond

    • file format: Microsoft Word document (.doc), ASF video (.asf)

    (b) 64 bits float, number of seconds

    • file format: AMF metadata used in Flash video (.flv)

    There are also the PyWinTime objects returned by PythonWin COM calls, which are basically FILETIMEs. I don't have time to get the details now, but I recently submitted a patch to make them work with milliseconds - see http://sourceforge.net/tracker/index.php?func=detail&aid=2209864&group_id=78018&atid=551954 (yes, I know this is a bit off-topic here)

    vstinner commented 15 years ago

    belopolsky will be happy to see this new version of my patch:

    vstinner commented 15 years ago

    About mktime() -> -1: see the bpo-1726687 (I found the fix in this issue).

    Next job will be to patch datetime.(utc)fromtimestamp() to support (int, int). I tried to write such patch but it's not easy because fromtimestamp() will support: int, long, float, (int, int), (int, long), (long, int) and (long, long). And I don't know if a "long" value can be converted to "time_t".

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    Victor,

    As you explain in your own documentation, the proposed method is equivalent to (time.mktime(self.timetuple()), self.microsecond), so all it does is replacing a less than a one-liner. Moreover, I am not sure time.mktime(self.timetuple()) is something that people would want to do with a TZ-aware datetime. If the tzinfo of the datetime object does not match the system TZ used by mktime, the result will be quite misleading.

    On the patch itself:

    1. See my comment at bpo-1726687 about the tm_wday == 1 typo.

    2. I don't think time_t to long cast is safe on all platforms.

    mdickinson commented 14 years ago

    Close bpo-1673409 as a duplicate of this one; combining nosy lists.

    vstinner commented 14 years ago

    As you explain in your own documentation, the proposed method is equivalent to (time.mktime(self.timetuple()), self.microsecond), so all it does is replacing a less than a one-liner.

    A one-liner, but a horrible one-liner :-) I don't like mixing the datetime and time modules. I prefer to use only datetime; I prefer its API.

    ... If the tzinfo of the datetime object does not match the system TZ used by mktime, the result will be quite misleading.

    Can you suggest a possible fix to take care of the timezone information? I don't know how to use that.

    pitrou commented 14 years ago

    I agree with Victor that the APIs need improving, even if it involves providing obvious replacements of obscure one-liners. As an occasional user of datetime and time modules, I have too often wanted to curse those limited, awkwardly inconsistent APIs.

    Just my 2 seconds of course :-)

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    On Fri, May 21, 2010 at 7:26 AM, STINNER Victor <report@bugs.python.org> wrote: ..

    > ... If the tzinfo of the datetime object does not match the system TZ used by mktime, the result will be quite misleading.

    Can you suggest a possible fix to take care of the timezone information? I don't know how to use that.

    I believe it should be something like this:

    from claendar import timegm
    def datetime_totimestamp(dt):
        return (timegm(dt.utctimetuple()), dt.microsecond)

    Note the following comment in the documentation for tzinfo.fromutc(): "An example of a time zone the default fromutc() implementation may not handle correctly in all cases is one where the standard offset (from UTC) depends on the specific date and time passed, which can happen for political reasons. The default implementations of astimezone() and fromutc() may not produce the result you want if the result is one of the hours straddling the moment the standard offset changes." I have not tested the above code and it may not work for non-trivial time-zones.

    Still a few questions remain:

    1. Should absence of tzinfo imply local timezone or UTC?
    2. Given that datetime.fromtimestamp() takes an optional tz argument, should totimestamp() do the same and use given tz for naive datetime objects?
    3. Should there be a toutctimestamp()?

    I believe at this stage we need a python implementation of a prototype answering these questions and a unit test that would demonstrate how the prototype would work with nontrivial timezones.
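
As a starting point for such a prototype, here is a runnable form of the timegm-based suggestion. It is a sketch only: it answers question 1 by treating a naive datetime as UTC, which is simply what `utctimetuple()` does, and it uses a fixed-offset `datetime.timezone` as the test zone, so it says nothing about zones with DST rules.

```python
from calendar import timegm
from datetime import datetime, timedelta, timezone


def datetime_totimestamp(dt):
    """Return (seconds, microseconds) since 1970-01-01 UTC.

    utctimetuple() converts an aware datetime to UTC first; a naive
    datetime is passed through unchanged, i.e. treated as UTC.
    """
    return timegm(dt.utctimetuple()), dt.microsecond


# A fixed-offset zone stands in for a "trivial" tzinfo here.
plus_two = timezone(timedelta(hours=2))
aware = datetime(2010, 5, 21, 13, 26, 0, 250000, tzinfo=plus_two)
naive_utc = datetime(2010, 5, 21, 11, 26, 0, 250000)

print(datetime_totimestamp(aware))
print(datetime_totimestamp(aware) == datetime_totimestamp(naive_utc))
```

The aware value and its naive UTC equivalent map to the same pair, which is the property a unit test for non-trivial zones would also need to check around offset changes.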

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    On Fri, May 21, 2010 at 7:37 AM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    I agree with Victor that the APIs need improving, even if it involves providing obvious replacements of obscure one-liners.

    While I agree that the datetime API can be improved, I don't think Victor's proposal does that. The advantage of an obscure one-liner is that it is obvious what it does, particularly for someone with a C/UNIX background. dt.totimestamp() may be easier to write, but it is entirely non-obvious what it will return. One would expect that dt.totimestamp() is the inverse of datetime.fromtimestamp(timestamp), but in timezones with daylight saving adjustments such an inverse may not always exist. (01:59 AM may be followed by 02:00 AM or by 01:00 AM, so on changeover days datetime(y, m, d, 1, 30).totimestamp() is either ambiguous or undefined.) As I suggested in my previous comment, this problem can be resolved, but we are not there yet.

    As an occasional user of datetime and time modules, I have too often wanted to curse those limited, awkwardly inconsistent APIs.

    Yes, it would be ideal if a user of datetime module would not need to reach to other modules for date/time calculations. See also <http://bugs.python.org/issue6280>. Do you have other examples of this sort?

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    On Fri, May 21, 2010 at 11:20 AM, Alexander Belopolsky <report@bugs.python.org> wrote: ..

    I believe it should be something like this:

    from claendar import timegm

    s/claendar/calendar/, of course.

    pitrou commented 14 years ago

    The advantage of an obscure one-liner is that it is obvious what it does, particularly for someone with a C/UNIX background.

    Well, I would argue that the C/Unix legacy in terms of dates and times isn't an example to follow. Python does not force you to use strcat() to concatenate strings, either ;)

    But besides, the issue is more how people are supposed to invent that one-liner, let alone remember it easily. Perhaps adding it in the documentation would be a good middle ground, if you think it shouldn't be added to the stdlib.

    Do you have other examples of this sort?

    Well, for example, the datetime module encourages you to use "aware" datetime objects (rather than so-called "naive" objects), but there isn't a single facility to do so. You must reinvent a whole timezone class from scratch.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 14 years ago

    On Fri, May 21, 2010 at 12:20 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    Well, for example, the datetime module encourages you to use "aware" datetime objects (rather than so-called "naive" objects), but there isn't a single facility to do so. You must reinvent a whole timezone class from scratch.

    This is partially addressed by bpo-5094, "datetime lacks concrete tzinfo impl. for UTC". A more ambitious project would be to add pytz to stdlib. I believe I've seen this idea discussed and rejected, but I am not able to find a link to an appropriate thread now. A half-way project would be to add LocalTimezone, given as an example in http://docs.python.org/dev/py3k/library/datetime.html, in addition to the UTC timezone. Any takers?

    42a80271-bbe2-4d37-b097-2bfedf49ce53 commented 13 years ago

    I'm very disappointed by the outcome of this discussion.

    You are committing the biggest sin of modern times - instead of promoting the obtaining and usage of knowledge to solve things, you place restrictions to force the dumbheads into not doing mistakes. The big problem with that is that you can never foresee all usecases and all possible mistakes, thus you will always be sorrily bitten by dumbheads. What does that make of you?

    Let me present you a situation - I have a system that passes data via JSON, and a datetime object is not JSON serializable. For a few other reasons, like the epoch and float secs since epoch being de facto standard, and the fact that I absolutely make sure at-the-source that my timestamps are UTC and lack zone awareness, and the fact that I'm not going to display those, but only use them for comparison, and that I'm not going to do historical things and calculations and I don't actually need nanosecond precision, just a tenth of the second, and I'm fine with always using the '<' and '>', not the '==', and the fact that 90% of the cases when I use datetimes I have exactly the same requirements and it has always been working fine for me - I choose the lightweight float representation at the one side of the system. In the SQL DB I use tz unaware timestamps, not floats and my DB access layer returns datetime objects and I prefer them at this end of the system. So I only need to serialize the datetime object. Well, as a matter of fact I have a JSON object serialization already in place for some of my objects, but I do not need that for tz unaware datetimes. So I look for a method that makes a float from a datetime, which I'm used to in PHP, Java, .NET, C, SQL and you name it. And I'm 2 hours into reading about time, datetime and calendar modules and I still haven't even invented the obscure time.mktime(dt.timetuple())+dt.microsecond*1e-6. And to even think that this creates a timetuple internally? I hate it instantly and I dismiss the possibility that the API could be so wrong and I keep searching -> on the internets -> which brings me here where all my illusions are finally buried into the dust.

    2 Hours for something, that only needs a few warning lines in the docs? Ok, the ultimately right thing is to actually serialize the datetime object and rework my other end of the system to use dt instead of float .. maybe .. but not now - now I'm only testing an idea for something completely different and I only need faithful and dutiful Python to give me a float from datetime so I can check something. I love Python for being simple, logical and consistent and for giving me the tools and not telling me what to do with them. Not today ... Today Python goes - 'Here is your hammer, but you can not use it to hit straight down. If you hit straight down, and you are using a forge, and you miss your object and hit the forge instead, the hammer could ricochet and hit you back on the forehead, so you can't use it that way. As a matter of fact, there is a gyroscopic sensor embedded in the handle of the hammer and if you try to hit with an angle that is close to 90 degrees, it will detach the head of the hammer from the handle and prevent you from eventually endangering yourself' and I'm like 'WTF??! I'm nailing a nail into a wooden plank!'

    Now I'm going to use the obscure one-liner and hate it, because it is simply wrong and only someone that doesn't care about implementation details might think it equal to a proper solution. The worst thing is, that I learned today, that if I ever need operations with tz aware dates and time intervals in Python, I should actually send an SQL query for that, because my DB has a clean, simple and COMPLETE date/time API that works seamlessly. Yours is a jungle and I see you being asked to include a ready made patch to output a float from a dt, to which you respond by adding a localtime() method 2 years later. You seriously think, that bpo-9527 solves this? I don't even see a connection.

    With bpo-9527 in the python library I would be exactly what I am now - overly frustrated and with the exactly same amount of time completely lost into studying a bunch of tools only to realize that I should avoid using them at all costs.

    I'm sorry if I offend somebody by posting this emotional message, I just try to give you another point of view - don't put restrictions and hide the reasoning. Instead, support the thing that is widespread and advise that in certain conditions there are better things to do. And if it doesn't work for some edge cases, or even for half the cases - place a well elaborated warning. Then if programmers still make the mistake - well, let them learn by it. 'Cause that's the way people learn .. they make mistakes. By preventing them from making the mistake, you actually rob them of learning.

    bitdancer commented 13 years ago

    Alexander, I agree with Velko in that it isn't obvious to me how the addition of localtime would answer the desire expressed in this issue. It addresses Antoine's complaint about aware datetimes, but I don't see that it does anything for the "conversion to epoch based timestamp" issue. That is at the very least a documentation issue, since IMO we should be providing our users with the tools they need to interoperate with the systems they need to interoperate with.

    Velko: on the other hand, given Victor's research, I don't see float seconds since an epoch appearing anywhere as a standard. Where do you see this being used as a standard? I also don't understand your complaint about the fact that the one-liner creates a timetuple. datetime stores the date and time information as discrete fields, so generating a timetuple is a natural conversion path.

    Obviously one could avoid the creation of a Python tuple by calling the C mktime directly in the C code, as has been proposed. I don't see, myself, what would be so bad about providing a 'to_crt_timestamp' method that would, in essence, be the kind of light wrapper around the system API that we provide in so many other places in Python.

    pitrou commented 13 years ago

    Velko: on the other hand, given Victor's research, I don't see float seconds since an epoch appearing anywhere as a standard.

    Well, given that we already have fromtimestamp(), this sounds like a poor argument against a totimestamp() method (or whatever it gets called).

    42a80271-bbe2-4d37-b097-2bfedf49ce53 commented 13 years ago

    on the other hand, given Victor's research, I don't see float seconds since an epoch appearing anywhere as a standard. Where do you see this being used as a standard?

    Yes, I didn't mean standard as in RFCed and recommended and dominant, sorry if it sounded that way. I meant just that it is quite common in many places, big and small.

    I also don't understand your complaint about the fact that the one-liner creates a timetuple. datetime stores the date and time information as discrete fields, so generating a timetuple is a natural conversion path.

    Well, the timetuple is not a tuple, but an object filled with attributes. It contains a few more than are required for this conversion and it doesn't contain one that is required. Therefore I really see that as an inelegant and ineffective way to do the conversion.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 13 years ago

    On Fri, Dec 17, 2010 at 9:18 AM, R. David Murray <report@bugs.python.org> wrote:

    R. David Murray <rdmurray@bitdance.com> added the comment:

    Alexander, I agree with Velko in that it isn't obvious to me how the addition of localtime would answer the desire expressed in this issue.

    Conversion of UTC datetime to time stamp is trivial:

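    from datetime import datetime
    EPOCH = datetime(1970, 1, 1)  # naive datetime, interpreted as UTC
    def timestamp(t):
        # t: naive datetime that is already expressed in UTC
        return (t - EPOCH).total_seconds()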

    There are several reasons not to include this one-liner in stdlib (other than it being a one-liner).

    1. Different applications may need different epochs, and the retained precision depends on the choice of the epoch.

    2. The code above works only on naive datetime objects assumed to be in UTC. Passing, say, the result of datetime.now() to it is likely to result in a hard-to-find bug.

    3. While it is not hard to extend the timestamp(t) code to cover aware datetime objects that use fixed offset tzinfo such as those with tzinfo set to a datetime.timezone instance, it is not well defined for the "smart" tzinfo implementations that do automatic DST adjustment. This is where the localtime (bpo-9527) issue comes into play.
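
    The fixed-offset case mentioned in point 3 can be sketched as follows. This is only an illustration under the assumption of an aware epoch constant; the names EPOCH_UTC and aware_timestamp are made up for this example and are not part of the datetime module. It rejects naive datetimes instead of guessing their timezone.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant: the Unix epoch as an aware datetime in UTC.
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

def aware_timestamp(t):
    """Seconds since the epoch for an aware datetime (illustrative helper)."""
    if t.utcoffset() is None:
        # Naive datetimes carry no offset, so their timestamp is ambiguous.
        raise ValueError("expected an aware datetime")
    # Subtracting two aware datetimes accounts for their UTC offsets.
    return (t - EPOCH_UTC).total_seconds()

# 01:00 at UTC+1 is the same instant as the epoch itself.
plus_one = timezone(timedelta(hours=1))
print(aware_timestamp(datetime(1970, 1, 1, 1, 0, tzinfo=plus_one)))  # 0.0
```

    This covers only tzinfo objects whose offset is fixed; it says nothing about how an ambiguous local time in a DST-aware zone should be resolved.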

    pitrou commented 13 years ago

    1. Different applications may need different epochs, and the retained precision depends on the choice of the epoch.

    But then why does fromtimestamp() exist? And returning a (seconds, microseconds) tuple does retain the precision.

    2. The code above works only on naive datetime objects assumed to be in UTC.

    So, if the "trivial" code doesn't work, you can't bring it up as an argument against shipping this functionality, right?

    3. While it is not hard to extend the timestamp(t) code to cover aware datetime objects that use fixed offset tzinfo such as those with tzinfo set to a datetime.timezone instance, it is not well defined for the "smart" tzinfo implementations that do automatic DST adjustment.

    Still, fromtimestamp() exists and apparently fulfills people's expectations. So why can't the same strategy be used for totimestamp() as well?

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 13 years ago

    On Fri, Dec 17, 2010 at 12:17 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    > 1. Different applications may need different epochs, and the retained precision depends on the choice of the epoch.

    But then why does fromtimestamp() exist?

    A better question is why datetime.utcfromtimestamp(s) exists given that it is actually longer than equivalent EPOCH + timedelta(0, s)? I am not responsible for either of these methods, but at least datetime.fromtimestamp(s, tz) is well defined for any timezone and timestamp unlike its inverse.

    And returning a (seconds, microseconds) tuple does retain the precision.

    It does, but it does not help much those who want a float - they would still need another line of code. Note that with divmod(timedelta, timedelta), you can now easily extract (seconds, microseconds) or any other tuple like (weeks, days, seconds, microseconds) from timedelta objects. See msg75904 above.
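
    For readers unfamiliar with divmod() on timedelta objects, a minimal sketch of the (seconds, microseconds) extraction looks like this. The helper name split_timedelta is invented for the example.

```python
from datetime import timedelta

def split_timedelta(td):
    """Return (whole_seconds, microseconds) for a timedelta (illustrative helper)."""
    # divmod on two timedeltas yields an int quotient and a timedelta remainder.
    seconds, remainder = divmod(td, timedelta(seconds=1))
    return seconds, remainder.microseconds

print(split_timedelta(timedelta(days=1, seconds=2, microseconds=3)))  # (86402, 3)
```

    Both values are integers, so no precision is lost at this step.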

    > 2. The code above works only on naive datetime objects assumed to be in UTC.

    So, if the "trivial" code doesn't work, you can't bring it up as an argument against shipping this functionality, right?

    Well, no one has come up with the code that does work so far. Note that timetuple path does not work either because it does not fill tm_isdst correctly. The only solution I can think of for having proper inverse to fromtimestamp() is to add isdst to datetime objects. This would allow correct round-tripping between datetime and timetuple and datetime and timestamp.

    > 3. While it is not hard to extend the timestamp(t) code to cover aware datetime objects that use fixed offset tzinfo such as those with tzinfo set to a datetime.timezone instance, it is not well defined for the "smart" tzinfo implementations that do automatic DST adjustment.

    Still, fromtimestamp() exists and apparently fulfills people's expectations. So why can't the same strategy be used for totimestamp() as well?

    Because in certain timezones fromtimestamp() can return the same datetime value for different timestamps and some datetime values do not have a corresponding timestamp. I have not seen a working proposal on how to handle these issues yet. You are asking to provide an inverse to an existing function simply because the function exists. But the function in question is not invertible.
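
    The non-invertibility can be illustrated without any real timezone database. The sketch below uses a made-up zone that switches from UTC-4 to UTC-5 at a fixed instant (loosely modeled on an autumn DST change); toy_fromtimestamp and SWITCH_UTC are hypothetical names, not library APIs.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
# Hypothetical zone: UTC-4 before this UTC instant, UTC-5 from then on.
SWITCH_UTC = datetime(2010, 11, 7, 6, 0)

def toy_fromtimestamp(ts):
    """Naive local time in the toy zone for a POSIX timestamp."""
    utc = EPOCH + timedelta(seconds=ts)
    offset = timedelta(hours=-4) if utc < SWITCH_UTC else timedelta(hours=-5)
    return utc + offset

first = (datetime(2010, 11, 7, 5, 30) - EPOCH).total_seconds()
second = first + 3600  # one hour later in real time

# Both timestamps land on the same wall-clock reading, 01:30.
print(toy_fromtimestamp(first), toy_fromtimestamp(second))
```

    Given only the naive result 01:30, there is no way to tell which of the two timestamps produced it, which is the obstacle to a general inverse.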

    pitrou commented 13 years ago

    >> 1. Different applications may need different epochs, and the retained precision depends on the choice of the epoch.

    > But then why does fromtimestamp() exist?

    A better question is why datetime.utcfromtimestamp(s) exists given that it is actually longer than equivalent EPOCH + timedelta(0, s)?

    ??? EPOCH is not even a constant in the datetime module.

    And regardless, the point is *not* the number of characters typed, but how easy it is to come up with the solution. Calling the appropriate (and appropriately-named) method is much easier than coming up with the right datetime arithmetic incantation. It's Python, not Perl. "There should be one obvious way to do it".

    > And returning a (seconds, microseconds) tuple does retain the precision.

    It does, but it does not help much those who want a float - they would still need another line of code.

    Yes, but a very obvious one at least.

    Note that with divmod(timedelta, timedelta), you can now easily extract (seconds, microseconds) or any other tuple like (weeks, days, seconds, microseconds) from timedelta objects.

    Do you think many users even think of calling divmod() on timedelta objects? I don't, personally.

    You apparently hold the opinion that the datetime module should be reserved for experts in arithmetic over dates, times and timedeltas. But it's not. It's the Python stdlib and it should provide reasonably high-level tools to do the job.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 13 years ago

    On Fri, Dec 17, 2010 at 1:17 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    > A better question is why datetime.utcfromtimestamp(s) exists given that it is actually longer than equivalent EPOCH + timedelta(0, s)?

    ??? EPOCH is not even a constant in the datetime module.

    No, and it does not belong there. A higher level library that uses seconds since epoch for interchange may define it (and make a decision whether it should be a naive datetime(1970, 1, 1) or datetime(1970, 1, 1, tzinfo=timezone.utc)).

    And regardless, the point is *not* the number of characters typed, but how easy it is to come up with the solution. Calling the appropriate (and appropriately-named) method is much easier than coming up with the right datetime arithmetic incantation. It's Python, not Perl. "There should be one obvious way to do it".

    I don't see anything obvious about the choice between utcfromtimestamp(s), fromtimestamp(s) and fromtimestamp(s, timezone.utc).

    datetime(1970, 1, 1) + timedelta(seconds=s)

    is obvious, self-contained, short and does not require any knowledge other than elementary school arithmetic to understand. Compared to this, "utcfromtimestamp" is a monstrosity that suggests that something non-trivial, such as UTC leap seconds, is being taken care of.

    > > And returning a (seconds, microseconds) tuple does retain the precision.

    > It does, but it does not help much those who want a float - they would still need another line of code.

    Yes, but a very obvious one at least.

    Let's see:

    def floattimestamp(t):
          s, us = t.totimestamp()
          return s + us * 1e-6

    and

    def floattimestamp(t):
          s, us = t.totimestamp()
          return s + us / 1000000

    which one is *obviously* correct? Are they *obviously* equivalent?
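
    Whether the two spellings agree can be checked empirically rather than by inspection. The sketch below is only an illustration: differing_microseconds is a made-up helper, and the timestamp passed to it is an arbitrary value chosen for the example.

```python
def differing_microseconds(s, limit=1000000):
    """Return microsecond values where the two float conversions disagree."""
    differing = []
    for us in range(limit):
        by_multiplication = s + us * 1e-6
        by_division = s + us / 1000000
        if by_multiplication != by_division:
            differing.append(us)
    return differing

# Check a sample of microsecond values for one arbitrary timestamp.
sample = differing_microseconds(1292600000, limit=10000)
print(len(sample), "of 10000 microsecond values differ")
```

    The division form rounds only once before the addition, while multiplying by 1e-6 starts from a constant that is itself not exactly representable in binary floating point.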

    Note that when timedelta.total_seconds() was first committed, it contained a numerical bug. See bpo-8644.

    > Note that with divmod(timedelta, timedelta), you can now easily extract (seconds, microseconds) or any other tuple like (weeks, days, seconds, microseconds) from timedelta objects.

    Do you think many users even think of calling divmod() on timedelta objects? I don't, personally.

    You apparently hold the opinion that the datetime module should be reserved for experts in arithmetic over dates, times and timedeltas. But it's not. It's the Python stdlib and it should provide reasonably high-level tools to do the job.

    Sure, but if the goal is to implement json serialization of datetime objects, maybe stdlib should provide a high-level tool for *that* job? Using float representation of datetime is probably the worst option for json: it is non-standard, may either lose information or introduce spurious differences, and is not human-readable.

    In any case, you ignore the hard question about totimestamp(): fromtimestamp() is not invertible in most real life timezones. If you have a solution that does not restrict totimestamp() to UTC, I would like to hear it. Otherwise, I don't see any problem with (t - datetime(1970, 1, 1)).total_seconds() expression. Maybe we can add this recipe to utcfromtimestamp() documentation.

    pitrou commented 13 years ago

    > ??? EPOCH is not even a constant in the datetime module.

    No, and it does not belong there.

    And so what was your point exactly?

    A higher level library that uses seconds since epoch for interchange

    I don't think the "time" module can be named "higher level", and it still handles such timestamps.

    datetime(1970, 1, 1) + timedelta(seconds=s)

    is obvious, self-contained, short and does not require any knowledge other than elementary school arithmetic to understand.

    Sigh. Again: it's so obvious that you're the only one who seems to easily come up with those solutions. How many times does it have to be repeated?

    Compared to this, "utcfromtimestamp" is a monstrosity that suggests that something non-trivial, such as UTC leap seconds, is being taken care of.

    I don't see anything suggesting it is a monstrosity. The name is grammatically bizarre, but that's all.

    Let's see: [snip]

    which one is *obviously* correct? Are they *obviously* equivalent?

    Both are obviously correct for whatever the non-perverted user has in mind. People in real life don't care whether they will retain microsecond precision when carrying a floating point timestamp around. For the simple reason that the data source itself will not have such precision.

    Note that when timedelta.total_seconds() was first committed, it contained a numerical bug. See bpo-8644.

    So? What is your point?

    In any case, you ignore the hard question about totimestamp(): fromtimestamp() is not invertible in most real life timezones. If you have a solution that does not restrict totimestamp() to UTC, I would like to hear it.

    IMO, the solution would have the datetime object carry the offset from UTC with it, rather than try to be smart and compute it dynamically.

    5579dc13-9f48-42d1-bb17-9c003ef6fa70 commented 13 years ago

    On Fri, Dec 17, 2010 at 2:35 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    I don't think the "time" module can be named "higher level", and it still handles such timestamps.

    > datetime(1970, 1, 1) + timedelta(seconds=s)

    > is obvious, self-contained, short and does not require any knowledge other than elementary school arithmetic to understand.

    Sigh. Again: it's so obvious that you're the only one who seems to easily come up with those solutions. How many times does it have to be repeated?

    Remember, most of the code is written once, but read and edited many times. Show me one person who will have trouble understanding what datetime(1970, 1, 1) + timedelta(seconds=s) means and show me another who can understand datetime.utcfromtimestamp(s) without reading the manual.

    > Compared to this, "utcfromtimestamp" is a monstrosity that suggests that something non-trivial, such as UTC leap seconds, is being taken care of.

    I don't see anything suggesting it is a monstrosity. The name is grammatically bizarre, but that's all.

    Yes, UTC not being a proper acronym in any human language is one problem, Python datetime not being able to represent some valid UTC times is another.

    That's correct, but most users expect their timestamps to be the same when saved on one system and read on another. Granted, most users expect the same from their floats as well, but this can only be solved by education. Calendrical calculations are complex enough that we don't want to expose users to floating point gotchas at the same time.

    > Note that when timedelta.total_seconds() was first committed, it contained a numerical bug. See bpo-8644.

    So? What is your point?

    I thought the point was obvious: conversion between time values and float is non-trivial and error prone. Users should not be encouraged to casually convert (seconds, microseconds) tuples to floats. If they do, chances are they will do it differently in different parts of the program.

    > In any case, you ignore the hard question about totimestamp(): fromtimestamp() is not invertible in most real life timezones. If you have a solution that does not restrict totimestamp() to UTC, I would like to hear it.

    IMO, the solution would have the datetime object carry the offset from UTC with it, rather than try to be smart and compute it dynamically.

    Ditto. This is exactly what bpo-9527 is attempting to achieve.

    pitrou commented 13 years ago

    Yes, UTC not being a proper acronym in any human language is one problem,

    Ok. Too bad you don't live on the same planet as most of us. I bail out.

    abalkin commented 13 years ago

    On Fri, Dec 17, 2010 at 3:26 PM, Antoine Pitrou <report@bugs.python.org> wrote: ..

    > Yes, UTC not being a proper acronym in any human language is one problem,

    Ok. Too bad you don't live on the same planet as most of us. I bail out.

    Sorry that my attempt at humor has proven to be too subtle. I was referring to the following fact:

    """ The International Telecommunication Union wanted Coordinated Universal Time to have the same symbol in all languages. English and French speakers wanted the initials of both their respective language's terms to be used internationally: "CUT" for "coordinated universal time" and "TUC" for "temps universel coordonné". This resulted in the final compromise of "UTC". """

    http://en.wikipedia.org/wiki/Coordinated_Universal_Time

    vstinner commented 13 years ago

    It looks like it's not possible to choose between float and (int, int) output type for datetime.totimestamp(). One is more practical (and enough for people who don't need an exact result), and one is needed to keep the same resolution as the datetime object. I think that we can add two methods:

    I chose the shortest name for float because I suppose that most users prefer a float to a tuple, and so the API is symmetrical:

    My patch has to be updated to use the timezone (and the DST thing?) and also to update the Python implementation.

    9647ba2a-5717-4481-b336-914e78a93294 commented 13 years ago

    I am extremely disappointed by what has happened here.

    We are talking about a very simple method that everybody needs, and that has been reimplemented over and over again. I have been frustrated countless times by the lack of a utctotimestamp() method. I have watched beginners and experienced programmers alike suffer over and over again for the lack of this method, and spend hours trying to figure out why Python doesn't have it and how it should be spelled in Python.

    The discussion here has been stuck on assumptions that the method must meet all of the following ideals:

    1. It must produce a value that is easy to compute with
    2. It must have perfect precision in representing microseconds, forever
    3. It must make an exact round-trip for any possible input
    4. It must let users use whatever epoch they want

    These ideals cannot all be met simultaneously and perfectly. The correct thing to do as an engineer is to choose a practical compromise and document the decision.

    The compromise that almost everyone chooses (because it is useful, convenient, has microsecond precision at least until the year 2100, and millisecond precision is frequently sufficient) is to use a floating-point number with an epoch of 1970-01-01. Floating-point seconds can be easily subtracted, added, serialized, and deserialized, and are a primitive data type in nearly every language and database. They are unmatched in ease of use. So everyone wastes time searching for the answer and figuring out how to write:

        import calendar
        calendar.timegm(dt.utctimetuple()) + dt.microsecond * 1e-6

    We should use this as the definition of datetime.utctotimestamp(), document its limitations, and be done with it.
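
    Wrapped as a function, the expression above might look like the following sketch. The name utctotimestamp is taken from the proposal in this comment and does not exist in the datetime module; the treatment of naive input as UTC is an assumption that follows from how utctimetuple() behaves.

```python
import calendar
from datetime import datetime, timezone

def utctotimestamp(dt):
    """Float seconds since 1970-01-01 UTC (illustrative helper)."""
    # utctimetuple() converts aware datetimes to UTC and returns naive
    # datetimes unchanged, so naive input is assumed to already be UTC.
    return calendar.timegm(dt.utctimetuple()) + dt.microsecond * 1e-6

print(utctotimestamp(datetime(1970, 1, 2, tzinfo=timezone.utc)))  # 86400.0
```

    The documented limitation would be the one already named: the result is a float, so sub-second precision degrades as the timestamp grows.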

    Instead, this essential and useful method has now been held up for almost three YEARS by an inability to accept a simple engineering decision. Unbelievable.