python / cpython

The Python programming language
https://www.python.org

datetime.utcfromtimestamp rounds results incorrectly #67705

Closed fd319893-c29a-4319-934f-b31206c2403f closed 9 years ago

fd319893-c29a-4319-934f-b31206c2403f commented 9 years ago
BPO 23517
Nosy @tim-one, @mdickinson, @abalkin, @vstinner, @larryhastings, @bitdancer, @serhiy-storchaka
Files
  • round_half_even_py34.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

        assignee = 'https://github.com/abalkin'
        labels = ['type-bug', 'library']
        title = 'datetime.utcfromtimestamp rounds results incorrectly'
        user = 'https://bugs.python.org/tbarbugli'

    bugs.python.org fields:

        actor = 'vstinner'
        assignee = 'belopolsky'
        closed = True
        closer = 'vstinner'
        components = ['Library (Lib)']
        creator = 'tbarbugli'
        issue_num = 23517
        keywords = ['patch', '3.3regression']
        message_count = 100
        nosy_names = ['tim.peters', 'mark.dickinson', 'belopolsky', 'vstinner',
                      'larry', 'r.david.murray', 'aconrad', 'BreamoreBoy',
                      'vivanov', 'python-dev', 'serhiy.storchaka', 'tbarbugli',
                      'trcarden']
        priority = 'normal'
        resolution = 'fixed'
        stage = 'commit review'
        status = 'closed'
        superseder = None
        type = 'behavior'
        url = 'https://bugs.python.org/issue23517'
        versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    fd319893-c29a-4319-934f-b31206c2403f commented 9 years ago

    Hi,

    I am porting a library from python 2.7 to 3.4 and I noticed that the behaviour of datetime.utcfromtimestamp is not consistent between the two versions.

    For example, on Python 2.7.5, datetime.utcfromtimestamp(1424817268.274) returns a datetime with 274000 microseconds.

    The same code on Python 3.4 returns a datetime with 273999 microseconds.

    ethanfurman commented 9 years ago

    This seems to have changed in 3.3 (versions up to 3.2 return 274000).

    bitdancer commented 9 years ago

    Most likely this was a rounding fix (ie: not a bug), but hopefully Alexander will know for sure.

    abalkin commented 9 years ago

    Let me dig up the history, but this does not look like correct rounding to me:

    >>> datetime.utcfromtimestamp(1424817268.274)
    datetime.datetime(2015, 2, 24, 22, 34, 28, 273999)
    >>> decimal.Decimal(1424817268.274)
    Decimal('1424817268.2739999294281005859375')
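A minimal sketch (not from the thread) of why truncation lands one microsecond low here: the nearest IEEE-754 double to the literal 1424817268.274 is slightly below it, so truncating the scaled fraction floors to 273999, while rounding to nearest gives 274000.

```python
from decimal import Decimal

ts = 1424817268.274
# The nearest double is slightly below the written literal:
print(Decimal(ts))        # 1424817268.2739999294281005859375

frac = ts - int(ts)       # exact here: the fraction fits the double's grid
print(int(frac * 1e6))    # 273999 -- truncation, what 3.3/3.4 returned
print(round(frac * 1e6))  # 274000 -- round-half-even, the expected value
```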
    abalkin commented 9 years ago

    It looks like it was an intentional change. See bpo-14180 (changeset 75590:1e9cc1a03365).

    I am not sure what the motivation was. Note that this change made utcfromtimestamp(t) different from datetime(1970,1,1) + timedelta(seconds=t).

    abalkin commented 9 years ago

    Victor's motivation for the change was (msg154811):

    """ I chose this rounding method because it is the method used by int(float) and int(time.time()) is a common in programs (more than round(time.time()). Rounding towards zero avoids also producing timestamps in the future. """

    I recall the earlier discussions of rounding in the datetime module and Mark's explanation that rounding up is fine as long as ordering is preserved, i.e. for x < y, round(x) <= round(y).

    There are cases where producing times in the future is problematic. For example, UNIX make really dislikes file timestamps in the future. But if that was the main motivation, rounding towards -infinity would be more appropriate.

    In any case, as long as we have the following in the datetime module documentation, I think this behavior is a bug:

    """ On the POSIX compliant platforms, utcfromtimestamp(timestamp) is equivalent to the following expression:

    datetime(1970, 1, 1) + timedelta(seconds=timestamp) """

    >>> timestamp = 1424817268.274
    >>> datetime.utcfromtimestamp(timestamp) == datetime(1970, 1, 1) + timedelta(seconds=timestamp)
    False
    vstinner commented 9 years ago

    I started a large change set to support nanoseconds in the C "pytime" API: see the issue bpo-22117. While working on this change, I noticed that the rounding mode of datetime is currently wrong. Extract of a private patch:

    typedef enum {
        /* Round towards zero. */
        _PyTime_ROUND_DOWN=0,
        /* Round away from zero. For example, used for timeout to wait
           "at least" N seconds. */
        _PyTime_ROUND_UP=1,
        /* Round towards minus infinity (-inf). For example, used for the
           system clock with UNIX epoch (time_t). */
        _PyTime_ROUND_FLOOR=2
    } _PyTime_round_t;

    I changed Modules/_datetimemodule.c to use _PyTime_ROUND_FLOOR, instead of _PyTime_ROUND_DOWN.

    abalkin commented 9 years ago

    I noticed that the rounding mode of datetime is currently wrong.

    What do you mean by "currently"? What versions of python have it wrong?

    abalkin commented 9 years ago

    Victor,

    Would you consider going back to round to nearest? Mark and I put in a lot of effort to get the rounding in the datetime module right. (See for example, bpo-8860.)

    Sub-microsecond time sources are still rare, and users who work with them should avoid FP timestamps in any case. On the other hand, double precision timestamps are adequate for microsecond resolution now and for the next few decades.

    Timestamps like OP's (sec=1424817268, us=274000) should not change when converted to double and back. IMO, the following behavior is a bug.

    >>> dt = datetime(2015, 2, 24, 22, 34, 28, 274000)
    >>> datetime.utcfromtimestamp(dt.timestamp())
    datetime.datetime(2015, 2, 25, 3, 34, 28, 273999)
    vstinner commented 9 years ago

    Would you consider going back to round to nearest?

    I don't understand "nearest". I prefer to use names of decimal rounding modes: https://docs.python.org/dev/library/decimal.html#rounding-modes

    In my local patch, I'm using ROUND_FLOOR in _decimal: "Round towards -Infinity."

    Mark and I put in a lot of effort to get the rounding in the datetime module right. (See for example, bpo-8860.)

    I'm unable right now to say which rounding mode should be used in the datetime module. But it's important to use the same rounding mode for all similar operations. For example, time.time() and datetime.datetime.now() should have the same rounding method (a bad example, since time.time() returns a float, which doesn't round the result).

    For example, in my local patch, I'm using ROUND_FLOOR for:

    Note: the Python implementation of datetime uses time.localtime() and time.gmtime() for fromtimestamp(), so these functions should also have the same rounding method.

    vstinner commented 9 years ago

    What do you mean by "currently"? What versions of python have it wrong?

    I searched for "ROUND" in Modules/_datetimemodule.c: in the Python development branch (default), I found _PyTime_ROUND_DOWN (round towards zero). Since a bug was reported, I understand that it's not the right rounding method?

    abalkin commented 9 years ago

    I don't understand "nearest".

    Sorry for using loose terms. I was hoping that, in the context of "going back", it would be clear.

    I believe the correct mode is "ROUND_HALF_EVEN". This is the mode used by the builtin round() function:

    >>> round(0.5)
    0
    >>> round(1.5)
    2
    abalkin commented 9 years ago

    For example, in my local patch, I'm using ROUND_FLOOR for:

    • datetime.date.fromtimestamp()
    • datetime.datetime.fromtimestamp()

    These should use ROUND_HALF_EVEN

    • datetime.datetime.now()
    • datetime.datetime.utcnow()

    These should not involve floating point arithmetic, but when converting from nanoseconds to microseconds, you should round to the nearest 1000 ns, with 500 ns ties resolved to an even number of microseconds.

    • os.utime()

    This takes nanoseconds as an optional argument. Passing floats in times should probably be deprecated. In any case, here you would be rounding floats to nanoseconds and what you do with 0.5 nanoseconds is less important because in most cases they are not even representable as floats.

    • time.clock_settime()

    Is this a new method? I don't see it in 3.5.0a1.

    • time.gmtime()

    This should be fixed

    >>> time.gmtime(1.999999999).tm_sec
    1

    is really bad and

    >>> time.gmtime(-1.999999999)[:6]
    (1969, 12, 31, 23, 59, 59)

    is probably even worse.

    • time.localtime()
    • time.ctime()

    Same story as in time.gmtime.
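The nanoseconds-to-microseconds rule described above (round to the nearest 1000 ns, ties to even) can be sketched with plain integer arithmetic. This is an illustration for non-negative counts, not the actual _PyTime implementation:

```python
def us_from_ns(ns):
    """Round a non-negative nanosecond count to microseconds, ties-to-even (sketch)."""
    q, r = divmod(ns, 1000)
    # Round up when past the midpoint, or exactly at it with an odd quotient.
    if r > 500 or (r == 500 and q % 2 == 1):
        q += 1
    return q

print(us_from_ns(1499))  # 1
print(us_from_ns(1500))  # 2  (tie: odd quotient rounds up to even)
print(us_from_ns(2500))  # 2  (tie: even quotient stays)
```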

    27487188-3973-46e9-ab9c-cbf05e783aa3 commented 9 years ago

    We are seeing this behavior influencing other libraries in python 3.4.

    This should never fail if timestamp and fromtimestamp are implemented correctly:

    from datetime import datetime
    t = datetime.utcnow().timestamp()
    t2 = datetime.utcfromtimestamp(t)
    assert t == t2, 'Moving from timestamp and back should always work'
    bitdancer commented 9 years ago

    Because this seems to be a regression, I'm marking this as a release blocker. The RM can decide it isn't, of course.

    larryhastings commented 9 years ago

    Yes, by all means, fix for 3.4, 3.5, and 3.6. If possible I'd appreciate you getting the fix checked in to 3.5 within the next 48 hours, as I'm tagging the next beta release of 3.5 around then, and it'd be nice if this fix went out in that release.

    vstinner commented 9 years ago

    I'm concerned by this example:

    >>> dt = datetime(2015, 2, 24, 22, 34, 28, 274000)
    >>> dt - datetime.fromtimestamp(dt.timestamp())
    datetime.timedelta(0, 0, 1)

    I don't know yet if it should be fixed or not.

    If we modify .fromtimestamp(), should we use the same rounding method in datetime constructor? And in datetime.now()/.utcnow()?

    I would prefer to keep ROUND_DOWN for .now() and .utcnow() to avoid timestamps in the future. I care less for other methods.

    What do you think of this plan?

    ---

    Hum, I don't remember the whole story line of rounding timestamps in Python. Some raw data.

    Include/pytime.h of Python 3.5+ has:

    typedef enum {
        /* Round towards minus infinity (-inf). For example, used to read
           a clock. */
        _PyTime_ROUND_FLOOR=0,
        /* Round towards infinity (+inf). For example, used for timeout to
           wait "at least" N seconds. */
        _PyTime_ROUND_CEILING
    } _PyTime_round_t;

    Include/pytime.h of Python 3.4 had:

    typedef enum {
        /* Round towards zero. */
        _PyTime_ROUND_DOWN=0,
        /* Round away from zero. */
        _PyTime_ROUND_UP
    } _PyTime_round_t;

    Include/pytime.h of Python 3.3 and older didn't have rounding.

    C files using pytime.h rounding in Python 3.4 (grep -l _PyTime_ROUND */*.c):

    Modules/_datetimemodule.c
    Modules/posixmodule.c
    Modules/selectmodule.c
    Modules/signalmodule.c
    Modules/_testcapimodule.c
    Modules/timemodule.c
    Python/pytime.c

    It is used by 3 more C files in Python 3.5:

    Modules/socketmodule.c
    Modules/_ssl.c
    Modules/_threadmodule.c

    NEAREST was never implemented in pytime.h.

    If I recall correctly, there were inconsistencies between the Python and the C implementation of the datetime module. At least in Python 3.5, both implementations should be consistent (even if some people would prefer a different rounding method).

    The private pytime API was rewritten in Python 3.5 to get nanosecond resolution. This API is only used by the datetime module to get the current time.

    My rationale for ROUND_DOWN was to follow how UNIX rounds timestamps. As Alexander wrote, UNIX doesn't like timestamps in the future, so rounding towards minus infinity avoids such issues. Rounding issues become more common on file timestamps with filesystems supporting microsecond or even nanosecond resolution.

    abalkin commented 9 years ago

    Victor> I don't know yet if it should be fixed or not.

    It is my understanding that datetime -> timestamp -> datetime round-tripping was exact in 3.3 for datetimes not too far in the future (as of 2015), but now it breaks for datetime(2015, 2, 24, 22, 34, 28, 274000). This is clearly a regression and should be fixed.

    UNIX doesn't like timestamps in the future

    I don't think this is a serious consideration. The problematic scenario would be obtaining high-resolution timestamp (from say time.time()), converting it to datetime and passing it back to OS as a possibly 0.5µs higher value. Given that timestamp -> datetime -> timestamp roundtrip by itself takes over 1µs, it is very unlikely that by the time rounded value hits the OS it is still in the future.

    larryhastings commented 9 years ago

    I'm not going to hold up beta 3 while you guys argue about how to round up or down the number of angels that can dance on the head of a pin.

    vstinner commented 9 years ago

    On Friday, July 3, 2015, Alexander Belopolsky report@bugs.python.org wrote:

    > UNIX doesn't like timestamps in the future

    > I don't think this is a serious consideration. The problematic scenario would be obtaining high-resolution timestamp (from say time.time()), converting it to datetime and passing it back to OS as a possibly 0.5µs higher value. Given that timestamp -> datetime -> timestamp roundtrip by itself takes over 1µs, it is very unlikely that by the time rounded value hits the OS it is still in the future.

    In many cases the resolution is 1 second. For example, a filesystem with a resolution of 1 second, or an API only supporting a resolution of 1 second.

    With a resolution of 1 second, timestamps in the future are likely (50%).

    Sorry, I don't remember all the details of timestamp rounding and all the issues that I saw.

    vstinner commented 9 years ago

    My rationale is more general than datetime. Problems arise when different APIs use different rounding methods.

    abalkin commented 9 years ago

    I'll let others fight this battle. In my view, introducing floating point timestamp method for datetime objects was a mistake. See issue bpo-2736.

    Specifically, I would like to invite Velko Ivanov to rethink his rant at msg124197.

    Anyone who followed his advice and started using the timestamp method to JSON-serialize datetimes around 3.3 has undoubtedly been bitten by the present bug (but may not know it yet).

    For those who need robust code, I will continue recommending the (dt - EPOCH) / timedelta(seconds=1) expression over the timestamp method; for JSON serialization, (dt - EPOCH) // datetime.resolution converts to an integer and EPOCH + n * datetime.resolution converts back.
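Alexander's recommended expressions, written out runnably. EPOCH here is assumed to be the naive UTC epoch, datetime(1970, 1, 1):

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
dt = datetime(2015, 2, 24, 22, 34, 28, 274000)

# Float seconds since the epoch, without the timestamp() method:
seconds = (dt - EPOCH) / timedelta(seconds=1)
print(seconds)  # 1424817268.274

# Lossless integer round-trip; datetime.resolution is timedelta(microseconds=1):
n = (dt - EPOCH) // datetime.resolution
assert EPOCH + n * datetime.resolution == dt
print(n)  # 1424817268274000 microseconds since the epoch
```

The integer form avoids floating point entirely, which is why it round-trips exactly.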

    tim-one commented 9 years ago

    It is really bad that roundtripping current microsecond datetimes doesn't work. About half of all microsecond-resolution datetimes fail to roundtrip correctly now. While the limited precision of a C double guarantees roundtripping of microsecond datetimes "far enough" in the future will necessarily fail, that point is about 200 years from now.

    Rather than argue endlessly about rounding, it's possible instead to make the tiniest possible change to the timestamp _produced_ at the start. Here's code explaining it:

        ts = d.timestamp()
        # Will microseconds roundtrip correctly?  For times far
        # enough in the future, there aren't enough bits in a C
        # double for that to always work.  But for years through
        # about 2241, there are enough bits.  How does it fail
        # before then?  Very few microsecond datetimes are exactly
        # representable as a binary float.  About half the time, the
        # closest representable binary float is a tiny bit less than
        # the decimal value, and that causes truncating 1e6 times
        # the fraction to be 1 less than the original microsecond
        # value.
        if int((ts - int(ts)) * 1e6) != d.microsecond:
            # Roundtripping fails.  Add 1 ulp to the timestamp (the
            # tiniest possible change) and see whether that repairs
            # it.  It's enough of a change until doubles just plain
            # run out of enough bits.
            mant, exp = math.frexp(ts)
            ulp = math.ldexp(0.5, exp - 52)
            ts2 = ts + ulp
            if int((ts2 - int(ts2)) * 1e6) == d.microsecond:
                ts = ts2
            else:
                # The date is so late in time that a C double's 53
                # bits of precision aren't sufficient to represent
                # microseconds faithfully.  Leave the original
                # timestamp alone.
                pass
        # Now ts exactly reproduces the original datetime,
        # if that's at all possible.

    This assumes timestamps are >= 0, and that C doubles have 53 bits of precision. Note that because a change of 1 ulp is the smallest possible change for a C double, this cannot make closest-possible unequal datetimes produce out-of-order after-adjustment timestamps.

    And, yes, this sucks ;-) But it's far better than having half of timestamps fail to convert back for the next two centuries. Alas, it does nothing to get the intended datetime from a microsecond-resolution timestamp produced _outside_ of Python. That requires rounding timestamps on input - which would be a better approach.

    Whatever theoretical problems may exist with rounding, the change to use truncation here is causing real problems now. Practicality beats purity.
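Tim's repair can be wrapped up as a function for experimentation. This wrapper (not part of the thread) uses an aware UTC datetime so that timestamp() is deterministic and independent of the local timezone:

```python
import math
from datetime import datetime, timezone

def repaired_timestamp(d):
    """Nudge d.timestamp() up by 1 ulp when that fixes microsecond round-tripping.

    Assumes a non-negative timestamp and 53-bit doubles, per Tim's sketch.
    """
    ts = d.timestamp()
    if int((ts - int(ts)) * 1e6) != d.microsecond:
        mant, exp = math.frexp(ts)
        ulp = math.ldexp(0.5, exp - 52)   # smallest possible change to ts
        ts2 = ts + ulp
        if int((ts2 - int(ts2)) * 1e6) == d.microsecond:
            ts = ts2
    return ts

d = datetime(2015, 2, 24, 22, 34, 28, 274000, tzinfo=timezone.utc)
ts = repaired_timestamp(d)
print(int((ts - int(ts)) * 1e6))  # 274000 -- truncation now recovers the microseconds
```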

    abalkin commented 9 years ago

    I wish we could use the same algorithm in datetime.utcfromtimestamp as we use in float to string conversion. This may allow the following chain of conversions to round trip in most cases:

    float literal -> float -> datetime -> seconds.microseconds string

    tim-one commented 9 years ago

    I wish we could use the same algorithm in datetime.utcfromtimestamp as we use in float to string conversion. This may allow the following chain of conversions to round trip in most cases:

    float literal -> float -> datetime -> seconds.microseconds string

    I don't follow. float->string produces the shortest string that reproduces the float exactly. Any flavor of changing a timestamp to a microsecond-precision datetime is essentially converting a float * 1e6 to an integer - there doesn't seem to be a coherent concept of "shortest integer" that could apply. We have to fill every bit a datetime has.

    A variant of the code I posted could be "good enough": take the result we get now (truncate float*1e6). Also add 1 ulp to the float and do that again. If the results are the same, we're done. If the results are different, and the difference is 1, take the second result. Else keep the first result. What this "means" is that we're rounding up if and only if the original is so close to the boundary that the tiniest possible amount of floating-point noise is all that's keeping it from giving a different result - but also that the float "has enough bits" to represent a 1-microsecond difference (which is true of current times, but in a couple centuries will become false).

    But that's all nuts compared to just rounding float*1e6 to the nearest int, period. There's nothing wrong with doing that. Truncating is propagating the tiniest possible binary fp representation error all the way into the microseconds. It would be defensible if we were using base-10 floats (in which "representation error" doesn't occur for values expressed in base 10). But we're not. Truncating a base-2 float _as if_ it were a base-10 float is certain to cause problems. Like the one this report is about ;-)
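The "just round" suggestion amounts to something like the following hypothetical helper (the name and structure are mine, not CPython's; datetime(1970, 1, 1) + timedelta(...) is the documented POSIX equivalence from earlier in the thread):

```python
from datetime import datetime, timedelta

def utcfromtimestamp_rounded(ts):
    """Convert a POSIX timestamp to a naive UTC datetime, rounding the
    scaled value to the nearest microsecond instead of truncating it."""
    us = round(ts * 10**6)  # round-half-even to integer microseconds
    return datetime(1970, 1, 1) + timedelta(microseconds=us)

# The motivating example from this report now round-trips:
print(utcfromtimestamp_rounded(1424817268.274))
# 2015-02-24 22:34:28.274000
```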

    abalkin commented 9 years ago

    I probably misremembered a different issue. See msg194311.

    >>> timedelta(seconds=0.6112295) == timedelta(seconds=1)*0.6112295
    False

    I thought the problem there was that the same float was converted to one decimal by str() and to a different decimal by timedelta. But now it looks like it was something else.

    Does your algorithm guarantee that any float that is displayed with 6 decimal places or less will convert to a datetime or timedelta with microseconds matching the fractional part?

    abalkin commented 9 years ago

    OK, I looked at the wrong place. Here is the correct example:

    >>> x = float.fromhex('0x1.38f312b1b36bdp-1')
    >>> x
    0.6112295
    >>> round(x, 6)
    0.611229
    >>> timedelta(0, x).microseconds
    611230

    but I no longer remember whether we concluded that timedelta got it wrong or round or both or neither. :-)

    tim-one commented 9 years ago

    Does your algorithm guarantee that any float that is displayed with 6 decimal places or less will convert to a datetime or timedelta with microseconds matching the fractional part?

    No algorithm can, for datetimes far enough in the future (C doubles just plain run out of enough bits).

    Apart from negative timestamps (which I didn't consider - they just blow up on my platform :-) ), the intent is to do the best that _can_ be done.

    But proving things in this area isn't simple, and there's no need for it: check in a change to round the thing, and be done with it. If Victor wants to rework rounding again, that's fine, but only under a requirement that this particular bug remain fixed. His change created the problem, and it's still languishing half a year after being reported - there's little sense in continuing to wait for him to do something about it.

    vstinner commented 9 years ago

    Hi, I'm trying to write up the rationale for the changes that I made in pytime.h in Python 3.3-3.5. The rounding of microseconds or nanoseconds is tricky. The code changed a lot, and we used, and are still using, various rounding methods depending on the case...

    Alexander Belopolsky wrote:

    I believe the correct mode is "ROUND_HALF_EVEN". This is the mode used by the builtin round() function: (...)

    Right, round(float) and round(decimal.Decimal) use the ROUND_HALF_EVEN rounding method.

    On Python < 3.3, datetime.datetime.fromtimestamp(float) doesn't use exactly ROUND_HALF_EVEN; it looks more like "round half away from zero" (the decimal module doesn't seem to support this exact rounding method).

    The difference between ROUND_HALF_EVEN and "round half away from zero" is subtle. The two rounding methods only return a different result on the following case:

    divmod(t + us * 1e-6, 1.0)[1] * 1e6 == 0.5

    where t and us are integers (t is a number of seconds created by mktime() and us is a number of microseconds in [0; 999999]).

    I don't think that this case can occur. I failed to find such a case for various values of t between 0 and 2**40, and us=0 or us=1. 1e-6 (10^-6 = 0.000001) cannot be represented exactly in base 2 (IEEE 754).

    --

    To move forward, we should agree on which rounding method datetime.datetime.fromtimestamp() should use, implement it in the _PyTime API (add a constant in pytime.h, implement it in pytime.c, write unit tests in test_time.py), and then use it in datetime.datetime.fromtimestamp().

    IMHO we should only modify the rounding method used by datetime.datetime.fromtimestamp() and datetime.datetime.utcfromtimestamp(), other functions use the "right" rounding method.

    tim-one commented 9 years ago

    >>> x = float.fromhex('0x1.38f312b1b36bdp-1')
    >>> x
    0.6112295
    >>> round(x, 6)
    0.611229
    >>> timedelta(0, x).microseconds
    611230

    but I no longer remember whether we concluded that timedelta got it wrong or round or both or neither. :-)

    Here you go:

    >>> import decimal
    >>> decimal.Decimal(x)
    Decimal('0.61122949999999998116351207499974407255649566650390625')

    That's the exact value you're actually using. What's "correct" depends on what's intended.

    round(x, 6) actually rounds to

    >>> decimal.Decimal(round(x, 6))
    Decimal('0.6112290000000000222968310481519438326358795166015625')

    and that's fine. timedelta's result does not match what using infinite precision would deliver, but I couldn't care much less ;-)

    The real lesson to take from all this, when you design your own killer language, is that using a binary floating point type for timestamps comes with many costs and surprises.

    tim-one commented 9 years ago

    IMHO we should only modify the rounding method used by datetime.datetime.fromtimestamp() and datetime.datetime.utcfromtimestamp(), other functions use the "right" rounding method.

    Fine by me. How about today? ;-)

    The regression reported here must get repaired. nearest/even is generally favored when there's a choice.

    I personally _prefer_ add-a-half-and-chop in time contexts that need rounding, because it's more uniform. That is, picturing a context that rounds to 1 digit for clarity, using a decimal system, with a uniformly spaced sequence of inputs on the first line, then a line with add-a-half-and-chop results, and then a line with nearest/even results:

    input                0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5
    add-a-half-and-chop  0.0  1.0  1.0  2.0  2.0  3.0  3.0  4.0
    nearest/even         0.0  0.0  1.0  2.0  2.0  2.0  3.0  4.0

    From the last (nearest/even) line, you'd never guess that the inputs were uniformly spaced; in the second line, you would. nearest/even's "in a tie, sometimes go up, sometimes go down" is, IMO, unnatural in this context.

    But it doesn't make a lick of real difference to timestamp functions. We're not working in decimal, and people aren't going to be staring at hex patterns in output. So I'd pick whichever is easier to implement.
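The two tie-breaking rules Tim tabulates can be checked directly; a quick illustration (not thread code), using math.floor(x + 0.5) for add-a-half-and-chop and Python 3's ties-to-even round():

```python
import math

inputs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

half_up = [math.floor(x + 0.5) for x in inputs]  # add-a-half-and-chop
half_even = [round(x) for x in inputs]           # Python 3 round(): ties-to-even

print(half_up)    # [0, 1, 1, 2, 2, 3, 3, 4]
print(half_even)  # [0, 0, 1, 2, 2, 2, 3, 4]
```

All the inputs here are exact binary fractions, so no representation error muddies the comparison.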

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset abeb625b20c2 by Victor Stinner in branch 'default': Issue bpo-23517: Add "half up" rounding mode to the _PyTime API https://hg.python.org/cpython/rev/abeb625b20c2

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset b690bf218702 by Victor Stinner in branch 'default': Issue bpo-23517: datetime.datetime.fromtimestamp() and https://hg.python.org/cpython/rev/b690bf218702

    vstinner commented 9 years ago

    Ok, I fixed the issue in Python 3.6. Example with the initial message:

    $ python2.7 -c 'import datetime; print(datetime.datetime.utcfromtimestamp(1424817268.274).microsecond); print(datetime.datetime.utcfromtimestamp(-1424817268.274).microsecond)'
    274000
    726000
    
    $ python3.6 -c 'import datetime; print(datetime.datetime.utcfromtimestamp(1424817268.274).microsecond); print(datetime.datetime.utcfromtimestamp(-1424817268.274).microsecond)'
    274000
    726000

    I wrote: "On Python < 3.3, datetime.datetime.fromtimestamp(float) doesn't use exactly ROUND_HALF_EVEN; it looks more like "round half away from zero" (the decimal module doesn't seem to support this exact rounding method)."

    I was wrong: it's decimal.ROUND_HALF_UP in fact.

    I will backport the change to Python 3.4 and 3.5. Since this issue was defined as a bugfix, it should be fixed in Python 3.5.1 (too late for 3.5.0).

    larryhastings commented 9 years ago

    too late for 3.5.0

    How's that?

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset 30454ef98e81 by Victor Stinner in branch 'default': Backed out changeset b690bf218702 https://hg.python.org/cpython/rev/30454ef98e81

    New changeset 700303850cd7 by Victor Stinner in branch 'default': Issue bpo-23517: Fix _PyTime_ObjectToDenominator() https://hg.python.org/cpython/rev/700303850cd7

    New changeset 03c97bb04cd2 by Victor Stinner in branch 'default': Issue bpo-23517: Reintroduce unit tests for the old PyTime API since it's still https://hg.python.org/cpython/rev/03c97bb04cd2

    vstinner commented 9 years ago

    Larry Hastings wrote:

    > > too late for 3.5.0
    >
    > How's that?

    Well, for example... my change broke all buildbots.

    I don't think that it's a good idea to rush to fix Python 3.5 :-) This part of Python (handling timestamps, especially the rounding mode) is complex, I prefer to check for all buildbots and wait for some feedback from users (wait at least 2 weeks).

    I reverted my change, another function must be changed:

    $ python2 -c 'import datetime; print(datetime.timedelta(microseconds=0.5))'
    0:00:00.000001
    $ python3 -c 'import datetime; print(datetime.timedelta(microseconds=0.5))'
    0:00:00

    datetime.timedelta must also use the ROUND_HALF_UP method, as in Python 2, instead of ROUND_HALF_EVEN (the Python implementation, datetime.py, uses the round() function).

    $ python2 -c 'import datetime; print(datetime.timedelta(microseconds=1.5))'
    0:00:00.000002
    $ python3 -c 'import datetime; print(datetime.timedelta(microseconds=1.5))'
    0:00:00.000002

    I have to rework my patch to use ROUND_HALF_UP in datetime.timedelta(), datetime.datetime.fromtimestamp() and datetime.datetime.utcfromtimestamp(), and update test_datetime.
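The ROUND_HALF_UP vs ROUND_HALF_EVEN difference Victor describes only shows up on exact .5 ties. A quick decimal-based illustration (not the C implementation):

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

for us in (Decimal('0.5'), Decimal('1.5'), Decimal('2.5')):
    up = us.quantize(Decimal('1'), rounding=ROUND_HALF_UP)      # Python 2 behavior
    even = us.quantize(Decimal('1'), rounding=ROUND_HALF_EVEN)  # round() behavior
    print(us, '->', up, '(half up) vs', even, '(half even)')
# 0.5 -> 1 (half up) vs 0 (half even)
# 1.5 -> 2 (half up) vs 2 (half even)
# 2.5 -> 3 (half up) vs 2 (half even)
```

This matches the shell examples above: microseconds=0.5 differs between the two modes, while microseconds=1.5 does not.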

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset df074eb2a5be by Victor Stinner in branch 'default': Issue bpo-23517: Try to fix test_time on "x86 Ubuntu Shared 3.x" buildbot https://hg.python.org/cpython/rev/df074eb2a5be

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset 59185ef69166 by Victor Stinner in branch 'default': Issue bpo-23517: test_time, skip a test checking a corner case on floating point https://hg.python.org/cpython/rev/59185ef69166

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset 0eb8c182131e by Victor Stinner in branch 'default': Issue bpo-23517: datetime.timedelta constructor now rounds microseconds to nearest https://hg.python.org/cpython/rev/0eb8c182131e

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset bf634dfe076f by Victor Stinner in branch 'default': Issue bpo-23517: fromtimestamp() and utcfromtimestamp() methods of https://hg.python.org/cpython/rev/bf634dfe076f

    larryhastings commented 9 years ago

    I will happily delegate to Tim Peters whether or not this should be fixed in 3.5.0, or whether it should wait until 3.5.1 or even 3.6.

    Tim, ball's in your court!

    vstinner commented 9 years ago

    Backport to Python 3.4, split into 3 patches:

    tim-one commented 9 years ago

    Larry, I appreciate the vote of confidence, but I'm ill-equipped to help at the patch level: I'm solely on Windows, and (long story) don't even have a C compiler at the moment. The patch(es) are too broad and delicate to be sure of without kicking the tires (running contrived examples).

    So I would sub-delegate to Alexander and/or Mark. They understand the issues too. I was just the most annoying about insisting it get fixed ;-)

    vstinner commented 9 years ago

    Larry, I appreciate the vote of confidence, but I'm ill-equipped to help at the patch level: (...) The patch(es) are too broad and delicate to be sure of without kicking the tires (running contrived examples).

    Well, the patches change how timedelta, .fromtimestamp() and .utcfromtimestamp() round the number of microseconds. It's a deliberate choice since it was decided that the current rounding mode is a bug, and not a feature :-)

    The code is well tested. There are unit tests on how numbers are rounded for: timedelta, .(utc)fromtimestamp(), and even the C private API _PyTime. The code is (almost) the same in default and was validated on various platforms. So I'm confident on the change.

    tim-one commented 9 years ago

    That's great, Victor! Another person trying the code with their own critical eyes would still be prudent. Two days ago you wrote:

    This part of Python (handling timestamps, especially the rounding mode) is complex, I prefer to check for all buildbots and wait for some feedback from users (wait at least 2 weeks).

    It's not entirely clear why that switched to "So I'm confident on the change." in 12 days short of 2 weeks ;-)

    I have no reason to doubt your confidence. Just saying some independent checking is prudent (but I can't do it at this time).

    abalkin commented 9 years ago

    I'll try to find the time to kick the tires on this patch this weekend.

    vstinner commented 9 years ago

    2015-09-04 17:52 GMT+02:00 Tim Peters report@bugs.python.org:

    That's great, Victor! Another person trying the code with their own critical eyes would still be prudent.

    Sure!

    It's not entirely clear why that switched to "So I'm confident on the change." in 12 days short of 2 weeks ;-)

    He he. 2 days ago, the buildbots were broken for various reasons. I fixed a lot of issues (unrelated to this rounding mode issue), so I now have confirmation that the tests pass on all platforms.

    I have no reason to doubt your confidence. Just saying some independent checking is prudent (but I can't do it at this time).

    Sorry if I wasn't clear. I'm confident, but not enough to not wait for a review :-)

    --

    Usually, I don't wait for a review simply because there are too few reviewers :-( I've spent the last 3 years working alone on the funny _PyTime C API project. I started writing an article to tell this journey ;-)

    vstinner commented 9 years ago

    Alexander Belopolsky added the comment:

    I'll try to find the time to kick the tires on this patch this weekend.

    Cool! Keep me in touch ;-)

    abalkin commented 9 years ago

    Victor,

    Do I understand correctly that this is already committed in 3.4 - 3.6 development branches and we just need to decide whether to cherry-pick this fix to 3.5rc?

    Is the "review" link up to date?