python / cpython

The Python programming language
https://www.python.org
Other
63.14k stars 30.23k forks source link

Add timespec optional flag to datetime isoformat() to choose the precision #63674

Closed smontanaro closed 8 years ago

smontanaro commented 10 years ago
BPO 19475
Nosy @malemburg, @gvanrossum, @tim-one, @terryjreedy, @abalkin, @vstinner, @ezio-melotti, @berkerpeksag, @vadmium, @matrixise, @alessandrocucci
Files
  • issue19475.patch: Patch
  • issue19475_v2.patch: Patch
  • issue19475_v3.patch: Patch
  • issue19475_v4.patch: Patch v4
  • issue19475_v5.patch: Patch v5
  • issue19475_v6.patch: Patch v6
  • issue19475_v7.patch: Patch v7
  • issue19475_v8.patch: Patch v8
  • issue19475_v9.patch: Patch v9
  • issue19475_v10_datetime_time.patch: added timespec to time.isoformat
  • issue19475_v11.patch: Patch v11
  • issue19475_v12.patch
  • issue19475_v13.patch: added milliseconds format to timespec
  • issue19475_v14.patch: added milliseconds to documentation
  • issue19475_v15.patch
  • issue19475_v16.patch
  • issue19475_v17.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/abalkin' closed_at = created_at = labels = ['extension-modules', 'easy', 'type-feature'] title = 'Add timespec optional flag to datetime isoformat() to choose the precision' updated_at = user = 'https://github.com/smontanaro' ``` bugs.python.org fields: ```python activity = actor = 'python-dev' assignee = 'belopolsky' closed = True closed_date = closer = 'python-dev' components = ['Extension Modules'] creation = creator = 'skip.montanaro' dependencies = [] files = ['40018', '40024', '40117', '41329', '41371', '41381', '41387', '41388', '41391', '41420', '41448', '41464', '41988', '41994', '42007', '42031', '42063'] hgrepos = [] issue_num = 19475 keywords = ['patch', 'easy'] message_count = 94.0 messages = ['201917', '201918', '201922', '201924', '201925', '201926', '201929', '201931', '201932', '201934', '201935', '201943', '202220', '202238', '202239', '202242', '202243', '202270', '202274', '202276', '202282', '221958', '221959', '247265', '247401', '247402', '247439', '247723', '247948', '256458', '256470', '256475', '256476', '256477', '256478', '256479', '256480', '256481', '256483', '256486', '256513', '256536', '256552', '256556', '256560', '256561', '256591', '256764', '256813', '256855', '256858', '256866', '256967', '256991', '257196', '257202', '257203', '257284', '257288', '257534', '258289', '258473', '258477', '258479', '258480', '258481', '258482', '258483', '258485', '258493', '260629', '260645', '260695', '260725', '260875', '260876', '260878', '261060', '261066', '261068', '261073', '261074', '261077', '261078', '261080', '261081', '261082', '261083', '261084', '261085', '261086', '261133', '261135', '261269'] nosy_count = 14.0 nosy_names = ['lemburg', 'gvanrossum', 'tim.peters', 'terry.reedy', 'belopolsky', 'vstinner', 'ezio.melotti', 'cvrebert', 'python-dev', 'berker.peksag', 'martin.panter', 'matrixise', 'jerry.elmore', 'acucci'] pr_nums = [] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue19475' versions = ['Python 3.6'] ```

    8977102d-e623-4805-b309-cd0ad35b0d72 commented 8 years ago

    I think the patch is nearly finalised, but I'd appreciate if someone else would carefully go over the new C code. After that, I think, the patch could be committed.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Thanks SilentGhost!

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Can anyone please review the c code of the last patch?

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    I think timespec= option should also be added to the time.isoformat method.

    @belopolsky I've done it in my last patch.

    berkerpeksag commented 8 years ago

    Thanks for the patch, Alessandro. I left some comments about documentation part of the patch, but I can fix them myself if you don't have time.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    what about the comment left by SilentGhost about versionadded?

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Berker, thank you. In the last patch, I removed details about timespec options in Python and C docstrings, corrected the rst quotes, and checked PEP-7 in the c file.

    The only problem now is about versionchanged vs versionadded. I leave it as it was, as Silent and the official doc say, if you want to change it, i'll leave it to you.

    Again, many many thanks to all. This was my first issue here, I learnt a lot!

    vadmium commented 8 years ago

    I think there is a memory leak in the C code. I left some other minor suggestions as well, but it almost looks ready.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Thanks @martin.panter, here is another patch made after your comments.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    up

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    I there anything else I can do for this?

    8977102d-e623-4805-b309-cd0ad35b0d72 commented 8 years ago

    I there anything else I can do for this? I think you've done all you could, it's just someone needs to commit it. I don't think they're being rude on purpose - it's just that there are so many committers in the nosy list that no one is feeling like it's their job.

    gvanrossum commented 8 years ago

    I think Alexander has commit rights, he did most of the review, I trust him to commit it.

    terryjreedy commented 8 years ago

    Not seeing any indication that this has been tested on Windows, I applied, compiled (32 bit), and ran test_datetime on Win10 without error. I did not rebuild the doc.

    abalkin commented 8 years ago

    Is issue19475_v12.patch the final patch? I don't see it addressing Guido's suggestion in msg256470 to add milli- and nanoseconds options.

    8977102d-e623-4805-b309-cd0ad35b0d72 commented 8 years ago

    Yes, version 12 is the final patch. It doesn't add those options. To copy my opinion from the rietveld (https://bugs.python.org/review/19475/#msg14):

    I don't really think nanoseconds belong here. If they don't exist anywhere else in the module, why should they be suddenly introduced here? of all places. It would be fine to use some generic solution that would enable or ease their addition in the future, but they shall not be added at this time.

    (there are altogether 27 messages there, which everyone's naturally CCed on).

    abalkin commented 8 years ago

    I left some comments on Rietveld.

    abalkin commented 8 years ago

    I don't really think nanoseconds belong here.

    What about milliseconds? I'll leave it for Guido to make a call on nanoseconds. My vote is +0.5.

    If they don't exist anywhere else in the module, why should they be suddenly introduced here?

    The timespec feature is modeled after GNU date --iso-8601[=timespec] option which does support nanoseconds. It is fairly common to support nanoseconds these days and it does not cost much to implement.

    8977102d-e623-4805-b309-cd0ad35b0d72 commented 8 years ago

    What about milliseconds? I'll leave it for Guido to make a call on nanoseconds. My vote is +0.5. The only reason I didn't mention milliseconds because they exist in timedelta instantiation. And really, being the only place in the whole module they're as confusing there as would be nanoseconds.

    > If they don't > exist anywhere else in the module, why should they be suddenly > introduced here?

    The timespec feature is modeled after GNU date --iso-8601[=timespec] option which does support nanoseconds. It is fairly common to support nanoseconds these days and it does not cost much to implement.

    Yes, but the module does not support nanoseconds. And putting any such options would require a huge banner saying that the nanosecond option will just always result in three zeros at the end. My suggestion is not to pretend that we suddenly "support" nanoseconds, but rather to follow the actual implementation of the module and add the support for nanoseconds timespec when the module actually adds support for them.

    gvanrossum commented 8 years ago

    You can leave out the nanoseconds but please do add the milliseconds. I'm sure they would find more use than the option to show only the hours.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    New patch

    vadmium commented 8 years ago

    Left some review suggestions

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    New patch after @martin.panter comments on Rietveld. I left only this:

    vadmium 2016/02/21 23:30:20 I think this should explain that fractions are truncated to zero, never rounded up. At least for fractions of milliseconds, although this could apply to the other options as well.

    I think is quite obvious that a datetime.now() can't be rounded to the future if microseconds are 999500.

    vadmium commented 8 years ago

    About rounding: I’m not too sure what people would expect. Obviously it is much easier to implement truncating to zero. But it is different to many other rounding cases in Python; that is why I thought to make it explicit.

    >>> datetime.fromtimestamp(59.9999999).isoformat(timespec="microseconds")
    '1970-01-01T00:01:00.000000'
    >>> datetime.fromtimestamp(59.999999).isoformat(timespec="milliseconds")
    '1970-01-01T00:00:59.999'
    >>> format(59.999999, ".3f")
    '60.000'
    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Oh, now I see your point.

    I've uploaded a new patch with a note for that.

    gvanrossum commented 8 years ago

    Out of context here, but regarding round vs. truncate, IIUC for time truncating down is the norm. My digital clock shows "12:00" for the duration of the minute starting at noon. People look for clocks to flip to know when it is exactly a given time (if the clock is accurate enough).

    abalkin commented 8 years ago

    We discussed truncation vs. rounding some time ago. See msg202270 and the posts around it. The consensus was the same as Guido's current advise: do the truncation.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    @belopolsky could you please review one of the latest two patches submitted? I think I've done all required. Now I'll wait from you if I have to do more.

    abalkin commented 8 years ago

    Guido,

    Did you consider MAL's msg202274? I am still in favor of truncation, but would like to make sure we are not missing something that MAL knows from experience.

    abalkin commented 8 years ago

    Another argument for truncation is that this is what GNU date does:

    $ date --iso-8601=seconds --date="2016-03-01 15:00:00.999"
    2016-03-01T15:00:00-0500
    gvanrossum commented 8 years ago

    Given that we're talking about what to do when we're suppressing the usecs I don't think roundtripping matters. :-)

    vstinner commented 8 years ago

    Given that we're talking about what to do when we're suppressing the usecs I don't think roundtripping matters. :-)

    I changed many times how Python rounds nanoseconds in the private PyTime API, and I got a bug report because of that! => issue bpo-23517.

    By the way, I wrote an article to explain the history the private PyTime API, especially changes on rounding ;-) https://haypo.github.io/pytime.html

    gvanrossum commented 8 years ago

    But what should we do in your opinion?

    abalkin commented 8 years ago

    I hope my prediction "I am afraid that the rounding issues may kill this proposal" (see msg202276) will not come true.

    I think the correct way to view "timespec" is a way to suppress/enforce printing of trailing digits.

    Users that choose printing less than full usec format should make sure that their datetime instances are properly rounded before printing.

    Unfortunately, I does not look like the datetime module makes rounding easy. The best I can think of is something like

    def round_datetime(dt, delta):
        dt0 = datetime.combine(dt.date(), time(0))
        return dt0 + round((dt - dt0) / delta) * delta

    Maybe a datetime.round() method along these lines will be a worthwhile addition?

    vstinner commented 8 years ago

    But what should we do in your opinion?

    Use ROUND_FLOOR rounding method.

    time.time(), datetime.datetime.now(), etc. round the current time using the ROUND_FLOOR rounding method.

    Only datetime.datetime.fromtimestamp() uses ROUND_HALF_EVEN, but it's more an exception than the rule: this function uses a float as input. To be consistent, we must use the same rounding method than other Python functions taking float as parameter, like round(), so use ROUND_HALF_EVEN.

    So I suggest to also use ROUND_FLOOR for .isoformat().

    Hopefully, we don't have to discuss about the exact rounding method for negative numbers, since the minimum datetime object is datetime.datetime(1, 1, 1) which is "positive" ;-)

    You have a similar rounding question for file timestamps. Depending on the file system, you may have a resolution of 2 seconds (FAT), 1 second (ext3) or 1 nanosecond (ext4). But Linux syscalls accept subsecond resolution. The Linux kernel uses ROUND_FLOOR rounding method if I recall correctly. I guess that it's a requirement for makefiles. If you already experimented a system clock slew, you may understand me :-)

    For full seconds, truncation will add an error of +/- 1 second, whereas rounding only adds +/- 0.5 seconds. This is what convinced me to use rounding instead of truncation.

    What is truncation? Is it the ROUND_FLOOR (towards -inf) rounding method? Like math.floor(float).

    Python int(float) uses ROUND_DOWN (towards zero) which is different than ROUND_FLOOR, but only different for negative numbers. int(-0.9) returns 0, whereas math.floor(-0.9) returns -1.

    I guess that "rounding" means ROUND_HALF_EVEN here? The funny "Round to nearest with ties going to nearest even integer" rounding method. Like round(float).

    gvanrossum commented 8 years ago

    Except for the case where you're closer than half a usec from the next value, IMO rounding makes no sense when suppressing digits. I most definitely would never want 9:59:59 to be rounded to 10:00 when suppressing seconds. If you really think there are use cases for that you could add a 'round=True' flag (as long as it defaults to False). That seems better than supporting rounding on datetime objects themselves. But I think you're just speculating.

    gvanrossum commented 8 years ago

    IIUC truncation traditionally means "towards zero" -- that's why we have separate "floor" and "ceiling" operations meaning "towards [negative] infinity". Fortunately we shouldn't have to deal with negative values here so floor and truncate mean the same thing. Agreed that isoformat() should also truncate.

    vstinner commented 8 years ago

    Maybe a datetime.round() method along these lines will be a worthwhile addition?

    Sorry, what is the use case of this method?

    abalkin commented 8 years ago

    Personally, I don't rounding is that useful. My working assumption is that users will select say timespec='millisecond' only when they know that their time source produces datetime instances with millisecond precision and they don't want to kill more trees by printing redundant 0's.

    MAL's objection this this line of arguments was that some time sources have odd resolution (he reported MS SQL's use of 333 ms) and a user may want to have a perfect round-tripping when using a sub-usec timespec and such an odd time source or destination.

    vstinner commented 8 years ago

    Personally, I don't rounding is that useful.

    Nice, it looks like I agree with you on using ROUNDING_FLOOR :-)

    I don't think that we should be prepared for theorical user requests, but rather focus on the concrete and well defined current existing user request: "Add timespec optional flag to datetime isoformat() to choose the precision".

    Let's wait until users request a datetime.round() method to understand better concrete issues.

    abalkin commented 8 years ago

    I feel odd trying to advocate a POV that I disagree with, so let me just quote MAL:

    """ In practice you often don't know the resolution of the timing source. Nowadays, the reverse of what you said is usually true: the source resolution is higher than the precision you use to print it. ..

    For full seconds, truncation will add an error of +/- 1 second, whereas rounding only adds +/- 0.5 seconds. This is what convinced me to use rounding instead of truncation. """

    I somehow missed this argument when Marc-Andre made it, so I want to make sure that it is properly considered before we finalize this issue.

    c7164467-54f8-4d7b-90dd-836b0822aef0 commented 8 years ago

    Meanwhile I made corrections after @belopolsky latest review

    abalkin commented 8 years ago

    Alessandro, thank you very much for your work and perseverance. I will do my best to commit this next weekend.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 8 years ago

    New changeset eb120f50df4a by Alexander Belopolsky in branch 'default': Closes bpo-19475: Added timespec to the datetime.isoformat() method. https://hg.python.org/cpython/rev/eb120f50df4a