python / cpython

The Python programming language
https://www.python.org/
Other
61.4k stars 29.58k forks source link

PEP 615: Add zoneinfo module #84683

Closed pganssle closed 5 months ago

pganssle commented 4 years ago
BPO 40503
Nosy @malemburg, @Yhg1s, @abalkin, @serhiy-storchaka, @zooba, @pganssle, @isuruf, @FFY00
PRs
  • python/cpython#19909
  • python/cpython#20006
  • python/cpython#20034
  • python/cpython#28495
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/pganssle' closed_at = None created_at = labels = ['type-feature', 'library', '3.9'] title = 'PEP 615: Add zoneinfo module' updated_at = user = 'https://github.com/pganssle' ``` bugs.python.org fields: ```python activity = actor = 'steve.dower' assignee = 'p-ganssle' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'p-ganssle' dependencies = [] files = [] hgrepos = [] issue_num = 40503 keywords = ['patch'] message_count = 17.0 messages = ['368076', '368461', '368619', '368634', '368649', '368656', '368658', '368800', '369021', '369059', '369206', '401992', '402247', '402330', '402350', '402352', '402357'] nosy_count = 8.0 nosy_names = ['lemburg', 'twouters', 'belopolsky', 'serhiy.storchaka', 'steve.dower', 'p-ganssle', 'isuruf', 'FFY00'] pr_nums = ['19909', '20006', '20034', '28495'] priority = 'high' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue40503' versions = ['Python 3.9'] ```

    pganssle commented 4 years ago

    This is an issue to track the implementation of PEP-615: https://www.python.org/dev/peps/pep-0615/

    It should mostly involve migrating from the reference implementation: https://github.com/pganssle/zoneinfo/

    pganssle commented 4 years ago

    I've separated this into two separate PRs, one for docs and one for tests/implementation.

    I have not yet implemented the logic for the ability to configure the TZPATH at compile time because I'm not quite sure how to start on that. How are other compile-time options implemented? Do I need to do something special to handle both Windows and POSIX systems? Is there already a configuration file somewhere that I should use for this? Adding Thomas to the nosy list because he's the only one listed for "autoconf/makefiles" – don't know if that extends to the Windows build as well.

    Yhg1s commented 4 years ago

    The normal way to do this (for make/autoconf) is to add a --with-tzpath argument to configure.ac, and a make variable to pass it to the compilation of anything that needs it. You can then access it from Python code with sysconfig.get_config_var().

    In configure.ac, AC_SUBST(TZPATH) makes configure replace @TZPATH@ in the Makefile with the value you set to $TZPATH in configure.ac. You then either add that to the global PY_CFLAGS_NODIST, or modify the build rule for the module that needs it to pass it along. (See for example how GITTAG/GITVERSION/GITBRANCH are passed to Modules/getbuildinfo.o.)

    AC_ARG_WITH() is how you add a new --with-* argument to configure. The usual way people do this is by copying one of the other AC_ARG_WITH blocks and modifying it to suit their needs. It's a mixture of m4 and shell that can be a bit annoying to get right, but it's pretty flexible. Run autoreconf to regenerate configure. You can manually check that the shell in configure makes sense.

    Something will have to be done on the Windows side as well, but I'm not sure what. Adding Steve Dower for that.

    pganssle commented 4 years ago

    Thanks Thomas, that was super helpful. I've created python/cpython#64233 to add in the compile-time arguments on POSIX systems at least, do you mind having a look?

    For the moment I have made it non-configurable on Windows, but I think the right thing to do is to add an argument to PCbuild\build.bat. Hopefully Steve can point me in the right direction for mapping that argument (or something else) to a new config variable in sysconfig.

    serhiy-storchaka commented 4 years ago

    Do we need the C implementation if there is the Python implementation?

    pganssle commented 4 years ago

    I mean, theoretically we don't "need" it, but it's much, much faster, and without it nearly every operation that needs time zone offsets will be slower than pytz (which has a mechanism for caching).

    Also, I've already written it, so I see no reason why not use it.

    pganssle commented 4 years ago

    Here are some benchmarks run using the latest implementation. The pure Python code is pretty optimized, but the C code is still \~4-5x faster.

    Running from_utc in zone Europe/Paris c_zoneinfo: mean: 494.82 ns ± 3.80 ns; min: 489.23 ns (k=5, N=500000) py_zoneinfo: mean: 2.48 µs ± 79.17 ns; min: 2.42 µs (k=5, N=100000) dateutil: mean: 10.41 µs ± 209.97 ns; min: 10.17 µs (k=5, N=50000) pytz: mean: 4.69 µs ± 252.70 ns; min: 4.39 µs (k=5, N=50000)

    Running to_utc in zone Europe/Paris c_zoneinfo: mean: 539.61 ns ± 25.68 ns; min: 514.39 ns (k=5, N=500000) py_zoneinfo: mean: 2.01 µs ± 61.69 ns; min: 1.94 µs (k=5, N=100000) dateutil: mean: 7.88 µs ± 506.89 ns; min: 7.25 µs (k=5, N=50000) pytz: mean: 773.02 ns ± 14.11 ns; min: 759.56 ns (k=5, N=500000)

    Running utcoffset in zone Europe/Paris c_zoneinfo: mean: 329.34 ns ± 36.31 ns; min: 302.88 ns (k=5, N=1000000) py_zoneinfo: mean: 1.57 µs ± 9.58 ns; min: 1.55 µs (k=5, N=200000) dateutil: mean: 6.28 µs ± 86.61 ns; min: 6.16 µs (k=5, N=50000) pytz: mean: 461.47 ns ± 2.07 ns; min: 458.91 ns (k=5, N=500000)

    utcoffset() is very likely to be called possibly many times in certain hot loops (including implicitly as it's part of hash and equality checks).

    pganssle commented 4 years ago

    Talked to Steve Dower in a sidebar about the issue with compile-time configuration, he is convinced that compile-time configuration is not something that would be useful or worth doing on Windows. I am indifferent on the matter, so I am fine with calling the compile-time configuration part of this done at this point. If anyone building Python for Windows shows up needing support for this, we can re-visit the issue — I don't believe it's technically infeasible, just that the usage patterns of Python on Windows mean that it's not worth doing.

    So, once we've got a critical mass of reviews done for zoneinfo, I think this feature is done. 🎉

    vstinner commented 4 years ago

    New changeset 62972d9d73e83d6eea157617cc69500ffec9e3f0 by Paul Ganssle in branch 'master': bpo-40503: PEP-615: Tests and implementation for zoneinfo (GH-19909) https://github.com/python/cpython/commit/62972d9d73e83d6eea157617cc69500ffec9e3f0

    pganssle commented 4 years ago

    New changeset b17e49e0def23238b9e7f48c8a02e2d7bbf1f653 by Paul Ganssle in branch 'master': bpo-40503: Add documentation and what's new entry for zoneinfo (GH-20006) https://github.com/python/cpython/commit/b17e49e0def23238b9e7f48c8a02e2d7bbf1f653

    vstinner commented 4 years ago

    I suggest to open a separated issue to discuss how tzdata can be installed on Travis CI, Azure Pipelines, Buildbots, etc. when running tests.

    1b7e0465-9f2c-4663-8ddc-8b935379ec44 commented 2 years ago

    If anyone building Python for Windows shows up needing support for this, we can re-visit the issue — I don't believe it's technically infeasible, just that the usage patterns of Python on Windows mean that it's not worth doing.

    At conda-forge, we need this and our current solution is https://github.com/conda-forge/python-feedstock/blob/8195ba1178041b7461238e8c5680eee62f5ea9d0/recipe/patches/0032-Fix-TZPATH-on-windows.patch#L19

    I can change to check if that directory exists. What do you think?

    zooba commented 2 years ago

    That looks like a relocatable path anyway, which means you wouldn't want to compile it into the binary. Your patch to select a default value at runtime is probably better.

    The compile-time option is meant for a system directory.

    1b7e0465-9f2c-4663-8ddc-8b935379ec44 commented 2 years ago

    Thanks @steve.dower for the info. I've created https://github.com/python/cpython/pull/28495. Let me know if it needs to have a separate bpo issue.

    zooba commented 2 years ago

    I meant I don't think we want that path upstream, and you should keep it as your own patch.

    We don't have a "share" directory as part of CPython, but this would codify it. That would then conflict with your use of it in conda-forge.

    It also makes sysconfig non-deterministic, which is going to be problematic against some other efforts around being able to treat it as static data to make native module compilation easier, as well as more generally something that often leads to security vulnerabilities and is best avoided.

    So on balance, I think we will reject that patch.

    1b7e0465-9f2c-4663-8ddc-8b935379ec44 commented 2 years ago

    Do you have a suggestion for how to make it configurable at compile time? In POSIX platforms, we can set --with-tzpath to make it configurable at compile time.

    zooba commented 2 years ago

    We'd need to implement some form of data store for values on Windows, which currently are not compiled in anywhere, and it would need a substitution syntax to be updated at runtime.

    We're talking a big feature now, and you'd end up having just as much code on your side to make it work.

    The only justification that'd make sense on our side is if we were going to bundle the tzdata file with the Windows installer, which doesn't affect you at all, apart from us then having an existing path listed that you'd want to override still. And even then, we'd probably preinstall the tzdata package which already overrides the path, and so there'd be no changes necessary to sysconfig and so we wouldn't touch it anyway.

    serhiy-storchaka commented 5 months ago

    Should not this issue be closed?

    pganssle commented 5 months ago

    Yeah I think so.