faster-cpython / ideas

1.68k stars 48 forks source link

Deprecate and repeal PEP 509 (Add a private version to dict) aka `ma_version_tag` #461

Closed Fidget-Spinner closed 5 months ago

Fidget-Spinner commented 1 year ago

See https://bugs.python.org/issue44477. PEP 509 is no longer used in CPython, we should repeal it to save space on all dictionaries using ma_version_tag.

I volunteer to write the PEP to repeal PEP 509.

Things the PEP will need to cover:

Some preliminary research tells me that Cython does use it for cached global lookups. We may need to offer a (private, unstable) alternative to looking up globals.

CC @vstinner

gvanrossum commented 1 year ago

Presumably Cython could use the same caching mechanism we now use in the code for LOAD_GLOBAL_MODULE? IIUC it looks at dict->ma_keys->dk_version instead of ma_version_tag.

CC @markshannon

Fidget-Spinner commented 1 year ago

Indeed. However, that would change the semantics of their code (their borrowed reference method would be invalid and they should follow what we do instead, which is to store an index into the dict's values array and access whatever's there).

da-woods commented 1 year ago

I'm not exactly sure at first look what Cython is getting from it (probably just a small optimization) but you can turn off Cython's use of this by defining CYTHON_USE_DICT_VERSIONS=0. So it's a very easy fix if you decide to remove it.

(If there's an alternative optimization mechanism to do the same thing we'd be happy to hear about it of course)

da-woods commented 1 year ago

And if CPython is no longer using it then we should probably turn it off for those versions anyway, since the "optimization" is probably useless (even if it isn't yet removed).

Edit: looking in a bit more detail, it probably still works as an optimization for now. But it's easily removable, and doesn't even need existing .c files to be regenerated.

vstinner commented 1 year ago

My first motivation for PEP 509 was my https://faster-cpython.readthedocs.io/fat_python.html project which I abandonned since that time. The second motivation which convinced me to keep the PEP was the LOAD_GLOBAL optimization in Python 3.8 (it used the dict version).

If LOAD_GLOBAL is still optimized in Python 3.12 without using the dict version, I'm fine with getting ride of the dict version. The current PEP 659 design is more about dict key version, no?

gvanrossum commented 1 year ago

LOAD_GLOBAL in 3.11 and (so far 3.12) uses the dict keys version; see LOAD_GLOBAL_MODULE and LOAD_GLOBAL_BUILTIN in ceval.c. IIUC this deoptimizes less frequently -- checking the keys version only deopts when a new global was added, whereas checking the dict version would deopt whenever any global was given a new value. E.g. this:

import sys
def f():
    global x
    for x in range(1000000):
        a = sys

ends up deoptimizing the old LOAD_GLOBAL sys instruction on each iteration because the global x was changed. The 3.11 version stays optimized because globals() has a fixed set of keys (sys, f and x).

markshannon commented 1 year ago

A bit more background. A dict object takes 8 machine words and we allocate objects as multiples of 2 words, so removing ma_version_tag doesn't save any memory. Most objects do not have a dictionary, so do not need to update the version number.

However, freeing up the space means that we can add dictionary watchers without needing more space. Dictionary watchers will be important for higher tier optimizers. So the immediate benefit is small, but in the longer term removing ma_version_tag is valuable.

Fidget-Spinner commented 1 year ago

@da-woods and @carljm

Could I have both of your blessings for PEP 699 please? Specifically for Cython to support the backwards incompatible change, and for Cinder to support the utility provided by reusing the space.

Also @vstinner may I obtain your blessing for PEP 699 please? As the original author of PEP 509 your opinion is important.

Thank you! https://discuss.python.org/t/pep-699-remove-private-dict-version-field-added-in-pep-509/19724

carljm commented 1 year ago

@Fidget-Spinner commented on the discourse thread.