python / cpython

The Python programming language
https://www.python.org
Other
62.71k stars 30.06k forks source link

Isolate Stdlib Extension Modules #103092

Open ericsnowcurrently opened 1 year ago

ericsnowcurrently commented 1 year ago

See PEP 687.

Currently most stdlib extension have been ported to multi-phase init. There are still a number of them to be ported, almost entirely non-builtin modules. Also, some that have already been ported still have global state that needs to be fixed.

(This is part of the effort to finish isolating multiple interpreters from each other. See gh-100227.)

High-Level Info

How to isolate modules: https://docs.python.org/3/howto/isolating-extensions.html (AKA PEP 630).

The full list of modules that need porting can be found with: ...

The full list of remaining (unsupported) global variables is:

A full analysis of the modules may be found at the bottom of this post.

(other info) #### Previous Work * https://github.com/python/cpython/issues/84258 * _xxsubinterpreters [[issue](https://github.com/python/cpython/issues/99741)] [[PR](https://github.com/python/cpython/pull/99742)] * _testinternalcapi [[issue](https://github.com/python/cpython/issues/100997)] [[PR](https://github.com/python/cpython/pull/100998)] * ... #### Related Links * gh-85283 * https://discuss.python.org/t/a-new-c-api-for-extensions-that-need-runtime-global-locks/20668 * https://discuss.python.org/t/how-to-share-module-state-among-multiple-instances-of-an-extension-module/20663

TODO

Here is the list of modules that need attention, in a rough, best-effort priority order. Additional details (e.g. if there is an issue and/or PR) is found in the analysis table at the bottom.

The above does not include test modules. They don't need to be ported/isolated (except for a few which already have been).


Modules Analysis

module builtin Windows PEP 594 issue PR ported # static types # other global objects # other globals
_asyncio yes 2
_collections X (???) (branch) yes 7
_ctypes **NO** 37 6 4
_curses **NO** 1 2 4
_curses_panel yes 1
_datetime gh-71587 gh-102995 **NO** 7 10 1
_decimal **NO** 4 10 6
_elementtree yes 1
_io X gh-101819 gh-101520 **NO** 5
_lsprof yes 2
_multibytecodec yes 23
_pickle (???) (yes) **NO** 5
_socket **NO** 1 2 3
_ssl (???) (branch) yes 1
_tkinter **NO** 8 9
_tracemalloc X gh-101520 **NO** 6 7
array yes 1
faulthandler X gh-101509 yes 3+ ~22+
msvcrt Y **NO** ??? ??? ???
pyexpat yes 1
readline **NO** 9
winreg Y **NO** ??? ??? ???
winsound Y **NO** ??? ??? ???
test/example modules These can be ported/isolated but don't have to be. They are the lowest priority. | module | issue | PR | ported | # static types | # other global objects | # other globals | | --- | --- | --- | :---: | ---: | ---: | ---: | | xxmodule | | | | [3](https://github.com/python/cpython/blob/main/Tools/c-analyzer/cpython/globals-to-fix.tsv#L401-L403) | [1](https://github.com/python/cpython/blob/main/Tools/c-analyzer/cpython/globals-to-fix.tsv#L427) | | | xxsubtype | | | | [2](https://github.com/python/cpython/blob/main/Tools/c-analyzer/cpython/globals-to-fix.tsv#L404-L405) | | | | xxlimited_35 | | | | | [2](https://github.com/python/cpython/blob/main/Tools/c-analyzer/cpython/globals-to-fix.tsv#L416-L426) | | | ... | | | | | | | | ... | | | | | | |

Linked PRs

ericsnowcurrently commented 1 year ago

Ideally we'll get most of this done for 3.12. FWIW, there isn't enough time if it's just me, so consider this a call out to whoever can help. 😄 Feel free to adjust the TODO list and (hidden) "analysis" table above. Also feel free to reach out to any others that might be interested in pitching in here. Thanks!!!

@erlend-aasland, @corona10, @kumaraditya303, etc.

ericsnowcurrently commented 1 year ago

@terryjreedy (for _tkinter)

erlend-aasland commented 1 year ago

I've got WIP branches for datetime, io, pickle, collections, and ssl. The three former are up as draft PRs.

(FYI, I have quite a bit of CPython time scheduled for next week.)

ericsnowcurrently commented 1 year ago

@markshannon

erlend-aasland commented 1 year ago

Regarding the freelists in _asyncio, Kumar and Guido had some thoughts over at https://github.com/python/cpython/issues/91375#issuecomment-1401757177.

erlend-aasland commented 1 year ago

Elementtree was isolated/ported in gh-92123.

erlend-aasland commented 1 year ago

~_lsprof is also done, AFAICS.~ ... just noticed rotatingtree.c; sorry 'bout the noise.

ericsnowcurrently commented 1 year ago

Thanks for the updates!

terryjreedy commented 1 year ago

@serhiy-storchaka for _tkinter. I have nothing to do with it.

erlend-aasland commented 1 year ago

AFAICS, array should be fine; array_reconstructor is in the module state already.

corona10 commented 1 year ago

I will try to find which modules that I can support for this issue :)

erlend-aasland commented 1 year ago

I will try to find which modules that I can support for this issue :)

Feel free to pick up any of my draft PRs! :)

CharlieZhao95 commented 1 year ago

I tried submitting some PRs(Port _datatime module, Convert _pickle module to use heap types) for multi-phase initialization last year (delayed due to PEP 687 not being accepted at that time). Maybe now that I can start something new! Can I try to work for _decimal module, if no one has claimed it yet.

shihai1991 commented 1 year ago

I tried submitting some PRs(Port _datatime module, Convert _pickle module to use heap types) for multi-phase initialization last year (delayed due to PEP 687 not being accepted at that time).

Looks like you can reopen your PRs :)

erlend-aasland commented 1 year ago

Maybe now that I can start something new! Can I try to work for _decimal module, if no one has claimed it yet.

Sure you can! I started doing some work on _decimal some weeks ago, but it is very far from complete; feel free to start with that branch :) https://github.com/erlend-aasland/cpython/tree/isolate-decimal

kumaraditya303 commented 1 year ago

I volunteer to review the PRs as I have been doing, and leave _asyncio for me to handle.

ericsnowcurrently commented 1 year ago

FYI, I asked @Yhg1s (as 3.12 release manager) for his disposition on what changes are acceptable after beta 1, since there's certainly a chance we won't be able to finish with the above list in time. He gave me the following guidance related to the work here on extension modules:

Feel free to clarify/correct my summary, Thomas. 😄

With the above in mind, I'd like to be sure we have the proper priority for the modules that are left to port/isolate. To that end, here are some questions:

Again, ideally we will be able to wrap up this work in time for beta 1 (May 8).

erlend-aasland commented 1 year ago

For readline, I'm wondering if we should use a similar strategy as with signal and faulthandler. Thoughts?

aisk commented 1 year ago

I had took a look for _ssl, the only global state is a lock to prevent concurrency writes on log files, and according to its document: https://github.com/python/cpython/blob/8317d51996e68d8bb205385c1d47a9edcd128e7b/Modules/_ssl/debughelpers.c#L131-L136, the lock should be per process. Make it per interpreter will leads to concurrency issue.

Should we just leave it as it is?

ericsnowcurrently commented 1 year ago

We're probably okay leaving _ssl as it is then, for now. The risk is in a race on creation of the lock, which is quite unlikely but still possible.

FWIW, this is related to https://discuss.python.org/t/a-new-c-api-for-extensions-that-need-runtime-global-locks/20668 (and the more general https://discuss.python.org/t/how-to-share-module-state-among-multiple-instances-of-an-extension-module/20663).

arhadthedev commented 1 year ago

Looking at https://github.com/python/cpython/blob/411b1692811b2ecac59cb0df0f920861c7cf179a/Modules/faulthandler.c#L1185-L1242 I've got a question: should I port faulthandler to Argument Clinic first?

ericsnowcurrently commented 1 year ago

IMHO, it isn't critical. The bigger priority is interpreter isolation for 3.12. That said, do you think porting it to Argument Clinic needs to be done first in order to do the isolation work?

arhadthedev commented 1 year ago

That said, do you think porting it to Argument Clinic needs to be done first in order to do the isolation work?

No, so I agree to move it into 3.13.

erlend-aasland commented 1 year ago

FTR, I'm working on multibytecodec today. PR coming tomorrow.

ericsnowcurrently commented 1 year ago

@nanjekyejoannah

erlend-aasland commented 1 year ago

I started with some low-hanging fruit on ctypes the other day. Kumar just reviewed it, and I merged the first part today. Now, for the next PRs, ctypes gets really interesting. There's some interesting metaclass/dict/baseclass hacks there. I have a draft for part 2 up (gh-103932), trying to use the same approach as we used for the defaultdict type when we isolated the collections module a few weeks ago, but I need some more eyes on it, so if you have time, please chime in :)

ericsnowcurrently commented 1 year ago

@erlend-aasland, @arhadthedev, @corona10, and anyone else porting/isolating extensions...

First of all, thanks for all the great work! It's a big deal. ❤️

Second, what's the current status of the remaining modules. If possible, please update the checklist and table in the original post.

ericsnowcurrently commented 1 year ago

To anyone porting/isolating modules, I've added (in gh-104205) a module def slot to use to indicate support for per-interpreter GIL. You'll add the following entry to the module's PyModuleDef.m_slots:

{Py_mod_multiple_interpreters, Py_MOD_PER_INTERPRETER_GIL_SUPPORTED}

I've added it to all the already isolated modules I'm aware of, and left notes for each of the modules that's been ported but still isn't fully isolated.

erlend-aasland commented 1 year ago

FTR, Kumar, Victor and I have been making great progress over at #101948 (#101819) lately. We'll probably be able to land the last PR for isolating _io today (or at least, in the very near future).

kumaraditya303 commented 1 year ago

_io is ported now!

erlend-aasland commented 1 year ago

What's missing with pyexpat/elementtree?

brandtbucher commented 1 year ago

I suggest we leave readline alone. I've been struggling with trying to port/isolate it for a while now, and honestly, it's just too... global.

For context: readline interacts with the global state of the wrapped GNU readline library, and most of its functionality is exposed via a registered PyOS_ReadlineFunctionPointer callback that has no way of recovering the module's state. Whenever any of the hooks are called, they need to get both C data and Python objects out of thin air.

So it really doesn't make any sense for it to be loaded multiple times in the same interpreter, and it's dangerous to load it in a subinterpreter. Since that's the case, I think it's totally fine (maybe even desirable) to leave it as single-phase init... especially since it's not even present in all builds, so most users need to be prepared for an ImportError anyways.

tkinter seems to be sort of in the same boat. For more context on the problematic PyOS_InputHook and PyOS_ReadlineFunctionPointer APIs, see #104668.

erlend-aasland commented 1 year ago

We could solve them using interpreter state and locking, but I'm not sure it is worth it. Also, I'm not sure there's a real use case for readline or tkinter and subinterpreters.

kumaraditya303 commented 1 year ago

The beta freeze is tomorrow and the important modules are already ported over. The rest can be done in 3.13, we should maybe add a whatsnew entry for this, that so so modules are isolated and are compatible with subinterpreters something something.

ericsnowcurrently commented 1 year ago

For 3.12, we should still try to get the following done:

Note that we do have until beta 2.

FWIW, I'm fine with leaving _tkinter, readline, and _curses/_curses_panel alone, since to me they only make sense in the main thread of the main interpreter.

brandtbucher commented 1 year ago

Note that we do have until beta 2.

Just a heads-up, this is only one week away...

brandtbucher commented 1 year ago

We could solve them using interpreter state and locking, but I'm not sure it is worth it. Also, I'm not sure there's a real use case for readline or tkinter and subinterpreters.

Yeah, I think that if this is worth fixing, it's probably just easier to introduce a better API that third-party extensions can use, rather than using the interpreter state as a dumping ground for readline/tkinter state.

arhadthedev commented 1 year ago

3.13 development launched so Deprecated (lower priority) modules are to be gone in gh-104773. I removed them from the issue list and the Modules Analysis table.

erlend-aasland commented 1 year ago

FTR, _datetime is almost ready. I've got a wip branch for _ctypes, but there's lots left there.

CharlieZhao95 commented 1 year ago

The _decimal module is also nearing completion, it currently passes all test cases. But I believe it's still not final, if anyone is free please help to review it. Thanks to all!

erlend-aasland commented 1 year ago

I recon we want to leave _tkinter as is, for the same reasons as for readline and _curses. cc. @ericsnowcurrently

ericsnowcurrently commented 1 year ago

makes sense

vstinner commented 12 months ago

Statistics on C types in the main branch:

$ grep -E 'static PyTypeObject .* =' $(find -name "*.c"|grep -v Doc/)|wc -l
59
$ grep -E 'PyType_Spec .* =' $(find -name "*.c"|grep -v Doc/)|wc -l
207

Statistics on C extensions in the main branch:

vstinner@mona$ grep 'PyModuleDef_Init *(' $(find -name "*.c"|grep -v Doc/)|wc -l
120
vstinner@mona$ grep 'PyModule_Create *(' $(find -name "*.c"|grep -v Doc/)|wc -l
21
neonene commented 6 months ago

@erlend-aasland I'd like _datetime static types to be heap types to help #113601. Would you mind if I open an issue and take the type conversion part from your POC (without putting them into some states)?

vstinner commented 6 months ago

@erlend-aasland I'd like _datetime static types to be heap types to help https://github.com/python/cpython/pull/113601. Would you mind if I open an issue and take the type conversion part from your POC (without putting them into some states)?

Converting static types to heap types is not the blocker issue for the datetime. The problem of datetime is its C API.

It reminds me the problem of the AST C API. We tried different things but we had multiple crashes. At the end, I moved AST types to the PyInterpreterState, they are initialized at the first access. This way, from any code path, you always meet the same types. Before, depending on how types were created, you might meet different types and so PyAST_Check() returned false which was very surprising.

I can help you if you open an issue. But please, first, move definition of these types to PyInterpreterState (put PyTypeObject* pointers there), in a new "datetime" state even if it's shared library.

neonene commented 6 months ago

Maybe already known, but reloading any module (e.g.-R) seems to be possibly unsafe:

import sys
from _decimal import Decimal as D1
del sys.modules['_decimal']
from _decimal import Decimal as D2

print(a := D1(1), b := D1(1), a == b, a is b)  # load
print(c := D2(1), d := D2(1), c == d, c is d)  # reload
print(a == c)
encukou commented 6 months ago

What do you mean by unsafe? If something's unsafe at the C level (e.g. undefined behaviour, crashes), that's a bug.

This behaviour is known; see https://docs.python.org/3/howto/isolating-extensions.html#surprising-edge-cases and the warning in https://docs.python.org/3/library/sys.html#sys.modules

neonene commented 6 months ago

What do you mean by unsafe?

Thanks. I suspect _datetime's crashes with multi-phase init come from incorrect state accesses from binary (or ternaly) functions. I also think the init check in test_capi should be reconsidered. However, I agree to use PyInterpreterState if we need to check multiple module states. I have not fully investigated, though.

EDIT: The _datetime crash can be seen in the refleak test.

neonene commented 6 months ago

I've opened a _datetime issue: #117398

ericsnowcurrently commented 5 months ago

@Yhg1s, could we get an exception for 3.13 like we got for 3.12, regarding porting stdlib extensions to multi-phase init? You let us land those changes up to beta 2.

This time around there are only two modules left to be done, and the PRs for both are close enough that I expect they could be wrapped up before beta 2 (end of May).