ericsnowcurrently commented 4 months ago

Bug report

Bug description:

When an extension module is imported the first time, we load the shared-object file and get the module's init function from it. We then run that init function and use the returned object to decide if the module is single-phase init or multi-phase init.

For isolated subinterpreters, where PyInterpreterConfig.check_multi_interp_extensions (AKA Py_RTFLAGS_MULTI_INTERP_EXTENSIONS) is True, we immediately fail single-phase init modules. The problem is that at that point the init function has already run, which all sorts of potential side effects and process-global state (including registered callbacks) that we mostly can't clean up.

This has come up before, for example with the readline module. It's potentially a bigger problem than I thought at first, so I'd like to get it sorted out for 3.13.

FWIW, the simplest solution I can think of is to always call the module init func from the main interpreter (without necessarily doing all the other import steps there). That would look something like this:

start a normal import in an isolated subinterpreter
get the init function
switch to the main interpreter
call the init function
switch back
fail if it is single-phase init (and remember that fact)

For the main interpreter and non-isolated subinterpreters, nothing different would happen from now; there would be no switching. Also, if the first attempt was in an isolated interpreter (which would fail), a subsequent import of that module in the main interpreter (or a non-isolated one) would succeed.

The only tricky part is, when the init function raises an exception, how to an propagate it from the main interpreter to the subinterpreter. For multi-phase init (if known) we would just call the init func again after switching back. For single-phase init (or unknown) we'd have preserve the exception somehow. This is something I had to deal with for _interpreters.exec(), but I'm not sure the same thing will work here.

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs

gh-117487
gh-118116
gh-118156
gh-118157
gh-118167
gh-118192
gh-118193
gh-118194
gh-118203
gh-118204
gh-118205
gh-118206
gh-118250
gh-118532
gh-118684
gh-120689
gh-120707

encukou commented 3 months ago

Continuing here from the closed PR. @ericsnowcurrently wrote:

Thanks for bringing [m_slots=NULL] up. What you've said makes sense and appreciate the clarity of it. There are indeed some gaps (fairly small, I expect), both functionally and in tests. It isn't clear what the impact is in practice, but I definitely think they should be addressed regardless. I'll be sure to open some issues in the next week or two.

Thanks!

AFAICS, the current state is that is_singlephase() function might return true on some multi-phase modules. And it's currently only used in asserts, where assert(is_singlephase(...)) is fine, but assert(!is_singlephase(...)) runs into the false positive. The current use of the latter is in create_builtin, which is AFAIK only used modules that don't have m_slots=NULL. Another use in #118203 broke a test.

IMO, current PRs should simply avoid assert(!is_singlephase); in another PR the function should become assert_is_singlephase(), to limit future uses to the case that works.

FWIW, the general thought of kinds of extension modules was already on my mind (and a bit clearer in my original mega-PR, https://github.com/python/cpython/pull/118116). For now I have a later PR (https://github.com/python/cpython/pull/118205) that is a bit more deliberate about keeping track of what the init function returns. While that PR doesn't do so currently, I might look at explicitly tracking the kind on the module def. That would help address the point you've brought up.

Yes, looks like this tracking will be needed in order to use this for more than asserts. I guess “tracking the kind on the module def” is a typo and you meant “he kind of the module def”? (PyModuleDef is part of the stable ABI, and it doesn't have space for more data.)

ericsnowcurrently commented 3 months ago

(PyModuleDef is part of the stable ABI, and it doesn't have space for more data.)

Right. Tracking it *on the module def would require hiding the bit(s) in one of the existing fields. That's doable, but I'd do that only if there wasn't another good place to stick the info.

ericsnowcurrently commented 3 months ago

The core change has landed, but there are a few small follow-up things to wrap up.

ericsnowcurrently commented 2 months ago

FYI, in gh-118157 I disabled test_interpreters under Py_GIL_DISABLED. I did so because of failures on free-threading builds in CI:

(expand for an example)

``` 0:08:30 load avg: 5.18 Re-running 1 failed tests in verbose mode in subprocesses 0:08:30 load avg: 5.18 Run 1 test in parallel using 1 worker process (timeout: 10 min, worker timeout: 15 min) 0:08:30 load avg: 5.18 [1/1/1] test_interpreters failed (1 failure) Re-running test_interpreters in verbose mode (matching: test_display_preserved_exception) test_display_preserved_exception (test.test_interpreters.test_api.TestInterpreterExec.test_display_preserved_exception) ... FAIL ====================================================================== FAIL: test_display_preserved_exception (test.test_interpreters.test_api.TestInterpreterExec.test_display_preserved_exception) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/support/__init__.py", line 2603, in wrapper return func(*args, **kwargs) File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/test_interpreters/test_api.py", line 764, in test_display_preserved_exception self.assertEqual(stderr, dedent(f"""\ ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^ Traceback (most recent call last): ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...<18 lines>... RuntimeError: uh-oh! ^^^^^^^^^^^^^^^^^^^^ """)) ^^^^^ AssertionError: 'Trac[778 chars]oh!\nException ignored in: interp.exec(script) ~~~~~~~~~~~^^^^^^^^ File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/support/interpreters/__init__.py", line 227, in exec raise ExecutionFailed(excinfo) test.support.interpreters.ExecutionFailed: RuntimeError: uh-oh! Uncaught in the interpreter: Traceback (most recent call last): File "/tmp/test_python_thv394xy/tmpsdujy1zv/script.py", line 6, in script spam.eggs() ~~~~~~~~~^^ File "/tmp/test_python_thv394xy/tmpsdujy1zv/spam.py", line 6, in eggs ham() ~~~^^ File "/tmp/test_python_thv394xy/tmpsdujy1zv/spam.py", line 3, in ham raise RuntimeError('uh-oh!') RuntimeError: uh-oh! - Exception ignored in: - Traceback (most recent call last): - File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/support/interpreters/__init__.py", line 157, in __del__ - File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/support/interpreters/__init__.py", line 173, in _decref - TypeError: catching classes that do not inherit from BaseException is not allowed - Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters - Python runtime state: finalizing (tstate=0x0000562d95495a10) - ---------------------------------------------------------------------- Ran 1 test in 0.285s FAILED (failures=1) test test_interpreters failed 1 test failed again: test_interpreters == Tests result: FAILURE then FAILURE == ```

The tests should be re-enabled and made to work before this issue be considered resolved.

python / cpython

Single-Phase Init Extension Module Init Functions Still Run in Isolated Interpreters #117953

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

Linked PRs