[ENH] Subinterpreter Support and 3.14 / InterpreterPoolExecutors

paultiq commented 4 days ago

Is your feature request related to a problem? Please describe.

InterpreterPoolExecutor is to be introduced in 3.14 (https://github.com/python/cpython/pull/124548) and is available for 3.13 through a backport. Cython modules do not support subinterpreters, and thus do not work with an InterpreterPoolExecutor.

Importing in a subinterpreter yields: "ImportError: module Cython.Utils does not support loading in subinterpreters".

It looks like a lot of work was already done in the past to prepare for subinterpreters, by implementing CYTHON_USE_MODULE_STATE. Is this work ongoing? Is there a roadmap to finalize support for subinterpreters?

This impacts downstream packages, such as https://github.com/apache/arrow/issues/42151#issuecomment-2189528499

In my code, I would like to use Cython functions inside an InterpreterPoolExecutor.

The following code will raise: "ImportError: module Cython.Utils does not support loading in subinterpreters"

from interpreters_backport.concurrent.futures.interpreter import InterpreterPoolExecutor

with InterpreterPoolExecutor() as executor:
    r = executor.submit(exec, "import Cython.Utils")
    r.result()

Note: this example used 3.13 with the backported version: https://pypi.org/project/interpreters-pep-734/

da-woods commented 4 days ago

There was a short discussion on the users' group last week: https://groups.google.com/g/cython-users/c/O0nYBTTwb_Y. Only a very small amount has changed since then.

In summary, you can run Python in a way where it ignores the modules' declared compatibility. If you build your Cython module with the module state flag enabled then it's marginally possible that it might work. I don't think we've currently tested it or even put any thought into how to test it.

I expect there to be some pretty serious limitations even when finished. For example, `with gil:` is something that I expect can never be made to work.

paultiq commented 3 days ago

Thank you for the prompt reply and awesome work.

I get that user-defined global variables will introduce some indeterminable issues and footguns. That seems inevitable, but also in our control.

My question, iiuc, is really about CYTHON_USE_MODULE_STATE and two-phase init. You said it's a "long term project"... does that suggest it's a long way from being even testable?

(and, if this is the wrong forum, I'm happy to just reply in the user group thread)

da-woods commented 3 days ago

> CYTHON_USE_MODULE_STATE and two-phase init.

I don't think you necessarily need that - I think single phase init might well work if you turn off the check that bans it. I certainly wouldn't dismiss trying it in favour of waiting.

But in terms of two-phase init, we've done some restructuring so most data (except global cdef variables right now) is stored in a "module state struct" (in any compilation mode; there's just a single global variable instance of the struct most of the time). But the one thing we haven't done is implement any of the module lookup mechanisms.

So I can absolutely promise that won't work right now.

paultiq commented 3 days ago

Thanks, I did a little testing with a cythonized module to start, and my computer didn't blow up. I just wanted to document the test case here for other people, along with how to disable the multi_interp_extensions_check.

After recompiling with CYTHON_USE_MODULE_STATE = 1, I ran the following code and the results were correct.

Both fib, and a function using a global variable (a list), worked as expected. The global variable was only updated within the scope of the subinterpreter (each subinterpreter had a separate list).

_override_multi_interp_extensions_check

Define this in a separate module to disable the multi_interp_extensions_check:

def disable_multi_interp_extensions_check():
    import _imp
    _imp._override_multi_interp_extensions_check(-1)
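
A more defensive variant of this initializer might be useful when the same code has to run on several Python versions. This is a minimal sketch, not a tested recipe. It assumes that `_imp._override_multi_interp_extensions_check` is a private CPython helper that can be missing on older versions, and that it can refuse the call with a RuntimeError (for example when called from the main interpreter). The name `try_disable_multi_interp_extensions_check` is made up for illustration.

```python
def try_disable_multi_interp_extensions_check() -> bool:
    """Try to switch off the extension-module compatibility check.

    Returns True if the override was applied in the current interpreter,
    False if the private helper is missing or refused the call.
    """
    try:
        import _imp
        # -1 is the value used above in this thread to disable the check.
        _imp._override_multi_interp_extensions_check(-1)
    except (ImportError, AttributeError, RuntimeError):
        # Assumption: AttributeError means the helper does not exist on this
        # Python version; RuntimeError means it refused to run here
        # (for example in the main interpreter).
        return False
    return True
```

Since it takes no arguments, it can be passed as `initializer=` in the same way as the function above.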

Test Case, using

try:
    from concurrent.futures.interpreter import InterpreterPoolExecutor # 3.14+
except ModuleNotFoundError:
    from interpreters_backport.concurrent.futures.interpreter import InterpreterPoolExecutor # Backport / https://pypi.org/project/interpreters-pep-734/

from mycythonmodule import fib
from mydisablemodule import disable_multi_interp_extensions_check

with InterpreterPoolExecutor(max_workers=5, initializer=disable_multi_interp_extensions_check) as executor:
    ipe_r = executor.map(fib, range(100))
    print(list(ipe_r))

Fib

def fib(n: int) -> int:
    a, b = 0, 1
    while b < n:
        a, b = b, a + b
    return a
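
Note that this `fib` does not return the n-th Fibonacci number; it returns the last Fibonacci number below `n`. A quick way to judge whether the executor output is "correct" is to compare it with a serial run. The sketch below assumes the compiled function behaves like the pure-Python source; `expected` is a name chosen only for illustration.

```python
def fib(n: int) -> int:
    # Same body as the function above, run as plain Python.
    a, b = 0, 1
    while b < n:
        a, b = b, a + b
    return a

# Serial reference: list(executor.map(fib, range(100))) should equal this list.
expected = [fib(n) for n in range(100)]
print(expected[:12])  # [0, 0, 1, 2, 3, 3, 5, 5, 5, 8, 8, 8]
```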

Uses Global Function

import _interpreters

SOMEGLOBAL = []
def uses_global(x) -> tuple:
    SOMEGLOBAL.append(x)
    return _interpreters.get_current(), len(SOMEGLOBAL)   # returns the subinterpreter id

da-woods commented 2 days ago

Thanks for testing - that's good to know. It also bodes well for when we do manage to make it work "properly" since much of it will remain the same.

For what it's worth I think it'd have failed if you'd made SOMEGLOBAL a cdef variable.

paultiq commented 2 days ago

Yah, indeed. The per-subinterpreter consistency goes away with cdefs.

With a `cdef list`

cdef list SOMEGLOBAL5 = []
def cdef_list_mod(x):
    SOMEGLOBAL5.append(x)
    print(_interpreters.get_current(), SOMEGLOBAL5)

Inconsistent result with a cdef list (inconsistent as in: subinterpreters see side effects from other subinterpreters):

(2, 5) [1]
(2, 5) [1, 2]
(2, 5) [3]
(2, 5) [3, 4]
(2, 5) [3, 4, 5]
(2, 5) [3, 4, 5, 6]
(2, 5) [3, 4, 5, 6, 7]
(2, 5) [3, 4, 5, 6, 7, 9]
(10, 5) [10]
(3, 5) [12]
(6, 5) [12, 13]
(3, 5) [12, 13, 19]
(5, 5) [8]
(9, 5) [14]
(4, 5) [15]
(7, 5) [16]
(8, 5) [17]

Sane results with a Python list (no cdef). What's "sane" is that each subinterpreter has a consistent sequence of values, similar to running the same code in processes / a ProcessPoolExecutor:

(1, 5) [1]
(1, 5) [1, 5]
(2, 5) [0]
(1, 5) [1, 5, 6]
(2, 5) [0, 8]
(4, 5) [2]
(2, 5) [0, 8, 10]
(5, 5) [3]
(1, 5) [1, 5, 6, 9]
(4, 5) [2, 11]
(1, 5) [1, 5, 6, 9, 14]
(5, 5) [3, 13]
(2, 5) [0, 8, 10, 12]
(4, 5) [2, 11, 15]
(1, 5) [1, 5, 6, 9, 14, 16]
(5, 5) [3, 13, 17]
(2, 5) [0, 8, 10, 12, 18]
(4, 5) [2, 11, 15, 19]
(3, 5) [4]
(6, 5) [7]
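
One possible reading of the two outputs above, sketched as a toy model in plain Python: a Python-level global lives in each interpreter's own module namespace, while a cdef global behaves like a single process-wide slot that each import re-initializes and every interpreter then mutates. This is an illustration only, not how Cython or CPython is implemented; `ToyInterpreter` and `SharedCGlobal` are hypothetical names.

```python
class SharedCGlobal:
    """Stands in for a C-level global: one slot for the whole process."""
    value = None


class ToyInterpreter:
    def __init__(self, interp_id):
        self.interp_id = interp_id
        self.module_dict = {}  # per-interpreter module namespace

    def import_module(self):
        # Module init runs once per interpreter.
        self.module_dict["PY_GLOBAL"] = []  # fresh list for this interpreter
        SharedCGlobal.value = []            # overwrites the single shared slot

    def call(self, x):
        self.module_dict["PY_GLOBAL"].append(x)
        SharedCGlobal.value.append(x)
        return list(self.module_dict["PY_GLOBAL"]), list(SharedCGlobal.value)


a = ToyInterpreter(1)
a.import_module()
a.call(1)
a.call(2)

b = ToyInterpreter(2)
b.import_module()  # resets the shared slot, but not a's own list
b.call(3)

py_list, c_list = a.call(4)
print(py_list)  # [1, 2, 4]  per-interpreter sequence stays consistent
print(c_list)   # [3, 4]     shared slot shows b's import and b's append
```

In this model the shared list restarts whenever a new interpreter imports the module, which resembles the jumps from `[1, 2]` to `[3]` in the cdef output.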

With a `cdef int`

cdef int SOMEGLOBAL4 = 0
def uses_global_cdef(x):
    global SOMEGLOBAL4

    SOMEGLOBAL4+=1
    print(_interpreters.get_current(), SOMEGLOBAL4)

Results using r = executor.map(uses_global_cdef, range(20)) (the first tuple is the subinterpreter id):

(3, 5) 1
(3, 5) 2
(3, 5) 3
(3, 5) 4
(2, 5) 1
(1, 5) 1
(3, 5) 2
(2, 5) 3
(1, 5) 4
(3, 5) 5
(2, 5) 6
(1, 5) 7
(3, 5) 8
(2, 5) 9
(1, 5) 10
(3, 5) 11
(9, 5) 1
(4, 5) 1
(7, 5) 1
(5, 5) 2
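
A small helper can make this kind of check mechanical. The sketch below assumes each call reports `(interpreter_id, counter)` and that an isolated counter must read 1, 2, 3, ... within each interpreter. `is_isolated` is a hypothetical name, and `cdef_int_sample` is the first seven lines of the output above.

```python
from collections import defaultdict


def is_isolated(results):
    """Return True if every interpreter saw its own counter run 1..k."""
    seen = defaultdict(list)
    for interp_id, counter in results:
        seen[interp_id].append(counter)
    # sorted() tolerates output lines that were printed out of order.
    return all(
        sorted(counters) == list(range(1, len(counters) + 1))
        for counters in seen.values()
    )


cdef_int_sample = [
    ((3, 5), 1), ((3, 5), 2), ((3, 5), 3), ((3, 5), 4),
    ((2, 5), 1), ((1, 5), 1), ((3, 5), 2),
]
print(is_isolated(cdef_int_sample))  # False: interpreter 3 reads 2 again after 4

# Made-up data for an isolated run, for comparison.
isolated_sample = [((1, 5), 1), ((1, 5), 2), ((2, 5), 1)]
print(is_isolated(isolated_sample))  # True
```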
