PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs

Support sub-interpreters #576

Open acminor opened 4 years ago

acminor commented 4 years ago

Does pyo3 allow the use case of starting multiple separate interpreters? This would be similar to Python's multiprocessing.

thedrow commented 4 years ago

It should be possible with the upcoming improvements in 3.8 and above.

reem commented 4 years ago

Specifically would be amazing if it was possible to create multiple separate python interpreters that could be run on different threads in parallel, but which share the same memory space (with the type system used to ensure this is only observable from rust).

davidhewitt commented 4 years ago

The complexity with sub-interpreters is that no Python objects should ever be shared between them. The PyO3 API isn't set up to prevent this at the moment, so it's going to take a fair whack of experimentation before we get anything stable in place.

clouds56 commented 1 year ago

I'd like to have multiple threads, each with its own interpreter. No PyObject would be sent between threads. IIRC, we can't hold a PyObject by value in Rust; everything is a reference to an UnsafeCell and by default not Send.

mohitreddy1996 commented 1 year ago

@davidhewitt with https://peps.python.org/pep-0684/, using sub interpreters or multiple interpreters to unlock true multi-core parallelism becomes possible.

Is adding support for this on PyO3's timeline or under consideration?

AnatolyBuga commented 1 year ago

@davidhewitt with https://peps.python.org/pep-0684/, using sub interpreters or multiple interpreters to unlock true multi-core parallelism becomes possible.

Is adding support for this on PyO3's timeline or under consideration?

Found this interesting article on current usage of sub-interpreters in Python (no Rust there)

davidhewitt commented 1 year ago

We are very aware of the per-interpreter parallelism landing in Python 3.12. There are significant changes which need to happen to PyO3's current implementation to support this correctly. We have been discussing some of these challenges in multiple discussions across this repo, such as #2885 which looks at the possible nogil option.

There are several main issues which are prominent in my mind, although others may exist:

  • I understand interpreters cannot share Python objects. This implies that Py<T> needs to be removed or reworked significantly, maybe by removing Send and Sync from that type, probably also by somehow making the operation to attach Py<T> to a Python thread unsafe or runtime-checked in some way.

  • We need to fully transition PyO3 to PEP 630 compatibility, which requires elimination of all static data which contains Python state. This is probably linked to the first bullet.

  • APIs like GILOnceCell and GILProtected can no longer be Sync if multiple GILs exist. Transition to PEP 630 compatibility will probably force us to replace these types with alternative solutions.

Solving these problems is likely to create significant churn of PyO3's API, so we can only make progress once someone has proposed a relatively complete solution which we can adopt with a suitable migration path for users.

letalboy commented 10 months ago

Hello guys, I was redirected here by @messense. I'm building a lib and I need to use multiple Python interpreters at once. When I tried, I first hit the same issue that Python objects can't be shared between threads, so I tried a different approach: I created a GIL pool inside the thread, got a Python token from it, and did some data processing with it. It worked for a while, until I started to notice that some threads were randomly crashing. After some forensic analysis of what was going on, I found the problem: when we assume the GIL is acquired, it basically takes the GIL from some thread that is using it, and then the "reference" is gone. This is what I did:

    // Note: `assume_gil_acquired` does not actually acquire the GIL; it
    // only asserts (unsoundly, if wrong) that it is already held.
    let getting_py = unsafe { Python::assume_gil_acquired() };
    let gil_pool = unsafe { getting_py.clone().new_pool() };
    py = gil_pool.python();

However, when that happened I switched my lib to dispatch callbacks through a single Python interpreter at a time for now, which isn't optimal, but I have to keep the project going. I'm still trying to find a solution, because this is something I really need in order to speed things up here. Can someone explain, or point me to a trusted article on, how GIL acquisition and the Python interpreter work inside Rust, and whether the GIL needs to stay acquired as a reference? I may have a solution in mind that could work temporarily, if what I have in mind actually makes sense.

letalboy commented 10 months ago

Ok, I have cloned the repo and studied how it works, to start to understand the logic, though I'm not 100% sure how all of it works because, to be honest, it's a lot of code haha. Here I saw something interesting:

In gil.rs there is this embedded-interpreter helper that seems to acquire a GIL pool and then release it:

#[cfg(not(PyPy))]
pub unsafe fn with_embedded_python_interpreter<F, R>(f: F) -> R
where
    F: for<'p> FnOnce(Python<'p>) -> R,
{
    assert_eq!(
        ffi::Py_IsInitialized(),
        0,
        "called `with_embedded_python_interpreter` but a Python interpreter is already running."
    );

    ffi::Py_InitializeEx(0);

    // Safety: the GIL is already held because of the Py_InitializeEx call.
    let pool = GILPool::new();

    // Import the threading module - this ensures that it will associate this thread as the "main"
    // thread, which is important to avoid an `AssertionError` at finalization.
    pool.python().import("threading").unwrap();

    // Execute the closure.
    let result = f(pool.python());

    // Drop the pool before finalizing.
    drop(pool);

    // Finalize the Python interpreter.
    ffi::Py_Finalize();

    result
}

The idea here is a pool, and we are able to acquire a Python token from it and then release it. Like a pool of connections to a sqlite3 database, right?

So the issue seems to be that when we acquire the GIL pool it creates a connection with the interpreter, and I'm assuming that at the moment we can only have one of those...

I've been considering a new approach to tackle our challenges with multithreading and Python's Global Interpreter Lock (GIL), at least until we can have multiple sub-interpreters. My idea is to create a centralized execution pool dedicated to handling Python-related tasks. This would eliminate the need for Arc and Mutex to share PyObjects, avoiding the issues we've faced with sending certain objects.

We could develop a procedural macro to wrap the Python-invoking code. This macro would package the code, forward it to the centralized pool using a Box, process it, and return the result. Centralizing the pool means we can manage the GIL more efficiently, reducing errors from multiple threads trying to access it simultaneously. While there's a potential bottleneck with a single interpreter, it offers the advantage of invoking Python from different places without GIL acquisition challenges.

The primary shift here is that we send the code for execution instead of transferring PyObjects, ensuring the GIL is safely managed. This approach would essentially streamline our execution into a rapid, queue-based system. I'd be eager to hear your feedback on this idea and whether it can potentially work!
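The queue-based system described above can be sketched with only std threads and channels, with the actual Python call stubbed out as a plain closure (the names `PyJob` and `spawn_python_worker` are invented for this illustration, not PyO3 API):

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// A job is a boxed closure executed on the dedicated "Python" thread,
// plus a channel to send its result back to the caller.
type PyJob = Box<dyn FnOnce() -> String + Send>;

fn spawn_python_worker() -> Sender<(PyJob, Sender<String>)> {
    let (tx, rx) = channel::<(PyJob, Sender<String>)>();
    thread::spawn(move || {
        // In a real implementation this thread would hold the GIL
        // (e.g. via Python::with_gil) while running each job.
        for (job, reply) in rx {
            let _ = reply.send(job());
        }
    });
    tx
}

fn main() {
    let worker = spawn_python_worker();
    let (reply_tx, reply_rx) = channel();
    // Send the *code to run* instead of sharing Python objects across threads.
    worker
        .send((Box::new(|| format!("result: {}", 21 * 2)), reply_tx))
        .unwrap();
    assert_eq!(reply_rx.recv().unwrap(), "result: 42");
}
```

Since only the worker thread ever touches the interpreter, no Python object ever crosses a thread boundary; the trade-off is that all Python work is serialized through one queue.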

davidhewitt commented 10 months ago

The "pool" you refer to above is not a pool of GIL acquisitions, but rather a pool of objects. (There can only ever be one GIL acquisition at a time per interpreter. As per the above in this thread, PyO3 is some way off supporting sub-interpreters.)

If I read your idea correctly it seems like you're proposing having one thread which is running Python workloads and you send jobs to it from other threads. That seems like a completely reasonable system architecture.

letalboy commented 10 months ago

Yeah, I'm kind of lost in the PyO3 files, trying to understand how it works to see if I can help with this, but the idea is exactly that. While I'm learning how PyO3 works I'm also building a functional model of the idea I proposed; when I have any progress I'll tell you guys :)

I think this will make it easier to work with multithreading while we can't have multiple Python interpreters. Of course it won't be the fastest thing in the world, because we can only use one interpreter and can't spawn sub-interpreters to distribute the workload, but it will work, especially for cases where we only need Python for small things inside a parallelized data-processing mechanism. In those cases I think it will help a lot.

letalboy commented 10 months ago

Hey everyone! 🚀

I've crafted a hands-on demonstration of a system architecture that seamlessly integrates Python functionalities within parallelized Rust code. This approach effectively sidesteps the GIL constraints and the challenges of passing Python objects between threads.

🔗 Dive into the details and check out the comprehensive documentation here: RustPyNet on GitHub.

While it's not a full-fledged multi-interpreter system, it does simplify the execution of Python functions in a multi-threaded environment. For me, it's been a game-changer for projects that leverage parallelized Rust processes and use PyO3 just for callbacks. I genuinely believe this isn't just beneficial for my projects; many others in our community working on similar things could greatly benefit from this integration too.

I'm reaching out to see if there's potential to integrate this into the PyO3 project. I'm genuinely curious about your thoughts, especially from our development team members. If there's interest, I'm more than willing to assist in its implementation. Let's discuss and explore its wider potential! 🤔👨‍💻👩‍💻

Aequitosh commented 10 months ago

We are very aware of the per-interpreter parallelism landing in Python 3.12. There are significant changes which need to happen to PyO3's current implementation to support this correctly. We have been discussing some of these challenges in multiple discussions across this repo, such as #2885 which looks at the possible nogil option.

There are several main issues which are prominent in my mind, although others may exist:

  • I understand interpreters cannot share Python objects. This implies that Py<T> needs to be removed or reworked significantly, maybe by removing Send and Sync from that type, probably also by somehow making the operation to attach Py<T> to a Python thread unsafe or runtime-checked in some way.

  • We need to fully transition PyO3 to PEP 630 compatibility, which requires elimination of all static data which contains Python state. This is probably linked to the first bullet.

  • APIs like GILOnceCell and GILProtected can no longer be Sync if multiple GILs exist. Transition to PEP 630 compatibility will probably force us to replace these types with alternative solutions.

Solving these problems is likely to create significant churn of PyO3's API, so we can only make progress once someone has proposed a relatively complete solution which we can adopt with a suitable migration path for users.

To get this issue back on topic, I'd be willing to contribute a decent amount in order to allow PyO3 to support sub-interpreters.

We've noticed that some of our users can't use Ceph's Dashboard, which led me down quite a rabbit hole. To keep things short, I eventually stumbled across bazaah/aur-ceph#20, which lists all of the facts. In short, anything that transitively depends on PyO3 will break once sub-interpreters enter the stage, unfortunately.

So... how may I help? What would be the best way to start tackling this?

GoldsteinE commented 10 months ago

I’ve tried playing with this a bit. My first idea was to make the 'py lifetime invariant, so it may serve as a unique identifier of object’s provenance. Unfortunately, this breaks basically everything. I’m not sure whether there is some more lenient approach (maybe two lifetimes? token and the actual covariant lifetime). Either way, it seems like it would be a breaking change with this approach.

davidhewitt commented 10 months ago

@Aequitosh thanks for the offer, it would be great to begin making progress on this. The above comment https://github.com/PyO3/pyo3/issues/576#issuecomment-1574360683 is still a good summary of the state of play.

Are you interested in design work? Implementation? Reviews? How much effort are you prepared to put in? This is going to be a big chunk of work.

I think that a realistic mid-term solution is that:

In the long term we may be able to remove the need for extension module authors to audit their own code, once we've built up confidence of operation under subinterpreters.

In short, anything that transitively depends on PyO3 will break once sub-interpreters enter the stage, unfortunately.

I disagree slightly with the sentiment of "will break". Many extension modules implemented in C and C++ most likely also do not work correctly with subinterpreters. I read a comment from CPython devs somewhere which suggested they are aware that even if CPython 3.12 or 3.13 ships with complete subinterpreter support the ecosystem is going to need many years to transition.

Regardless I support that we should do what we can to not block users who are pushing to run subinterpreters in their systems. All help with implementation is gladly welcome. I would also be open to considering an environment variable PYO3_UNSAFE_ALLOW_SUBINTERPRETERS=1 which gives end users the opportunity to disable the subinterpreter safety check... at their own responsibility of crashes. Such an opt-out may strike an acceptable balance between Rust's penchant for correctness and Python's mentality that we are all responsible users.
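Such an opt-out check might look roughly like the following sketch. This is entirely hypothetical: the `check_subinterpreter` function and its `is_subinterpreter` parameter are invented here, and only the environment variable name comes from the comment above.

```rust
use std::env;

// Hypothetical guard run at module import time: refuse to load under a
// sub-interpreter unless the user has explicitly opted out of the check.
fn check_subinterpreter(is_subinterpreter: bool) -> Result<(), String> {
    if is_subinterpreter
        && env::var("PYO3_UNSAFE_ALLOW_SUBINTERPRETERS").ok().as_deref() != Some("1")
    {
        return Err(
            "module does not support sub-interpreters; set \
             PYO3_UNSAFE_ALLOW_SUBINTERPRETERS=1 to proceed at your own risk"
                .to_string(),
        );
    }
    Ok(())
}

fn main() {
    // With the variable unset, loading under a sub-interpreter is refused,
    // while loading under the main interpreter is always allowed.
    assert!(check_subinterpreter(false).is_ok());
    assert!(check_subinterpreter(true).is_err());
}
```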

davidhewitt commented 10 months ago

@GoldsteinE that's an interesting idea. Care to explain a little more about the original thesis behind making the lifetime invariant?

(We might also want to split this topic into several sub-issues / discussions with back references to here...)

GoldsteinE commented 10 months ago

@davidhewitt The idea is taken from the GhostCell paper. Basically, the signature of Python::with_gil() has F: for<'py> FnOnce(Python<'py>) -> R in it. If the 'py lifetime is invariant, then the following code

interpreter1.with_gil(|py1| {
    interpreter2.with_gil(|py2| {
        let obj1 = py1.get_some_python_ref(); // has lifetime 'py1
        let obj2 = py2.get_some_python_ref(); // has lifetime 'py2
        obj1.some_method(obj2); // error: lifetimes do not match
    })
})

wouldn’t compile, preventing us from mixing objects from different interpreters (Py<_> pointer would need a runtime tag, since it doesn’t have a lifetime).

My dyngo crate is a practical example of this technique.
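The branding trick behind this can be sketched without any Python at all: a lifetime parameter made invariant via `PhantomData` ties each value to the scope that created it, so values from two different scopes can never be mixed. All names below are made up for illustration.

```rust
use std::marker::PhantomData;

// PhantomData<fn(&'brand ()) -> &'brand ()> makes 'brand invariant,
// so each with_scope call gets a provably distinct brand.
struct Token<'brand>(PhantomData<fn(&'brand ()) -> &'brand ()>);

struct Branded<'brand, T>(T, PhantomData<fn(&'brand ()) -> &'brand ()>);

// The higher-ranked bound forces the closure to work for a fresh,
// unnameable 'brand, exactly like Python::with_gil's for<'py> bound.
fn with_scope<R>(f: impl for<'brand> FnOnce(Token<'brand>) -> R) -> R {
    f(Token(PhantomData))
}

impl<'brand> Token<'brand> {
    fn wrap<T>(&self, value: T) -> Branded<'brand, T> {
        Branded(value, PhantomData)
    }
    // Only values carrying *this* scope's brand can be combined.
    fn add(&self, a: &Branded<'brand, i32>, b: &Branded<'brand, i32>) -> i32 {
        a.0 + b.0
    }
}

fn main() {
    let sum = with_scope(|token| {
        let a = token.wrap(1);
        let b = token.wrap(2);
        token.add(&a, &b) // mixing values from another scope would not compile
    });
    assert_eq!(sum, 3);
}
```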

davidhewitt commented 10 months ago

Interesting. I can see how that would guarantee provenance statically, but I think it might cause issues with APIs like #[pyfunction] where the exact same code region might be called from multiple different interpreters. My instinct was that we would have to store the interpreter ID inside each Py<T> and only allow attaching to the same interpreter.

Having the Python lifetime be invariant may be a good idea to consider as part of #3382.

GoldsteinE commented 10 months ago

Yes, Py<T> would need to have a runtime tag. I think #[pyfunction] is probably okay, since it would be generic over 'py, which is invariant?

adamreichold commented 10 months ago

Would an invariant lifetime also preclude valid code like

interpreter1.with_gil(|py1| {
    interpreter1.with_gil(|py2| {
        let obj1 = py1.get_some_python_ref(); // has lifetime 'py1
        let obj2 = py2.get_some_python_ref(); // has lifetime 'py2
        obj1.some_method(obj2); // still an error: same interpreters, but the lifetimes are generative and hence unique per closure invocation
    })
})

?

Personally, I think we will need to store the interpreter ID within all references into the Python heap. I also think this will mesh well with our aim to drop bare references (and the pool required to make them work) for other reasons.

adamreichold commented 10 months ago

Regardless I support that we should do what we can to not block users who are pushing to run subinterpreters in their systems. All help with implementation is gladly welcome. I would also be open to considering an environment variable PYO3_UNSAFE_ALLOW_SUBINTERPRETERS=1 which gives end users the opportunity to disable the subinterpreter safety check... at their own responsibility of crashes. Such an opt-out may strike an acceptable balance between Rust's penchant for correctness and Python's mentality that we are all responsible users.

For the same reasons as in the discussion of nogil support and whether PanicException should derive BaseException or not, I am somewhat sceptical about adding user-controlled I-dont-care-just-make-it-go-fast-and-you-can-bet-I-am-also-enabling-ffast-math-everywhere kind of flags. I would prefer that this requires opt-in from the extension authors. Really stubborn adventurous users can always modify the code to perform that opt-in themselves.

GoldsteinE commented 10 months ago

@adamreichold Yes, I assumed that ::with_gil() is not reentrant, but it apparently is. I’m not sure what’s the usecase of it though: if you already have obj1, you could just write

let py = obj1.py();
let obj2 = py.get_some_python_ref();
obj1.method(obj2);

Is there a case where this workaround is too unwieldy to use?

adamreichold commented 10 months ago

Is there a case where this workaround is too unwieldy to use?

Yes, with_gil is often used when the GIL token cannot be threaded through the call chain, e.g. when implementing standard traits like fmt::Display on user types whose implementation still needs access to GIL-protected data.

We actually do the work to detect the case when the GIL is already held without calling into the CPython API to make this rather common case as fast as reasonably possible.

GoldsteinE commented 10 months ago

.with_gil() could still be reentrant, even if objects from different invocations don’t mix. fmt::Display doesn’t accept anything GIL-bound, so it should still work fine, I think

GoldsteinE commented 10 months ago

In general, I feel like there’re two cases:

  1. Either you already have some GIL-bound object, in which case you could just get its Python
  2. Or you don’t, so you’re fine with creating a new scope, since you don’t need to mix your GIL-bound objects with any others
adamreichold commented 10 months ago

fmt::Display doesn’t accept anything GIL-bound, so it should still work fine, I think

I don't understand this part: why can't these traits be implemented for GIL-bound types, whether they are bare references or smart pointers?

In general, I feel like there’re two cases:

To me this feels like Go's Context parameter: it is simple (for some definition of simple) and works if you control all the code. But I fear that the ergonomics of forcing people to thread GIL tokens through everywhere they want to match objects are really bad, especially across an ecosystem of PyO3-based libraries.

There is also the additional problem, that interpreter identity might not be known until runtime and I might want to have different behaviour (for example optimizations avoiding communication) if they do match. Of course, I could have two different approaches using either two scopes or just a single one, but it forces the author to handle this at a high level of code organization instead of tucking it away as some implementation detail.

GoldsteinE commented 10 months ago

I don't understand this part: why can't these traits be implemented for GIL-bound types, whether they are bare references or smart pointers?

I use the word “GIL-bound” as “has a 'py lifetime” here. If you implement fmt::Display for a GIL-bound type, you are already holding GIL, so you could store an interpreter inside the type (like PyAny::py() does). You don’t need to thread the tokens, if you could get it from an existing object.

GoldsteinE commented 10 months ago

I agree that this approach may harm convenience for some usecases. The only alternative I see is to always perform this check at runtime, which is more convenient to write, but doesn’t catch some errors that could be detected at compile-time. Maybe there’s some hybrid approach?

adamreichold commented 10 months ago

I use the word “GIL-bound” as “has a 'py lifetime” here. If you implement fmt::Display for a GIL-bound type, you are already holding GIL, so you could store an interpreter inside the type (like PyAny::py() does). You don’t need to thread the tokens, if you could get it from an existing object.

Ok, so they do work on GIL-bound types, your point is rather that access to GIL-bound types implies access to a GIL token.

Let's try to construct a more involved example. You want to implement PartialOrd<&'py Foo> for Py<Foo>. To do that, you need to turn Py<Foo> into &'py Foo with the same lifetime, but you cannot verify the interpreter ID of the right-hand side &'py Foo because the information is erased at runtime. So while you do have access to a GIL token, you would actually need access to the interpreter which produced the reference in order to produce another one with guaranteed compatible provenance. (Or alternatively, a non-zero-sized GIL token containing the interpreter ID.)

When bare references are replaced by GIL-bound smart pointers like Py<'py, Foo> storing the interpreter ID, they themselves would be sufficient to turn PyDetached<Foo> into Py<'py, Foo> or something like that.

adamreichold commented 10 months ago

Maybe there’s some hybrid approach?

I think the most straight-forward approach would be to do everything at runtime by default and then provide additional API which lifts some of those checks into compile time as an optimization. (Similarly to how the ghost-cell types are a relatively inconvenient compile-time optimization of std's plain Cell.) (So, we'd need three smart pointers, e.g. PyDetached<T>, Py<'py, T> and PyBranded<'py, 'interp, T>.)
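The runtime-checked default could be modeled roughly like this. The names `PyDetached` and the idea of an interpreter ID come from the discussion above; everything else is invented for illustration (in a real implementation the ID would come from the CPython runtime, e.g. `PyInterpreterState_GetID`).

```rust
// Stand-in for an interpreter identity provided by the runtime.
type InterpreterId = u64;

// A detached pointer that remembers which interpreter created it.
struct PyDetached<T> {
    value: T,
    interp: InterpreterId,
}

impl<T> PyDetached<T> {
    fn new(value: T, interp: InterpreterId) -> Self {
        PyDetached { value, interp }
    }

    // Attaching checks, at runtime, that we are back on the same
    // interpreter before handing out access to the inner value.
    fn attach(&self, current: InterpreterId) -> Result<&T, String> {
        if self.interp == current {
            Ok(&self.value)
        } else {
            Err(format!(
                "object belongs to interpreter {}, not {}",
                self.interp, current
            ))
        }
    }
}

fn main() {
    let obj = PyDetached::new("hello", 1);
    assert!(obj.attach(1).is_ok());   // same interpreter: access granted
    assert!(obj.attach(2).is_err());  // foreign interpreter: rejected
}
```

A branded variant would perform this check once and then encode the result in a lifetime, removing the per-access cost.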

adamreichold commented 10 months ago

(So, we'd need three smart pointers, e.g. PyDetached, Py<'py, T> and PyBranded<'py, 'interp, T>.)

Again, this might completely by accident mesh rather well with the Py*Methods traits as introduced in e.g. #3445 which might provide a nice trait bound to abstract over the difference between these types, at least the last two ones.

davidhewitt commented 10 months ago

Heh, quite possibly! If we have to add a new smart pointer type for subinterpreters I wonder whether we could gate it behind a feature, or whether all PyO3 modules would in practice have to use it to be subinterpreter safe.

I just saw this discussion for Cython - looks like they also want it opt-in only: https://github.com/cython/cython/issues/2343

adamreichold commented 10 months ago

whether all PyO3 modules would in practice have to use it to be subinterpreter safe.

My understanding is that all smart pointers would be subinterpreter safe, i.e. contain an interpreter ID. The only difference would be whether it is checked at runtime for each access (Py<'py, T>) or once upfront and reified at compile time (PyBranded<'py, 'interp, T>). (The branded terminology is shamelessly stolen from ghost-cell.)

davidhewitt commented 10 months ago

Ugh I see, I was hoping we could limit the checking just to PyDetached. Given that Cython also doesn't support this yet (and wants it to be opt-in for performance reasons) I have the same worry.

mejrs commented 10 months ago

My concern with checking this at runtime is that it sounds error prone. I'm rather spoiled by Rust's threadsafety being checked at compile time.

Aequitosh commented 10 months ago

@Aequitosh thanks for the offer, it would be great to begin making progress on this. The above comment #576 (comment) is still a good summary of the state of play.

Are you interested in design work? Implementation? Reviews? How much effort are you prepared to put in? This is going to be a big chunk of work.

All of the above, though I'm not sure how fit I'd be for design work, as I'm not too familiar with PyO3's internals yet. I can most definitely tackle implementation work and provide second opinions in reviews.

Regarding effort (I'll rephrase this as time here): I can probably spare anywhere between 6-18 hours a week. This will vary from time to time unfortunately, as I'll have a somewhat tight schedule soon again, but I nevertheless want to make time for this (as I'm both an absolute Python and Rust nerd :wink:).

I disagree slightly with the sentiment of "will break". Many extension modules implemented in C and C++ most likely also do not work correctly with subinterpreters. I read a comment from CPython devs somewhere which suggested they are aware that even if CPython 3.12 or 3.13 ships with complete subinterpreter support the ecosystem is going to need many years to transition.

Very fair point! I feel like I misphrased my point a little bit here; I think it's rather just very unexpected that an ImportError is raised in such a case. Somewhat off-topic, but how would someone using a library (e.g. cryptography) handle this on their side anyway? Only allow the module using PyO3 to be initialized once in a "reserved" subinterpreter that the others communicate with? Maybe I'm asking a bit naively here.

Also, to speculate here a little bit: my gut tells me that it would be beneficial for both the Python ecosystem overall and the PyO3 project itself if subinterpreters are supported by PyO3 sooner rather than later. If, theoretically, PyO3 provided a painless, easy way for extension authors to support subinterpreters, this would make it more convenient for the whole ecosystem to transition and in turn increase the adoption of PyO3. I could be mistaken; just throwing that out there.

Regardless I support that we should do what we can to not block users who are pushing to run subinterpreters in their systems. All help with implementation is gladly welcome. I would also be open to considering an environment variable PYO3_UNSAFE_ALLOW_SUBINTERPRETERS=1 which gives end users the opportunity to disable the subinterpreter safety check... at their own responsibility of crashes. Such an opt-out may strike an acceptable balance between Rust's penchant for correctness and Python's mentality that we are all responsible users.

I also agree with @adamreichold here; this is more something that should be controlled by extension authors.

Regarding the mid- and long-term solutions you mentioned: I can't really provide my perspective on this (yet), but I'll stick with this for now. I'll probably create a fork sooner or later and mess around with PyO3's internals myself. The GhostCell idea mentioned by @GoldsteinE sounds also very interesting. There have been lots of good points made here, so I'll try my best to collect them somewhere.

I'll see what I can cook up and will probably open a tracking issue somewhat soon, if that's alright.

letalboy commented 10 months ago

Upon conducting a detailed investigation into PyO3, I've noticed that the ffi module, which serves as the primary bridge for communication with Python, seems to lack implementations for Py_NewInterpreter and Py_EndInterpreter. To my knowledge, these functions have been available in the CPython API since Python 3.10. I suggest that addressing this gap should be our first step before exploring the potential of subinterpreters.

Furthermore, I recommend that we start by implementing the Send and Sync traits, as previously discussed by the crate development and maintenance team. This approach could potentially lead to the establishment of a global GIL pool, allowing multiple threads to effortlessly access a Python instance. This notion is consistent with the example I shared in our prior conversations here on this topic. Successfully implementing this idea could simplify the integration of subinterpreters in the future, especially if the primary controller for subinterpreters becomes readily available.

I recognize that the task at hand is complex. Several modules within the crate might need alterations, and the intricacies of this endeavor are considerable. Nonetheless, I'm enthusiastic about contributing. I'm currently delving deeper into the subinterpreters API to gain a better understanding, and I'm optimistic that I can assist in some capacity. I welcome suggestions on areas to focus on to further this initiative, and I'm hopeful that we can successfully integrate this feature into PyO3. Achieving this would represent a significant milestone, as it would provide a mechanism to utilize Python across multiple Rust threads in a fully memory-safe manner.

letalboy commented 10 months ago

@GoldsteinE that's an interesting idea. Care to explain a little more about the original thesis behind making the lifetime invariant?

(We might also want to split this topic into several sub-issues / discussions with back references to here...)

I concur that subdividing this topic here might be beneficial for a more organized and in-depth discussion. There seems to be a plethora of ideas branching out from this, each with its own set of complexities and potential.

I believe that by breaking down the topic, we can foster a more structured dialogue and ensure that all aspects are thoroughly explored. I'm excited about the potential this holds and look forward to the ensuing discussions!

davidhewitt commented 10 months ago

I'll see what I can cook up and will probably open a tracking issue somewhat soon, if that's alright.

Thanks @Aequitosh, will await further thoughts 👍

Somewhat off-topic, but how would someone using a library (e.g. cryptography) handle this on their side anyway? Only allow the module using PyO3 to be initialized once in a "reserved" subinterpreter that the others communicate with? Maybe I'm asking a bit naively here.

Yes, it's extremely awkward for users to handle if they do want to use subinterpreters. It's unfortunately just necessary for safety.

lack implementations for Py_NewInterpreter and Py_EndInterpreter

@letalboy would you be willing to open a PR to add these please? After that, a suggested first step is we need to understand what the replacement for GILOnceCell looks like with subinterpreters.

davidhewitt commented 10 months ago

@letalboy sorry to not reply sooner regarding RustPyNet. I think that's a great example of how to use multiple Rust threads with a single Python thread to put workloads in the right place. I'm sure it would make a useful crate for folks facing similar problems if you wanted to publish it. I'm not personally convinced it's necessary to add such a construct to the main PyO3 crate quite yet. There is also the future possibility of nogil Python which would replace the need for the worker thread model.

letalboy commented 10 months ago

@letalboy would you be willing to open a PR to add these please? After that, a suggested first step is we need to understand what the replacement for GILOnceCell looks like with subinterpreters.

@davidhewitt You mean something like this:

pub fn new_interpreter() -> Result<*mut PyThreadState, SomeErrorType> {
    let state = unsafe { Py_NewInterpreter() };
    if state.is_null() {
        // Handle error, perhaps fetch the Python exception, etc.
        return Err(SomeErrorType);
    }
    Ok(state)
}

In the ffi pylifecycle.rs, and then implement a lifetime-based safety mechanism around it?

I think I can do that; I just need to know the best place to add it, and whether I'm looking in the correct place.

letalboy commented 10 months ago

@letalboy sorry to not reply sooner regarding RustPyNet. I think that's a great example of how to use multiple Rust threads with a single Python thread to put workloads in the right place. I'm sure it would make a useful crate for folks facing similar problems if you wanted to publish it. I'm not personally convinced it's necessary to add such a construct to the main PyO3 crate quite yet. There is also the future possibility of nogil Python which would replace the need for the worker thread model.

No problem about the delay in response, I know that are a lot of things going on and that is a hard task to maintain large projects with this amount of mechanisms involved, I have some private ones that I maintain to some companies that are quite complex too, and I know that are a lot of things to handle rs.

The RustPyNet crate I uploaded to my profile is just a concept of something I built for myself. My mention of it was only an idea for a possible implementation: instead of having to send Python objects around, you send the entire function to one place that calls the Python interpreter in a centralized way, avoiding the problem of having to call multiple interpreters. Since we are now working on proper multiple sub-interpreter support, it doesn't necessarily need to work that way, but I think some ideas from this concept could serve as a basis for sub-interpreter management, which, as you said above, will be difficult for average users. Thanks for the suggestion; I will see if I can improve it and then publish it, though I hope that in the future we won't need it and will have a fully functional sub-interpreter mechanism, which would be the ideal scenario. Also, if you want to base some sort of centralized controller for the sub-interpreters on the crate, feel free to use it! :)

davidhewitt commented 9 months ago

You mean something like this:

I meant just the FFI definitions in pyo3-ffi for now. We already know that enough of PyO3 doesn't work with sub-interpreters that there's no point adding a safe way to create one yet, in my opinion. That would just be misleading to users.
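For reference, the FFI-only additions meant here would be raw declarations in pyo3-ffi mirroring PEP 684's C API. A sketch, assuming the CPython 3.12 names and that the `PyInterpreterConfig` fields below match `pylifecycle.h` (the exact layout should be verified against the header, and against pyo3-ffi's existing `PyStatus`/`PyThreadState` definitions, before submitting):

```rust
use std::os::raw::c_int;

use pyo3_ffi::{PyStatus, PyThreadState};

// Field names taken from PEP 684 / CPython 3.12; verify against
// the real pylifecycle.h before relying on this layout.
#[repr(C)]
pub struct PyInterpreterConfig {
    pub use_main_obmalloc: c_int,
    pub allow_fork: c_int,
    pub allow_exec: c_int,
    pub allow_threads: c_int,
    pub allow_daemon_threads: c_int,
    pub check_multi_interp_extensions: c_int,
    pub gil: c_int, // shared vs. per-interpreter GIL selector
}

extern "C" {
    // Available since CPython 3.12 (PEP 684).
    pub fn Py_NewInterpreterFromConfig(
        tstate_p: *mut *mut PyThreadState,
        config: *const PyInterpreterConfig,
    ) -> PyStatus;
}
```

Declarations like these are pure surface area: they expose the C API without making any safety claims, which is consistent with not offering a safe constructor yet.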

letalboy commented 9 months ago

Yes, that makes sense! So in which branch can I make this change to the FFI?

Also, I've been considering the best approach for integrating sub-interpreter support, since we are starting to implement the features needed for it. Given the scale of the changes we're anticipating, would it be a good idea to create a dedicated branch derived from main? I'm thinking of naming it subinterpreter-support or something similar. This would allow us to merge PRs related to this feature in an isolated environment, streamlining the testing and review processes.

I'd appreciate any feedback or suggestions on this approach.

Aequitosh commented 9 months ago

Yes, that makes sense! So in which branch can I make this change to the FFI?

Also, I've been considering the best approach for integrating sub-interpreter support, since we are starting to implement the features needed for it. Given the scale of the changes we're anticipating, would it be a good idea to create a dedicated branch derived from main? I'm thinking of naming it subinterpreter-support or something similar. This would allow us to merge PRs related to this feature in an isolated environment, streamlining the testing and review processes.

I'd appreciate any feedback or suggestions on this approach.

If you'd like to, we can do this over at my fork. I haven't really gotten properly started on it, but over there we could manage things on our own instead of opening branches here in the upstream repository.

Just let me know and I'll add you.

davidhewitt commented 9 months ago

So in which branch can I make this change to the FFI?

Just create a fork and open a PR.

For the wider changes, I think also best to experiment in forks and open PRs with reviewable pieces when we are happy with various ideas.

Aequitosh commented 9 months ago

I opened a tracking issue regarding this: https://github.com/PyO3/pyo3/issues/3451

Though, to keep things tidy, I opened a discussion over at my fork for everybody that wishes to get involved and contribute: https://github.com/Aequitosh/pyo3/discussions/1

I'll handle pretty much most things over at my fork, e.g. post my thoughts, initial ideas, plans, etc. over there. I will open PRs when necessary - since this is probably going to be quite the endeavour, I expect that it will be split up in lots of smaller PRs in order to make reviewing (and contributing) easier.

letalboy commented 9 months ago

Yes, that makes sense! So in which branch can I make this change to the FFI? Also, I've been considering the best approach for integrating sub-interpreter support, since we are starting to implement the features needed for it. Given the scale of the changes we're anticipating, would it be a good idea to create a dedicated branch derived from main? I'm thinking of naming it subinterpreter-support or something similar. This would allow us to merge PRs related to this feature in an isolated environment, streamlining the testing and review processes. I'd appreciate any feedback or suggestions on this approach.

If you'd like to, we can do this over at my fork. I haven't really gotten properly started on it, but over there we could manage things on our own instead of opening branches here in the upstream repository.

Just let me know and I'll add you.

Yeah, that seems like a good idea; if you want, you can add me!