python / cpython

Custom environments in subinterpreters #126977

Open FFY00 opened 3 hours ago

FFY00 commented 3 hours ago

Feature or enhancement

Proposal:

I wanted to explore the viability of having custom environments in subinterpreters. There are several use-cases that could be enabled by this feature.

So far, from informal discussion with others about this, there are a couple of possible issues to take into consideration.

Issues

1) Some of the immortal objects shared between subinterpreters may be environment-dependent (pointed out by @Yhg1s)
2) Complications around dynamic loading, caused by having extension modules from different environments
   2.1) Symbol conflicts from their dependencies (pointed out by @Yhg1s)
   2.2) Since subinterpreters share the same process, when loading the same shared object, they get the same pointer (pointed out by @pablogsal)

Implementation

The main thing we need is a way to disable the site initialization, which could be an enable_site option in the interpreter config. This should disable the environment customizations and result in a bare environment without anything extra on sys.path.
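
For illustration, what the site initialization contributes can already be observed at the process level with the existing `-S` command-line flag. The sketch below compares a default child interpreter with one started with `-S`. The helper name `probe` is made up for this example, and the proposed `enable_site` option does not exist yet, so this only approximates what a "bare" subinterpreter might look like.

```python
import json
import subprocess
import sys

# Code run in the child: report whether `site` was imported and what sys.path holds.
CHILD = (
    "import sys, json; "
    "print(json.dumps({'site': 'site' in sys.modules, 'path': sys.path}))"
)


def probe(*flags):
    """Start a child interpreter with the given flags and return its report."""
    out = subprocess.run(
        [sys.executable, *flags, "-c", CHILD],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


default = probe()    # normal startup, site initialization enabled
bare = probe("-S")   # site initialization disabled

# Entries that only appear when site runs (site-packages, .pth additions, ...).
extra = [p for p in default["path"] if p not in bare["path"]]
```

Printing `extra` on a given installation shows which `sys.path` entries depend on the site initialization there.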

However, to make the use of different environments more ergonomic, we could add an environment_path location pointing to a directory containing a pyvenv.cfg, which would perform the site initialization for that environment.
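
As a rough sketch of the first step this would involve, a `pyvenv.cfg` is a plain `key = value` file, so locating and reading it from a directory is straightforward. The function name `read_pyvenv_cfg` and the way the result is consumed are assumptions made for this example. `home` and `include-system-site-packages` are keys that `venv` writes today, but how the interpreter config would use them is exactly what is open for discussion.

```python
import tempfile
from pathlib import Path


def read_pyvenv_cfg(environment_path):
    """Parse `key = value` lines from <environment_path>/pyvenv.cfg into a dict."""
    config = {}
    cfg_file = Path(environment_path) / "pyvenv.cfg"
    for line in cfg_file.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '='
            config[key.strip().lower()] = value.strip()
    return config


# Demonstration with a throwaway directory standing in for an environment.
with tempfile.TemporaryDirectory() as env_dir:
    Path(env_dir, "pyvenv.cfg").write_text(
        "home = /usr/bin\ninclude-system-site-packages = false\n",
        encoding="utf-8",
    )
    cfg = read_pyvenv_cfg(env_dir)
    use_system_site = (
        cfg.get("include-system-site-packages", "false").lower() == "true"
    )
```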

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

FFY00 commented 3 hours ago

1)

I'd say this is probably the main issue with this proposal, but I don't know the specifics.

2)

This is already an issue right now, but it is exacerbated here, as it increases the likelihood of users running into it. We should consider possible preventive or mitigation measures.

Regarding 2.1), most modern Linux environments are not affected, as symbols from loaded extensions are not made globally visible unless RTLD_GLOBAL is used. It is still an issue on a number of other systems, though, so it remains relevant. (thanks @pablogsal) A possible mitigation measure might be to preemptively detect symbol clashes and raise an ImportError when loading extensions that would trigger one, but I am not sure about its viability.
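
To make the RTLD_GLOBAL point concrete, the flags the interpreter passes to `dlopen()` for extension modules can be inspected with `sys.getdlopenflags()`, which is available on Unix. The sketch below only checks the interpreter's own setting. It does not detect actual symbol clashes, and the helper name `extensions_load_globally` is made up for this example.

```python
import os
import sys


def extensions_load_globally():
    """Return True if extension modules would be dlopen()ed with RTLD_GLOBAL."""
    if not hasattr(sys, "getdlopenflags"):
        # Platforms without dlopen() flags (e.g. Windows) have nothing to check.
        return False
    return bool(sys.getdlopenflags() & os.RTLD_GLOBAL)


shares_symbols = extensions_load_globally()
```

A check like this says nothing about libraries that an extension loads itself with RTLD_GLOBAL, so it could only be one part of any detection scheme.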

Regarding 2.2), AFAICT, this means that global data in the extension and its dependencies is shared between subinterpreters. Similarly, we could possibly mitigate this by detecting it and raising ImportError.
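
The "same pointer" behaviour can be illustrated within a single interpreter using `ctypes`: opening the same shared object twice yields the same underlying handle, so there is only one copy of its global data in the process. This is a POSIX-only sketch. `_handle` is an internal attribute of `ctypes.CDLL`, and using libc as the example library is an arbitrary choice.

```python
import ctypes
import ctypes.util

# Name of the C library, or None if it cannot be located; passing None makes
# ctypes open the main program instead, which serves the same purpose here.
libname = ctypes.util.find_library("c")

first = ctypes.CDLL(libname)
second = ctypes.CDLL(libname)

# Both objects wrap the handle returned by dlopen() for the same file.
same_handle = first._handle == second._handle
```

Two subinterpreters importing the same extension file would be in the same position as `first` and `second` here.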

If these, or any other aspects of 2), are still problematic, we could simply prevent loading extension modules in subinterpreters that have a custom environment.
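
As a minimal sketch of what such a restriction could look like at the Python level, a meta path finder can refuse any module whose spec resolves to an `ExtensionFileLoader`. The names `NoExtensionModules` and `block_extension_modules` are hypothetical, and an actual implementation would more likely live in the interpreter config than in `sys.meta_path`.

```python
import importlib.machinery
import sys


class NoExtensionModules:
    """Meta path finder that refuses modules backed by a shared object."""

    @staticmethod
    def find_spec(name, path=None, target=None):
        spec = importlib.machinery.PathFinder.find_spec(name, path, target)
        if spec is not None and isinstance(
            spec.loader, importlib.machinery.ExtensionFileLoader
        ):
            raise ImportError(
                f"extension module {name!r} is blocked in this environment",
                name=name,
            )
        return None  # defer to the regular finders for everything else


def block_extension_modules():
    """Install the finder ahead of the default ones (does nothing if present)."""
    if NoExtensionModules not in sys.meta_path:
        sys.meta_path.insert(0, NoExtensionModules)
```

Modules already present in `sys.modules` and modules compiled into the interpreter are not affected by a finder like this, so it would only cover extension files found on `sys.path`.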

gpshead commented 10 minutes ago

First reaction: I'm skeptical that we actually want this as stated. Subinterpreters having different environment configs than the main interpreter doesn't feel right. Would we want to support that explicitly as a feature for everyone to build on and depend on?

> a way to disable the site initialization, which could be a enable_site option in the interpreter config. This should disable the environment customizations, and result in a bare environment without anything extra sys.path.

This is a much more direct thing to ask for and could be implemented as a feature on its own without allowing arbitrary whole new environment configs. Gut feeling: whole new configs open a can of worms of potentially unintended consequences. I expect Eric and others with their heads in (sub)interpreter startup land to have a better feel for the reality of my gut check here.