ericsnowcurrently / multi-core-python

Enabling CPython multi-core parallelism via subinterpreters.
BSD 3-Clause "New" or "Revised" License
245 stars 6 forks source link

Disallow fork in a subinterpreter. #44

Closed ericsnowcurrently closed 4 years ago

ericsnowcurrently commented 6 years ago

~https://bugs.python.org/issue34651~ - fork() always disallowed in subinterpreters ~https://bugs.python.org/issue37951~ - small exception added for subprocess.Popen() if strict fork+exec (no preexec_fn arg provided) ~https://bugs.python.org/issue38778~ - add a note to the os.fork() docs https://bugs.python.org/issue38816 - add info to the C-API docs

ericsnowcurrently commented 5 years ago

It would probably make sense to make a PyInterpreterState flag that determines if fork is disallowed...

jakirkham commented 5 years ago

Am curious, why don't forking and subinterpreters play nice?

ericsnowcurrently commented 5 years ago

In general, fork is problematic for the CPython runtime. There are system resources (e.g. file descriptors) that have to be managed properly. When you throw subinterpreters into the mix, particularly if they no longer share the GIL, then there's even more complexity in managing those system resources.

Another reason is that the CPython runtime currently makes assumptions about which interpreter is the "main" one, which is the one responsible for a number of things like handling signals and initializing/finalizing the runtime. When the process forks CPython kills off all other threads and interpreters from the one in which fork() was called. If that happens in a subinterpreter then there has to be some way to promote it to become the "main" interpreter, which in an ideal world would tricky and in really would likely be too hard to be worth doing.

And then there are a couple other reasons I can't think of right now. :)

The simplest solution is to just disallow fork in subinterpreters.

jakirkham commented 5 years ago

Yeah that makes sense. Thanks for the context.

It seems useful to track these issues as users may want to mix these. Admittedly they are probably lower priority issues than anything else atm.

ericsnowcurrently commented 5 years ago

Those are pretty far outside the scope of this project, other than as context for this particular issue. However, it may be worth recording that info somewhere it won't get lost.

It would probably make sense to open a separate BPO issue about adding a long comment to the relevant code about the complexities of fork relative to subinterpreters. That might be a better place than the docs, which are more focused on users than on folks working on the runtime code. I could swear there is already something out there. I bet @gpshead would know. :)

gpshead commented 5 years ago

3.8+ have a note in the docs about subprocess preexec_fn being disallowed in subinterpreters as of https://github.com/python/cpython/commit/98d90f745d35d5d07bffcb46788b50e05eea56c6. But I don't recall us adding higher level fork + subinterpreters docs (i'd have to go read things and hunt for it, same as anyone else). It seems worthwhile to mention.

ericsnowcurrently commented 5 years ago

Thanks Greg. FYI, the change to disallow fork in subinterpreters was reverted due to problems with mod_wsgi. I haven't gotten back to sorting that out yet.

phmc commented 4 years ago

I've been looking at this today, resulting in a small docs change: https://bugs.python.org/issue38778

I think this closes the issue off:

One final thing that it would be good to have input on: I've not audited child_exec in any depth to check that there are no potential subinterpreter-pretending-to-be-main issues lurking. In particular are there any potential problems arising from signal handling between fork and exec (perhaps after _Py_RestoreSignals)?

phmc commented 4 years ago

Actually one final thing on your previous comment Eric:

It would probably make sense to make a PyInterpreterState flag that determines if fork is disallowed...

What's your current view on this? I'm not sure abstracting away "is fork allowed" has any functional benefit as I don't see any chance of the meaning changing (i.e. anything other than "am I the main interpreter?"), although it would neaten things up a bit.

ericsnowcurrently commented 4 years ago

Hmm, yep, there is nothing else to do. :) I've also opened https://bugs.python.org/issue38816 about adding more details to the C-API docs.