Open anilbey opened 3 months ago
cc @ilkilic @darshanmandge @stefanoantonel
@anilbey : could you check this https://github.com/neuronsimulator/nrn/issues/2094#issuecomment-1322249684 ? Just from the title, without looking into details, I wonder if it's multiprocessing module itself and not neuron.
Yes it is the same issue. Thanks for pointing to it. It is indeed the difference of the multiprocessing module between the platforms.
However, this subtle but critical implementation difference in the behavior is very difficult for the users of NEURON to spot. When not considered, the simulation results become corrupted. Can we not make a decision between the fork mode
/ spawn mode
at the NEURON level and hide this detail (and responsibility) from the Python users?
thanks for confirming, @anilbey!
As neuron is loaded into Python as a C/C++ library and has a global state, I don't have clarity into how such a thing could be ("easily") achieved on the NEURON side.
Tagging @ferdonline and @1uc if they have any thoughts/ideas.
I don't immediately see a quick fix. In a library, e.g. bluepyopt, you can use get_context
:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
(half way down the section).
Likely something close to:
import multiprocessing
# ...
ctx = multiprocessing.get_context("fork")
with ctx.Pool(5) as p:
p.map(f, range(5))
Thanks for the information.
And also the maxtasksperchild=1,
argument has to be set in the pool running NEURON to make sure one run does not cause side-effect on the others.
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
I guess there is no (easy) change to do on the neuron side. So I will close this ticket.
I'm reopening because I see two easy ways of addressing this:
Option 1: We could make sure our documentation points to the multiprocessing documentation section about this and just tell everyone to always call mp.set_start_method... and we could make sure all of our examples do so (currently: they do not).
Option 2: NEURON could call mp.set_start_method('spawn') on import.
I think option 2 is a bad idea because (1) it changes the behavior of existing code and (2) I don't think we should be in the business of changing default behaviors in other packages. That's likely to lead to nasty surprises for uses and confusion whenever someone looks at the documentation. (It would have to be "spawn"
instead of "fork"
because only spawn is supported on all platforms.)
Agree, Option 1 seems reasonable.
I don't see examples of multiprocessing
in our docs (I think). Where should this go?
Context
Overview of the issue
Neuron global parameters are passed to sub processes on Ubuntu. Neuron global parameters are not passed to sub processes on MacOS.
Expected result/behavior
The behavior should not differ between operating systems.
NEURON setup
NEURON 8.2.4
Minimal working example - MWE
MWE that can be used for reproducing the issue and testing. A couple of examples:
def f(_): print(f" inside the process: {neuron.h.celsius}") print(f" process id: {os.getpid()}")
if name == 'main': print("the main process") print(neuron.h.celsius) neuron.h.celsius = 37.0 print(neuron.h.celsius)
❯ python reproducer.py the main process 6.3 37.0 celsius is: 37.0 process id: 22858 celsius is: 37.0 process id: 22859 celsius is: 37.0 process id: 22860 celsius is: 37.0 process id: 22861 celsius is: 37.0 process id: 22862