neuronsimulator / nrn

NEURON Simulator
http://nrn.readthedocs.io
Other
375 stars 114 forks source link

Multiprocessing behavior differs across Ubuntu/MacOS #2842

Open anilbey opened 3 months ago

anilbey commented 3 months ago

Context

Overview of the issue

Neuron global parameters are passed to sub processes on Ubuntu. Neuron global parameters are not passed to sub processes on MacOS.

Expected result/behavior

The behavior should not differ between operating systems.

NEURON setup

NEURON 8.2.4

Minimal working example - MWE

MWE that can be used for reproducing the issue and testing. A couple of examples:

def f(_): print(f" inside the process: {neuron.h.celsius}") print(f" process id: {os.getpid()}")

if name == 'main': print("the main process") print(neuron.h.celsius) neuron.h.celsius = 37.0 print(neuron.h.celsius)

with Pool(5) as p:
    p.map(f, range(5))

Prints the following on Ubuntu

❯ python reproducer.py the main process 6.3 37.0 celsius is: 37.0 process id: 22858 celsius is: 37.0 process id: 22859 celsius is: 37.0 process id: 22860 celsius is: 37.0 process id: 22861 celsius is: 37.0 process id: 22862


Prints this on MacOS

python reproducer.py the main process 6.3 37.0 celsius is: 6.3 process id: 4712 celsius is: 6.3 process id: 4712 celsius is: 6.3 process id: 4712 celsius is: 6.3 process id: 4712 celsius is: 6.3 process id: 4712


The issue applies to all global parameters.
anilbey commented 3 months ago

cc @ilkilic @darshanmandge @stefanoantonel

pramodk commented 2 months ago

@anilbey : could you check this https://github.com/neuronsimulator/nrn/issues/2094#issuecomment-1322249684 ? Just from the title, without looking into details, I wonder if it's multiprocessing module itself and not neuron.

anilbey commented 2 months ago

Yes it is the same issue. Thanks for pointing to it. It is indeed the difference of the multiprocessing module between the platforms.

However, this subtle but critical implementation difference in the behavior is very difficult for the users of NEURON to spot. When not considered, the simulation results become corrupted. Can we not make a decision between the fork mode / spawn mode at the NEURON level and hide this detail (and responsibility) from the Python users?

pramodk commented 2 months ago

thanks for confirming, @anilbey!

As neuron is loaded into Python as a C/C++ library and has a global state, I don't have clarity into how such a thing could be ("easily") achieved on the NEURON side.

Tagging @ferdonline and @1uc if they have any thoughts/ideas.

1uc commented 2 months ago

I don't immediately see a quick fix. In a library, e.g. bluepyopt, you can use get_context: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods (half way down the section).

Likely something close to:

import multiprocessing

# ...

    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(5) as p:
        p.map(f, range(5))
anilbey commented 2 months ago

Thanks for the information.

And also the maxtasksperchild=1, argument has to be set in the pool running NEURON to make sure one run does not cause side-effect on the others. https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool

pramodk commented 1 month ago

I guess there is no (easy) change to do on the neuron side. So I will close this ticket.

ramcdougal commented 1 month ago

I'm reopening because I see two easy ways of addressing this:

Option 1: We could make sure our documentation points to the multiprocessing documentation section about this and just tell everyone to always call mp.set_start_method... and we could make sure all of our examples do so (currently: they do not).

Option 2: NEURON could call mp.set_start_method('spawn') on import.

I think option 2 is a bad idea because (1) it changes the behavior of existing code and (2) I don't think we should be in the business of changing default behaviors in other packages. That's likely to lead to nasty surprises for uses and confusion whenever someone looks at the documentation. (It would have to be "spawn" instead of "fork" because only spawn is supported on all platforms.)

pramodk commented 1 month ago

Agree, Option 1 seems reasonable.

I don't see examples of multiprocessing in our docs (I think). Where should this go?