oracle / graalpython

A Python 3 implementation built on GraalVM

multiprocessing.Process hangs #301

Closed oroppas closed 9 months ago

oroppas commented 1 year ago

The following code from #58:

from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    try:
        print('hello world', i)
    finally:
        l.release()

if __name__ == '__main__':
    lock = Lock()

    for num in range(10):
        Process(target=f, args=(lock, num)).start()

hangs indefinitely:

(graalpy) [ryuta@fedora tmp]$ graalpy ./mprocessing_test.py 
java.lang.IllegalStateException: Adding child context into a closing context.
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotContextImpl.addChildContext(PolyglotContextImpl.java:719)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.EngineAccessor$EngineImpl.createInternalContext(EngineAccessor.java:1042)
    at org.graalvm.truffle/com.oracle.truffle.api.TruffleContext$Builder.build(TruffleContext.java:1164)
    at com.oracle.graal.python.runtime.PythonContext$ChildContextThread.run(PythonContext.java:1161)
    at java.base@17.0.5/java.lang.Thread.run(Thread.java:833)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:100)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:134)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:125)
    at jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:709)
    at jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:632)
    at jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:565)
    at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.invokeCallBoundary(SubstrateOptimizedCallTarget.java:115)
    at com.oracle.svm.truffle.api.SubstrateOptimizedCallTargetInstalledCode.doInvoke(SubstrateOptimizedCallTargetInstalledCode.java:194)
    at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.doInvoke(SubstrateOptimizedCallTarget.java:97)
    at jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:477)
    at jdk.internal.vm.compiler/org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:458)
    at org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:96)
    at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:810)
    at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:211)
Caused by: Attached Guest Language Frames (1)

msimacek commented 9 months ago

Doesn't happen in the current version anymore.

Note that the child processes still crash: because you don't wait for the children, the parent exits (and thus closes the lock semaphore) before the children start up. That's a race condition in your program, and it could happen on CPython too. It is just less likely there because CPython has faster startup.
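
For anyone hitting the same race, here is a minimal sketch of the reproducer with the children joined before the parent exits. The `run_workers` helper is a hypothetical name introduced for illustration and is not part of the original report. The sketch assumes that keeping the parent alive until every child finishes is enough to keep the lock semaphore open, which is what the explanation above implies. It has not been verified against any particular GraalPy release.

```python
from multiprocessing import Process, Lock


def f(l, i):
    # Serialize access to stdout across the worker processes.
    l.acquire()
    try:
        print('hello world', i)
    finally:
        l.release()


def run_workers(count):
    # Hypothetical helper (not in the original report): start the workers,
    # then wait for all of them before returning.
    lock = Lock()
    procs = [Process(target=f, args=(lock, num)) for num in range(count)]
    for p in procs:
        p.start()
    # join() blocks until each child has exited, so the parent (and the lock
    # it owns) stays alive for the whole lifetime of every child.
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == '__main__':
    run_workers(10)
```

Starting all processes first and joining afterwards keeps the workers running concurrently. Calling `join()` right after each `start()` would also remove the race, but it would run the workers one at a time.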