enso-org / enso

Hybrid visual and textual functional programming.
https://ensoanalytics.com
Apache License 2.0

Parser crashing in native code due to multi-threaded access #11121

Closed hubertp closed 3 weeks ago

hubertp commented 1 month ago
C  [libenso_parser.dylib+0x2e81c]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h2955aec13fd254fd+0x90
C  [libenso_parser.dylib+0x23e98]  Java_org_enso_syntax2_Parser_getUuidHigh+0x2c
j  org.enso.syntax2.Parser.getUuidHigh(JJJ)J+0 org.enso.syntax
j  org.enso.syntax2.Message.getUuid(JJ)Ljava/util/UUID;+6 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+940 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+1902 org.enso.syntax

or

Stack: [0x0000000287a38000,0x0000000287c3b000],  sp=0x0000000287c38920,  free space=2050k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libenso_parser.dylib+0x2e81c]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h2955aec13fd254fd+0x90
C  [libenso_parser.dylib+0x23e98]  Java_org_enso_syntax2_Parser_getUuidHigh+0x2c
j  org.enso.syntax2.Parser.getUuidHigh(JJJ)J+0 org.enso.syntax
j  org.enso.syntax2.Message.getUuid(JJ)Ljava/util/UUID;+6 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+940 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+1902 org.enso.syntax
j  org.enso.syntax2.MultiSegmentAppSegment.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/MultiSegmentAppSegment;+17 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+3350 org.enso.syntax
j  org.enso.syntax2.Line.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Line;+17 org.enso.syntax

hs_err_pid90180.log hs_err_pid90473.log

hubertp commented 1 month ago

Encountered another one:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007d20e0463f1a, pid=849536, tid=849636
#
# JRE version: OpenJDK Runtime Environment GraalVM CE 21.0.2+13.1 (21.0.2+13) (build 21.0.2+13-jvmci-23.1-b30)
# Java VM: OpenJDK 64-Bit Server VM GraalVM CE 21.0.2+13.1 (21.0.2+13-jvmci-23.1-b30, mixed mode, sharing, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libenso_parser.so+0xaef1a]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h40e1af357c1cc3f3+0xaa
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/hubert/Documents/enso-projects/core.849536)
#
# An error report file with more information is saved as:
# /home/hubert/Documents/enso-projects/hs_err_pid849536.log
[10.884s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
#   https://github.com/oracle/graal/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

hs_err_pid849536.log

The project had worked without problems a couple of minutes earlier; it crashed after being re-opened.

kazcw commented 1 month ago

I have reproduced this; it is the same issue as #11104.

kazcw commented 1 month ago

Adding this debugging code: https://github.com/enso-org/enso/pull/11137/commits/c912a55182da62b2172d9378fc8f288e6ae8d20e

I found this result:

[ERROR] [2024-09-19T17:45:12.839] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.842] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 72abc08d-c464-4ce2-853e-9f96c00ca36b failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.850] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.857] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 54541b7d-f9fc-4ebb-81f0-907f3d1a634a failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.860] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.860] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression b9f7fc7d-aee2-4f31-be3a-7961dbba800c failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.878] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.878] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 54541b7d-f9fc-4ebb-81f0-907f3d1a634a failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.880] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.880] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 72abc08d-c464-4ce2-853e-9f96c00ca36b failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.882] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.882] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression d17307e7-8b24-4e4f-8f12-76eca7080241 failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.890] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.890] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression b9f7fc7d-aee2-4f31-be3a-7961dbba800c failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.892] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.892] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression d17307e7-8b24-4e4f-8f12-76eca7080241 failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression c64f449c-84b2-4f24-9e4f-4865bb1bd7bd failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression c64f449c-84b2-4f24-9e4f-4865bb1bd7bd failed: Race condition detected (evaluation result: None)

It seems the parser state is being concurrently modified by multiple threads. This is not supported. A parser can be moved between threads (with appropriate locks/fencing), or different threads can have their own parsers, but one parser instance must not be concurrently used by multiple threads.
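As an illustration of the two supported usage patterns described above, here is a minimal Java sketch. ToyParser, PerThreadParser, LockedParser and ParserSharingDemo are hypothetical names invented for this example; they are not part of the Enso code base, and the real Parser wraps native state rather than a single int field.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a parser with mutable, non-thread-safe state. */
final class ToyParser {
  private int lastLength;

  int parse(CharSequence input) {
    lastLength = input.length(); // mutation that must not race with other threads
    return lastLength;
  }
}

/** Option A: every thread lazily gets its own parser instance. */
final class PerThreadParser {
  private static final ThreadLocal<ToyParser> PARSER = ThreadLocal.withInitial(ToyParser::new);

  static int parse(CharSequence input) {
    return PARSER.get().parse(input);
  }
}

/** Option B: one shared instance, with all access serialized by a lock. */
final class LockedParser {
  private final ToyParser parser = new ToyParser();
  private final ReentrantLock lock = new ReentrantLock();

  int parse(CharSequence input) {
    lock.lock();
    try {
      return parser.parse(input);
    } finally {
      lock.unlock();
    }
  }
}

final class ParserSharingDemo {
  /** Parses every input with both options, possibly on several threads, and sums the results. */
  static int sumLengths(List<String> inputs, LockedParser shared) {
    return inputs.parallelStream()
        .mapToInt(s -> PerThreadParser.parse(s) + shared.parse(s))
        .sum();
  }

  public static void main(String[] args) {
    System.out.println(sumLengths(List.of("a", "bc", "def"), new LockedParser()));
  }
}
```

Option A avoids contention at the cost of one parser per thread; option B keeps a single instance but serializes all parsing.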

enso-bot[bot] commented 1 month ago

Keziah Wesley reports a new STANDUP for today (2024-09-19):

Progress: Investigated parser problems, traced to unsupported sharing of parser instance between threads. It should be finished by 2024-09-26.

Next Day: Next day I will be working on the #11121 task. Next task.

kazcw commented 1 month ago

After a review of parser usage in the backend, I've decided to simplify the parser API to make this kind of bug impossible.

JaroslavTulach commented 1 month ago

It is a very interesting result, @kazcw! So far I've been convinced that our execution is only single-threaded. We know we want to move towards a multi-threaded one, so fixing the parser to support multiple threads is desirable. But it is still surprising.

Adding this debugging code: c912a55

I'd be interested in knowing the stack traces of callers when the collision in the critical section happens. Possibly this small modification of your code could give us traces of the first two threads that collide.

diff --git lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
index 2c375ee840..12290a230b 100644
--- lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
+++ lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
@@ -5,8 +5,12 @@ import java.net.URISyntaxException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
 import java.nio.charset.StandardCharsets;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.function.Supplier;

 public final class Parser implements AutoCloseable {
+  private final AtomicInteger mutators = new AtomicInteger(0);
+
   private static void initializeLibraries() {
     try {
       System.loadLibrary("enso_parser");
@@ -116,22 +120,39 @@ public final class Parser implements AutoCloseable {
   }

   public ByteBuffer parseInputLazy(CharSequence input) {
-    byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
-    ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
-    inputBuf.put(inputBytes);
-    return parseTreeLazy(state, inputBuf);
+    return criticalSection(
+        () -> {
+          byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
+          ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
+          inputBuf.put(inputBytes);
+          return parseTreeLazy(state, inputBuf);
+        });
   }

   public Tree parse(CharSequence input) {
-    byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
-    ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
-    inputBuf.put(inputBytes);
-    var serializedTree = parseTree(state, inputBuf);
-    var base = getLastInputBase(state);
-    var metadata = getMetadata(state);
-    serializedTree.order(ByteOrder.LITTLE_ENDIAN);
-    var message = new Message(serializedTree, input, base, metadata);
-    return Tree.deserialize(message);
+    return criticalSection(
+        () -> {
+          byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
+          ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
+          inputBuf.put(inputBytes);
+          var serializedTree = parseTree(state, inputBuf);
+          var base = getLastInputBase(state);
+          var metadata = getMetadata(state);
+          serializedTree.order(ByteOrder.LITTLE_ENDIAN);
+          var message = new Message(serializedTree, input, base, metadata);
+          return Tree.deserialize(message);
+        });
+  }
+
+  private <R> R criticalSection(Supplier<R> action) {
+    if (mutators.getAndIncrement() != 0) {
+      throw new IllegalStateException("Race condition detected. On enter.");
+    }
+    var r = action.get();
+    if (mutators.getAndDecrement() != 1) {
+      throw new IllegalStateException("Race condition detected. On exit.");
+    }
+    return r;
   }

   public static String getWarningMessage(Warning warning) {

With this change there are going to be a lot of warnings after the first collision, as the counters will be off. Alternatively, call new IllegalStateException("...").printStackTrace() instead of throwing, to make sure the stack trace is visible and the counter gets back to 0 unless there is an error.
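A rough sketch of that printStackTrace() variant follows. It assumes the counter should always be restored, so the decrement is moved into a finally block; GuardedSection and activeMutators() are hypothetical names used only for this example and are not taken from the PR.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical wrapper showing a collision detector that reports instead of throwing. */
final class GuardedSection {
  private final AtomicInteger mutators = new AtomicInteger(0);

  <R> R criticalSection(Supplier<R> action) {
    if (mutators.getAndIncrement() != 0) {
      // Report the caller's stack trace, but keep going so the counter stays balanced.
      new IllegalStateException("Race condition detected. On enter.").printStackTrace();
    }
    try {
      return action.get();
    } finally {
      // Runs even if the action throws, so the counter returns to its previous value.
      if (mutators.getAndDecrement() != 1) {
        new IllegalStateException("Race condition detected. On exit.").printStackTrace();
      }
    }
  }

  /** Number of callers currently inside the critical section (for diagnostics only). */
  int activeMutators() {
    return mutators.get();
  }

  public static void main(String[] args) {
    GuardedSection guard = new GuardedSection();
    System.out.println(guard.criticalSection(() -> "parsed"));
    System.out.println(guard.activeMutators());
  }
}
```

This only detects overlapping calls; it does not prevent them, so it is a diagnostic aid rather than a fix.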

kazcw commented 1 month ago

@JaroslavTulach Added more detailed diagnostics based on your suggestion, and I found this thread conflict:

Thread 0: Thread[#74,job-pool-3,5,main]
org.enso.syntax/org.enso.syntax2.Parser.parse(Parser.java:173)
org.enso.runtime/org.enso.compiler.core.EnsoParser.parse(EnsoParser.java:38)
org.enso.runtime/org.enso.compiler.Compiler.uncachedParseModule(Compiler.scala:602)
org.enso.runtime/org.enso.compiler.Compiler.parseModule(Compiler.scala:568)
org.enso.runtime/org.enso.compiler.Compiler.$anonfun$runCompilerPipeline$1(Compiler.scala:247)
org.enso.runtime/org.enso.compiler.Compiler.$anonfun$runCompilerPipeline$1$adapted(Compiler.scala:245)
org.enso.runtime/scala.collection.immutable.List.foreach(List.scala:333)
org.enso.runtime/org.enso.compiler.Compiler.runCompilerPipeline(Compiler.scala:245)
org.enso.runtime/org.enso.compiler.Compiler.go$1(Compiler.scala:229)
org.enso.runtime/org.enso.compiler.Compiler.runInternal(Compiler.scala:236)
org.enso.runtime/org.enso.compiler.Compiler.run(Compiler.scala:127)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.compile(EnsureCompiledJob.scala:308)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$ensureCompiledModule$1(EnsureCompiledJob.scala:118)
org.enso.runtime/scala.Option.map(Option.scala:242)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.ensureCompiledModule(EnsureCompiledJob.scala:117)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$ensureCompiledFiles$2(EnsureCompiledJob.scala:88)
org.enso.runtime/scala.collection.immutable.List.map(List.scala:246)
org.enso.runtime/scala.collection.immutable.List.map(List.scala:79)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.ensureCompiledFiles(EnsureCompiledJob.scala:88)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$run$1(EnsureCompiledJob.scala:68)
org.enso.runtime/org.enso.interpreter.instrument.execution.ReentrantLocking.withWriteCompilationLock(ReentrantLocking.scala:93)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.run(EnsureCompiledJob.scala:64)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.run(EnsureCompiledJob.scala:49)
org.enso.runtime/org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$runInternal$1(JobExecutionEngine.scala:138)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:106)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:140)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:131)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:102)

Thread 1: Thread[#75,prioritized-job-pool-1,5,main]
org.enso.syntax/org.enso.syntax2.Parser.parse(Parser.java:173)
org.enso.runtime/org.enso.compiler.core.EnsoParser.parse(EnsoParser.java:38)
org.enso.runtime/org.enso.compiler.Compiler.runInline(Compiler.scala:688)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNode.parseExpression(EvalNode.java:80)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNodeGen.executeAndSpecialize(EvalNodeGen.java:148)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNodeGen.execute(EvalNodeGen.java:99)
org.enso.runtime/org.enso.interpreter.node.expression.builtin.debug.DebugEvalNode.execute(DebugEvalNode.java:28)
org.enso.runtime/org.enso.interpreter.node.expression.builtin.debug.DebugEvalMethodGen.execute(DebugEvalMethodGen.java:145)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:535)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:94)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNode.callDirect(ExecuteCallNode.java:94)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeAndSpecialize(ExecuteCallNodeGen.java:171)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeCall(ExecuteCallNodeGen.java:101)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode$RepeatedCallNode.executeRepeating(LoopingCallOptimiserNode.java:270)
org.graalvm.truffle/com.oracle.truffle.api.nodes.RepeatingNode.executeRepeatingWithValue(RepeatingNode.java:112)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedOSRLoopNode.profilingLoop(OptimizedOSRLoopNode.java:169)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedOSRLoopNode.execute(OptimizedOSRLoopNode.java:120)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode.dispatch(LoopingCallOptimiserNode.java:95)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode.cachedDispatch(LoopingCallOptimiserNode.java:69)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNodeGen.executeAndSpecialize(LoopingCallOptimiserNodeGen.java:153)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNodeGen.executeDispatch(LoopingCallOptimiserNodeGen.java:130)
org.enso.runtime/org.enso.interpreter.runtime.Module$InvokeMember.evalExpression(Module.java:662)
org.enso.runtime/org.enso.interpreter.runtime.Module$InvokeMember.doInvoke(Module.java:723)
org.enso.runtime/org.enso.interpreter.runtime.ModuleGen$InteropLibraryExports$Cached.executeAndSpecialize(ModuleGen.java:115)
org.enso.runtime/org.enso.interpreter.runtime.ModuleGen$InteropLibraryExports$Cached.invokeMember(ModuleGen.java:104)
org.graalvm.truffle/com.oracle.truffle.api.interop.InteropLibraryGen$CachedDispatch.invokeMember(InteropLibraryGen.java:8549)
org.enso.runtime/org.enso.interpreter.service.ExecutionService$InvokeMemberRootNode.execute(ExecutionService.java:608)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.enso.runtime/org.enso.interpreter.service.ExecutionService.evaluateExpression(ExecutionService.java:296)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationFunction$1(UpsertVisualizationJob.scala:368)
org.enso.runtime/scala.util.Try$.apply(Try.scala:210)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.evaluateVisualizationFunction(UpsertVisualizationJob.scala:364)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.evaluateModuleExpression(UpsertVisualizationJob.scala:445)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationExpression$2(UpsertVisualizationJob.scala:473)
org.enso.runtime/scala.util.Either.flatMap(Either.scala:352)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationExpression$1(UpsertVisualizationJob.scala:472)
org.enso.runtime/scala.util.Either.flatMap(Either.scala:352)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.org$enso$interpreter$instrument$job$UpsertVisualizationJob$$evaluateVisualizationExpression(UpsertVisualizationJob.scala:471)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.$anonfun$run$1(UpsertVisualizationJob.scala:70)
org.enso.runtime/org.enso.interpreter.instrument.execution.ReentrantLocking.withContextLock(ReentrantLocking.scala:217)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.run(UpsertVisualizationJob.scala:68)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.run(UpsertVisualizationJob.scala:42)
org.enso.runtime/org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$runInternal$1(JobExecutionEngine.scala:138)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:106)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:140)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:131)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:102)

kazcw commented 1 month ago

It seems visualization expression compilation and module compilation use the same Compiler instance from different threads concurrently; is Compiler and its owned data (besides Parser) designed for this, or could this cause other problems?

JaroslavTulach commented 1 month ago

Added more detailed diagnostics ... and I found this thread conflict:

Amazing! Thanks a lot, @kazcw. So, it is DebugEvalNode... I recently had conflicting expectations about it too in #11022 ... it looks like some of us tend to forget that it performs compilation, and that it does so in the middle of execution.

It seems visualization expression compilation and module compilation use the same Compiler instance from different threads concurrently; is Compiler and its owned data (besides Parser) designed for this, or could this cause other problems?

That's a very good question. So far the common expectation among @4e6, @Akirathan, @hubertp was that compilation is single-threaded. Apparently it is not. Yes, it can have consequences.

Compiler references FreshNameSupply

FreshNameSupply contains a var that is not ready for multi-threaded access. The possible damage is low in this case, however: all that's needed is to avoid duplicated newName results inside a single thread, and that is (according to my understanding of the Java memory model) guaranteed.
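Should uniqueness across threads ever be required, one possible approach is an atomic counter. The sketch below is a hypothetical Java analogue, not the actual FreshNameSupply; the class name AtomicNameSupply and the name format are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical thread-safe name supply; the "<internal-N>" format is illustrative only. */
final class AtomicNameSupply {
  private final AtomicLong counter = new AtomicLong(0);

  /** Returns a name that is unique across all threads sharing this supply. */
  String newName() {
    // getAndIncrement is atomic, so two threads never observe the same value.
    return "<internal-" + counter.getAndIncrement() + ">";
  }

  public static void main(String[] args) {
    AtomicNameSupply supply = new AtomicNameSupply();
    System.out.println(supply.newName());
    System.out.println(supply.newName());
  }
}
```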

Compiler references DefaultPackageRepository

I see attempts to make DefaultPackageRepository multi-threaded ready. Let's assume it is until we notice why it shouldn't be.

EnsoContext & ModuleScope

In the end, DebugEvalNode calls the IrToTruffle step, and that may meddle with the internals of runtime structures like EnsoContext and ModuleScope. There has been some effort to make ModuleScope more multi-threaded ready (#9914), as it was known to behave badly under multi-threaded access. There are no known problems, but the code remains too convoluted to review. ensureCompiledModule then interacts with ModuleScope & co. too.

Goal

The goal is to get parallel compilation and execution working. A task to execute visualizations in parallel is pending somewhere. We will need to make ModuleScope more robust to achieve that. Other parts of the Enso Compiler (except the parser, which is being fixed in #11147) seem to be at least somewhat designed for multi-threaded access. We should strive to get the multi-threaded usage working.