stevenybw opened 3 years ago
Can you please try with `export UCX_ERROR_SIGNALS=` on the driver and `--conf spark.executorEnv.UCX_ERROR_SIGNALS=` in the Spark conf?
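A minimal sketch of how these two settings might be combined in one submit command. The master URL, class name, and jar name are placeholders, not taken from the thread; an empty `UCX_ERROR_SIGNALS` disables UCX's own signal handlers so they do not interfere with the JVM's signal handling.

```shell
# Driver side: unset UCX error-signal handling (empty value on purpose).
export UCX_ERROR_SIGNALS=

# Executor side: propagate the same empty value via the executor environment.
# Master URL, class, and jar below are illustrative placeholders.
spark-submit \
  --master spark://master:7077 \
  --conf spark.executorEnv.UCX_ERROR_SIGNALS= \
  --class com.example.WordCount \
  wordcount.jar
```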
After adding `export UCX_ERROR_SIGNALS=`
and `export SPARK_UCX_HOME=$HOME/sparkucx/target`
together with the Spark configuration, the problem still exists. The JVM terminates with SIGSEGV with the following stack trace:
```
--------------- T H R E A D ---------------
Current thread (0x00007fc2fc02e000): GCTaskThread "GC Thread#27" [stack: 0x00007fc2cd2b9000,0x00007fc2cd3b9000] [id=261425]
Stack: [0x00007fc2cd2b9000,0x00007fc2cd3b9000], sp=0x00007fc2cd3b7b70, free space=1018k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x5dbe40] ClassLoaderDataGraph::roots_cld_do(CLDClosure*, CLDClosure*)+0x20
V [libjvm.so+0x7d2a15] G1RootProcessor::process_java_roots(G1RootClosures*, G1GCPhaseTimes*, unsigned int)+0x65
V [libjvm.so+0x7d319e] G1RootProcessor::evacuate_roots(G1ParScanThreadState*, unsigned int)+0x9e
V [libjvm.so+0x78285c] G1ParTask::work(unsigned int)+0xec
V [libjvm.so+0xea176d] GangWorker::loop()+0x4d
V [libjvm.so+0xe0acaa] Thread::call_run()+0x13a
V [libjvm.so+0xc5293e] thread_native_entry(Thread*)+0xee
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00000000000001d0
```
We tried constructing a smaller dataset (11 GB), shrinking the size about 4x; the result is correct and there is no SIGSEGV. However, every time we experiment with the larger dataset (44 GB), the problem is stably reproduced. This suggests a buffer overflow somewhere.
Can you please send the hs_err.log file with the whole stack trace?
Does it work with the same parameters and not using ucx? Does it work with `-XX:+UseParallelGC`?
Does it work with the same parameters and not using ucx?
Yes. I've verified that just now, without sparkucx it is fine.
Does it work with -XX:+UseParallelGC?
No. Adding `-XX:+UseParallelGC` to both the driver and the executor doesn't help; the problem still exists.
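For reference, a sketch of how the GC flag is typically passed to both sides; the exact submit command is not shown in the thread, and the class and jar names are placeholders.

```shell
# Switch both driver and executors from G1 to the parallel collector.
# Class and jar are illustrative placeholders.
spark-submit \
  --driver-java-options "-XX:+UseParallelGC" \
  --conf spark.executor.extraJavaOptions=-XX:+UseParallelGC \
  --class com.example.WordCount \
  wordcount.jar
```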
Does it happen at the beginning of the job, in the map phase, the reduce phase, or at the end? Do you see something in `dmesg`? SparkUCX is doing memory mapping of the file, so it requires a lot of virtual memory. Can you try with `--conf spark.shuffle.ucx.memory.useOdp=true`?
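A minimal sketch of the suggested on-demand paging (ODP) setting; ODP lets the RDMA device fault pages in on demand instead of pinning the whole mapping up front. The class and jar names are placeholders.

```shell
# Enable ODP for UCX memory registration in the SparkUCX shuffle manager.
# Class and jar are illustrative placeholders.
spark-submit \
  --conf spark.shuffle.ucx.memory.useOdp=true \
  --class com.example.WordCount \
  wordcount.jar
```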
Does it happen at the beginning of the job, at map phase or reduce phase or at the end?
Some tasks can complete; for example, 447 of 448 tasks in the map stage succeed but the last one fails. Both can happen: sometimes the map phase, sometimes the reduce phase.
Do you see something in dmesg?
Rarely, we observe the following on the driver side, triggered by SparkUCX:
```
[Tue May 11 22:16:34 2021] ib_umem_get: failed to get user pages, nr_pages=512
[Tue May 11 22:16:34 2021] mlx5_0:mr_umem_get:713:(pid 3268577): umem get failed (-512)
```
Can you try with `--conf spark.shuffle.ucx.memory.useOdp=true`?
The problem still exists.
We found an interesting phenomenon: when increasing the number of reducers from 224 to 448, the word count produces the correct result. Moreover, the 224-reducer configuration always ends with SIGSEGV, while the 448-reducer configuration always produces the correct result. Going from 224 to 448 partitions reduces the message size sent from each mapper to each reducer. We guess it is possible that a fixed-size buffer overflows somewhere. Hope this information is useful.
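One way to reproduce the two configurations without changing the application is to steer the shuffle partition count from the submit command; this is a sketch under the assumption that the word count uses the default RDD parallelism (the class and jar names are placeholders).

```shell
# 448 reduce partitions (the configuration reported to work); using 224 here
# reproduces the failing configuration. Class and jar are placeholders.
spark-submit \
  --conf spark.default.parallelism=448 \
  --class com.example.WordCount \
  wordcount.jar
```

Halving the partition count doubles the per-reducer message size, which is what makes this a useful probe for a fixed-size buffer.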
Configuration
```
./contrib/configure-release --with-java
```
Spark launch commandline
Scala application:
Phenomena
In the first stage, 447 of the total 448 tasks finished. After that, the Java runtime is terminated by SIGSEGV as follows:
With the hs_err_pid3253764.log: