eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

StackOverflowError in Renaissance Benchmark Suite's movie-lens sub-benchmark in Eclipse OpenJ9 0.21 Early Access VM on AArch64 #10370

Closed. aarongraham9 closed this issue 3 years ago.

aarongraham9 commented 4 years ago

Java -version output

Khadas' VIM3:

/opt/jdk11/openj9-0.21-jdk-11.0.8+10/bin/java -version
openjdk version "11.0.8-ea" 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8-ea+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.21.0, JRE 11 Linux aarch64-64-Bit Compressed References 20200715_419 (JIT enabled, AOT enabled)
OpenJ9   - 34cf4c075
OMR      - 113e54219
JCL      - 95bb504fbb based on jdk-11.0.8+10)

Pine64's Rock64:

/opt/jdk/openj9-0.21-jdk-11.0.8+10/bin/java -version
openjdk version "11.0.8-ea" 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8-ea+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.21.0, JRE 11 Linux aarch64-64-Bit Compressed References 20200715_419 (JIT enabled, AOT enabled)
OpenJ9   - 34cf4c075
OMR      - 113e54219
JCL      - 95bb504fbb based on jdk-11.0.8+10)

Summary of problem

I may have found a defect in the AArch64 JIT in the Eclipse OpenJ9 0.21 release. When running the Renaissance Benchmark Suite's (https://renaissance.dev/) movie-lens sub-benchmark (from apache-spark), I encounter a java.lang.StackOverflowError that causes the VM to lock up and never recover. I first hit this on the VIM3 AArch64 board and reproduced it on the Rock64 AArch64 board. I have also confirmed that it only occurs with the JIT enabled: interpreter-only runs (-Xint) complete successfully on both the VIM3 and the Rock64. Commands and output for the failing runs can be found below. Could this be related to Java reflection support in the AArch64 JIT?
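For reference, the recursion in the stack traces below comes from Java serialization itself: each element of a linked structure contributes a group of ObjectInputStream/ObjectOutputStream frames, so a long (or deeply nested) structure can exhaust the thread stack even when nothing is wrong with the data. A minimal sketch of the same pattern, using an invented Node class (not taken from the benchmark):

```java
import java.io.*;

// Invented Node class: default serialization follows the `next` field
// recursively, one group of ObjectInputStream/ObjectOutputStream frames
// per element, similar to the List$SerializationProxy frames in the traces.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
    Node next;
    Node(int value, Node next) { this.value = value; this.next = next; }
}

public class DeepSerialization {
    public static void main(String[] args) throws Exception {
        // Build a 500-element chain; make it long enough (tens of thousands
        // of elements on a typical default stack) and (de)serialization
        // fails with java.lang.StackOverflowError.
        Node head = null;
        for (int i = 0; i < 500; i++) {
            head = new Node(i, head);
        }

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(head); // recurses once per Node
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Node copy = (Node) in.readObject(); // recurses again
            System.out.println(copy.value); // 499: the last node pushed is the head
        }
    }
}
```

The fixed cost per element is what makes larger JIT frames matter: the same structure that round-trips on one platform can overflow the stack on another.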

Diagnostic files

Java command: /opt/jdk11/openj9-0.21-jdk-11.0.8+10/bin/java -Xms2G -Xmx2G -Xgcpolicy:optthruput -Xgcthreads2 -Xenableexcessivegc -Xgc:excessiveGCratio=95 -jar renaissance-gpl-0.11.0.jar movie-lens

VIM3 output:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/workspace/aarch64_omr_benchmark/Benchmarks/RenaissanceSuite/./tmp-jars-12125388843329439956/cp-hadoop-auth-2.2.0.jar9055683535524135096.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/08/11 00:37:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/08/11 00:37:58 INFO PlatformDependent: Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.

[Stage 1:>                                                          (0 + 2) / 2]
... snip ...
[Stage 2205:============================>                           (2 + 2) / 4]

20/08/11 00:40:52 ERROR Executor: Exception in task 1.0 in stage 2280.0 (TID 2488)
java.lang.StackOverflowError
    at java.base/java.io.ObjectInputStream.redirectedReadObject(ObjectInputStream.java:496)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
    at jdk.internal.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2422)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2563)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2455)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2563)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2455)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:566)
    at java.base/java.io.ObjectInputStream.redirectedReadObject(ObjectInputStream.java:496)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
    at jdk.internal.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
... snip ...
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2422)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2563)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2455)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2563)
    at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2455)
    at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2288)
    at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1790)
    at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:566)
    at java.base/java.io.ObjectInputStream.redirectedReadObject(ObjectInputStream.java:496)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:71)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:836)

Full benchmark output from the VIM3 can be found here: vim3.renaissance.apache-spark.movie-lens.java.lang.StackOverflowError.log.

Java command: /opt/jdk/openj9-0.21-jdk-11.0.8+10/bin/java -Xms2G -Xmx2G -Xgcpolicy:optthruput -Xgcthreads2 -Xenableexcessivegc -Xgc:excessiveGCratio=95 -jar renaissance-gpl-0.11.0.jar movie-lens

Rock64 output:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/casaadmin/aarch64_omr_benchmark/Benchmarks/RenaissanceSuite/./tmp-jars-9863479237288750788/cp-hadoop-auth-2.2.0.jar1576351611884303847.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/08/10 16:37:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/08/10 16:37:28 INFO PlatformDependent: Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.

[Stage 1:>                                                          (0 + 2) / 2]
 ... snip ...
[Stage 202:>                                                        (0 + 0) / 4]
[Stage 204:>                                                        (0 + 0) / 4]Exception in thread "dag-scheduler-event-loop" java.lang.StackOverflowError
    at scala.collection.immutable.List$SerializationProxy.writeObject(List.scala:468)
    at jdk.internal.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1145)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1497)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
    at scala.collection.immutable.List$SerializationProxy.writeObject(List.scala:468)
    at jdk.internal.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
... snip ...
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.base/java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1145)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1497)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
    at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
    at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
    at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
    at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1007)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:933)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$submitWaitingStages$6.apply(DAGScheduler.scala:772)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$submitWaitingStages$6.apply(DAGScheduler.scala:771)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.scheduler.DAGScheduler.submitWaitingStages(DAGScheduler.scala:771)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1318)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1656)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Full benchmark output from the Rock64 can be found here: rock64.renaissance.apache-spark.movie-lens.java.lang.StackOverflowError.log.

aarongraham9 commented 4 years ago

FYI @0xdaryl and @knn-k.

aarongraham9 commented 4 years ago

I've confirmed that this doesn't seem to be an issue on an x86_64 machine I have access to.

/opt/jdk/jdk-11.0.8.10_openj9_0.20.0/bin/java --version:

openjdk 11.0.8 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.21.0, JRE 11 Linux amd64-64-Bit Compressed References 20200715_697 (JIT enabled, AOT enabled)
OpenJ9   - 34cf4c075
OMR      - 113e54219
JCL      - 95bb504fbb based on jdk-11.0.8+10)

Runs successfully with the Interpreter only (-Xint) and with the JIT.

aarongraham9 commented 4 years ago

For what it's worth (as this appears to be a stack size issue), it doesn't seem to be a matter of heap size. I've done additional JIT runs on the VIM3 and the Rock64, increasing -Xms and -Xmx from 2G to 3G and then 4G. They still fail in the same way.

aarongraham9 commented 4 years ago

I was able to send the process a SIGABRT signal to trigger the VM to generate some dump files. I'll link the dump files from the VIM3 on my UNB OneDrive here:

aarongraham9 commented 4 years ago

Any thoughts on what I can do next to dig into this potential defect?

knn-k commented 4 years ago

I reproduced the StackOverflowError with jdk-11.0.8+10/bin/java -jar renaissance-gpl-0.11.0.jar movie-lens on my local device, with both the large heap build and the compressed refs build.

The April release (v0.20.0) below seems to work fine:

$ jdk-11.0.7+10/bin/java -version
openjdk version "11.0.7-ea" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.7-ea+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.20.0, JRE 11 Linux aarch64-64-Bit Compressed References 20200416_286 (JIT enabled, AOT enabled)
OpenJ9   - 05fa2d361
OMR      - d4365f371
JCL      - 838028fc9d based on jdk-11.0.7+10)

knn-k commented 4 years ago

Running the benchmark with -Xjit:exclude={java/io/Object*.*} seems to be OK.

0xdaryl commented 4 years ago

This may not be a bug at all: we may simply be exhausting the available Java thread stack due to the presence of larger JIT frames and deep recursion (both of which seem to be in play here).

What does running with -verbose:sizes tell you on both the AArch64 board and the x86 machine for the Java thread stack sizes?

If you increase the Java thread stack memory limit via the -Xss option (e.g., -Xss2M or other values), will that make the benchmark pass?

If you can't get it to pass with increased -Xss settings, that might indicate a real problem. Even if you can get it to pass, that doesn't mean there isn't room for improvement in the sizes of the JIT frames. We'll have to study the methods on the stack here to understand what we're dealing with. The value you end up setting the thread stack size to will give us a clue as to whether this is something to look into further.
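One way to see this interplay empirically, sketched with an invented StackDepthProbe class: the Thread constructor that takes an explicit stackSize (a platform-dependent hint the VM may round up or ignore) lets you compare how many frames of a trivial method fit into different stack budgets without relaunching the VM. Larger per-frame footprints show up directly as a smaller depth for the same budget.

```java
// Invented probe: count how many frames of a trivial recursive method fit
// before StackOverflowError for a given stack-size hint. Larger JIT frames
// mean a smaller count for the same budget.
public class StackDepthProbe {
    static int depth;

    static void recurse() {
        depth++;
        recurse();
    }

    static int probe(long stackSizeBytes) throws InterruptedException {
        depth = 0;
        // The stackSize argument is a hint; some platforms ignore it.
        Thread t = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // unwound; `depth` holds the deepest frame reached
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        t.join();
        return depth;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("256K: " + probe(256 * 1024));
        System.out.println("4M:   " + probe(4 * 1024 * 1024));
    }
}
```

Running a probe like this on the AArch64 boards and the x86_64 machine would hint at whether the same budget admits noticeably fewer frames on AArch64.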

I'll also point out that escape analysis (EA) was enabled recently on AArch64, which can lead to increased stack memory usage. There may be a problem with the way AArch64 maps local objects, but there is no evidence of that yet. You can try running with -Xjit:disableEscapeAnalysis to observe the effect of running without it.
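For context on why escape analysis enlarges frames: when the JIT proves an allocation never escapes its method, it can keep the object in the stack frame (or scalarize it into registers) instead of the heap. A minimal, hypothetical example of such a candidate (class and method names are invented for illustration):

```java
// Hypothetical example of an escape-analysis candidate; the names are
// invented for illustration, not taken from the benchmark.
public class EscapeCandidate {
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            // `pair` never escapes sum(), so an EA-enabled JIT may keep it
            // in the stack frame (or split it into registers) instead of
            // allocating it on the heap. Faster and GC-free, but it adds
            // to the method's stack footprint.
            long[] pair = new long[] { i, i * 2L };
            total += pair[0] + pair[1];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(10)); // 3 * (0 + 1 + ... + 9) = 135
    }
}
```

Disabling EA trades that frame growth for heap allocations, which is why -Xjit:disableEscapeAnalysis is worth ruling out here.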

aarongraham9 commented 4 years ago

Ok, thanks! I'll check those options and let you know what comes back for output and results. :smile:

aarongraham9 commented 4 years ago

They all seem to be using a 1M maximum Java thread stack (-Xss1M). I'll try increasing it and see if that helps.

VIM3: /opt/jdk11/openj9-0.21-jdk-11.0.8+10/bin/java -verbose:sizes:

  -Xmca32K        RAM class segment increment
  -Xmco128K       ROM class segment increment
  -Xmcrs200M      compressed references metadata initial size
  -Xmns2M         initial new space size
  -Xmnx240256K    maximum new space size
  -Xms8M          initial memory size
  -Xmos6M         initial old space size
  -Xmox959104K    maximum old space size
  -Xmx961152K     memory maximum
  -Xmr16K         remembered set size
  -Xlp:objectheap:pagesize=4K    large page size
                  available large page sizes:
                  4K
  -Xlp:codecache:pagesize=4K     large page size for JIT code cache
                  available large page sizes for JIT code cache:
                  4K
  -Xmso256K       operating system thread stack size
  -Xiss2K         java thread stack initial size
  -Xssi16K        java thread stack increment
  -Xss1M          java thread stack maximum size
  -XX:SharedCacheHardLimit=64M shared class cache size
  -Xscmx67108256  shared class cache soft max size
  -Xscdmx5212K    reserved shared class cache space for class debug attributes
  -Xscminaot0K    min reserved shared class cache space for AOT
  -Xscmaxaot59968K max allowed shared class cache space for AOT
  -Xscminjitdata0K min reserved shared class cache space for JIT data
  -Xscmaxjitdata59968K max allowed shared class cache space for JIT data

Rock64: /opt/jdk/openj9-0.21-jdk-11.0.8+10/bin/java -verbose:sizes:

  -Xmca32K        RAM class segment increment                                                                         
  -Xmco128K       ROM class segment increment                                                                         
  -Xmcrs200M      compressed references metadata initial size                                                         
  -Xmns2M         initial new space size                                                                              
  -Xmnx254336K    maximum new space size                                                                              
  -Xms8M          initial memory size                                                                                 
  -Xmos6M         initial old space size                                                                              
  -Xmox1015424K   maximum old space size                                                                              
  -Xmx1017472K    memory maximum                                                                                      
  -Xmr16K         remembered set size                                                                                 
  -Xlp:objectheap:pagesize=4K    large page size                                                                      
                  available large page sizes:                                                                         
                  4K                                                                                                  
  -Xlp:codecache:pagesize=4K     large page size for JIT code cache                                                   
                  available large page sizes for JIT code cache:                                                      
                  4K                                                                                                  
  -Xmso256K       operating system thread stack size                                                                  
  -Xiss2K         java thread stack initial size                                                                      
  -Xssi16K        java thread stack increment                                                                         
  -Xss1M          java thread stack maximum size                                                                      
  -XX:SharedCacheHardLimit=64M shared class cache size                                                                
  -Xscmx64M       shared class cache soft max size                                                                    
  -Xscdmx5212K    reserved shared class cache space for class debug attributes                                        
  -Xscminaot0K    min reserved shared class cache space for AOT                                                       
  -Xscmaxaot59968K max allowed shared class cache space for AOT                                                       
  -Xscminjitdata0K min reserved shared class cache space for JIT data                                                 
  -Xscmaxjitdata59968K max allowed shared class cache space for JIT data

x86_64: /opt/jdk/jdk-11.0.8.10_openj9_0.20.0/bin/java -verbose:sizes:

  -Xmca32K        RAM class segment increment
  -Xmco128K       ROM class segment increment
  -Xmcrs200M      compressed references metadata initial size
  -Xmns2M         initial new space size
  -Xmnx2046080K   maximum new space size
  -Xms8M          initial memory size
  -Xmos6M         initial old space size
  -Xmox8182720K   maximum old space size
  -Xmx8184768K    memory maximum
  -Xmr16K         remembered set size
  -Xlp:objectheap:pagesize=4K    large page size
                  available large page sizes:
                  4K
  -Xlp:codecache:pagesize=4K     large page size for JIT code cache
                  available large page sizes for JIT code cache:
                  4K
  -Xmso256K       operating system thread stack size
  -Xiss2K         java thread stack initial size
  -Xssi16K        java thread stack increment
  -Xss1M          java thread stack maximum size
  -XX:SharedCacheHardLimit=300M shared class cache size
  -Xscmx64M       shared class cache soft max size
  -Xscdmx24544K   reserved shared class cache space for class debug attributes
  -Xscminaot0K    min reserved shared class cache space for AOT
  -Xscmaxaot282300K max allowed shared class cache space for AOT
  -Xscminjitdata0K min reserved shared class cache space for JIT data
  -Xscmaxjitdata282300K max allowed shared class cache space for JIT data

aarongraham9 commented 4 years ago

You were right! I increased -Xss to 2M and the runs on both the VIM3 and Rock64 completed successfully. I'll also do a run with escape analysis turned off and -Xss kept at 1M and see what happens.

aarongraham9 commented 4 years ago

With escape analysis off (-Xjit:disableEscapeAnalysis) and -Xss kept at 1M, it fails with a StackOverflowError as before on both the VIM3 and Rock64.

aarongraham9 commented 4 years ago

As I expected, with escape analysis off (-Xjit:disableEscapeAnalysis) and -Xss set to 2M, it also runs fine on the VIM3 and Rock64 boards.

knn-k commented 4 years ago

Do the results above mean the AArch64 JIT consumes more Java stack space than the x86 JIT does? If yes, is #5910 (locals compaction) related?

0xdaryl commented 4 years ago

That may help certain methods with large frames (which may very well be the case in a benchmark); you can disable it on x86 (-Xjit:disableCompactLocals) and see if that has any effect.

AArch64 has nearly twice as many GPRs and FPRs as x86 and this may manifest itself as more spilling required for preserved and volatile registers across calls.

If you want to do a side-by-side comparison to get to the bottom of this (probably a reasonable exercise, given that we haven't studied the dynamic footprint of methods on AArch64 yet), I think you'd have to dump the frame size for each method on AArch64 and x86, look for glaring differences, and drill deeper into those methods via logs. I don't think there is a way of dumping the frame size directly, but it would be a useful statistic to add to the verbose log output. Perhaps this investigation is something @aarongraham9 could do.

DanHeidinga commented 4 years ago

Does AArch64 have DDR support yet? You should be able to get a lot of this information from a !stack <0xJ9VMThread> or a !stackslots <0xJ9VMThread>, since they walk the stack and pretty-print it. It may need some calculation or, if worst comes to worst, some hacking on the DDR stackwalk extension.

aarongraham9 commented 4 years ago

Yeah, I'm happy to investigate doing this. Any pointers on where to begin looking to add the frame size info to the verbose log?

I haven't used DDR in about 6 years now (in closed J9), can you point me to some good "getting started" documentation for DDR with OpenJ9?

DanHeidinga commented 4 years ago

I haven't used DDR in about 6 years now (in closed J9), can you point me to some good "getting started" documentation for DDR with OpenJ9?

Sadly, I'm not sure it exists. If you run the test with -Xdump:java+system:events=throw+systhrow,filter=java/lang/StackOverflowError, you'll get both a system core and a javacore.

You can load the system core into DDR using jdmpview -core <corefileName>. I usually run !findvm first to ensure the core is valid; if it doesn't find a VM in the core, the rest of this won't work.

Then run !threads to list all the threads; you should be able to identify the thread throwing the StackOverflowError from the thread names. If not, you can check the javacore for the current thread: it should show the stack trace and give you the J9VMThread address.

Then run !stackslots <0xJ9VMThread> and you should have the info needed to calculate the stack sizes.

aarongraham9 commented 4 years ago

Ok, thanks! I'll give it a go.

0xdaryl commented 4 years ago

Any pointers on where to begin looking to add the frame size info to the verbose log?

https://github.com/eclipse/openj9/blob/7f749e1cfd2317e01c8ddeccf128368975bc79d3/runtime/compiler/control/CompilationThread.cpp#L10682

knn-k commented 4 years ago

OpenJ9 0.22.0 for Windows runs the movie-lens benchmark with -Xjit:disableCompactLocals and with the default Java stack size (1M).

aarongraham9 commented 4 years ago

OpenJ9 0.21 for Linux on x86_64 with -Xjit:disableCompactLocals specified causes a StackOverflowError with the default Java stack size (1M).

aarongraham9 commented 3 years ago

Can no longer reproduce in newer OpenJ9 Early Access releases on AArch64. Closing.