Closed aarongraham9 closed 3 years ago
FYI @0xdaryl and @knn-k.
I've confirmed that this doesn't seem to be an issue on an x86_64 machine I have access to.
/opt/jdk/jdk-11.0.8.10_openj9_0.20.0/bin/java --version:
openjdk 11.0.8 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.21.0, JRE 11 Linux amd64-64-Bit Compressed References 20200715_697 (JIT enabled, AOT enabled)
OpenJ9 - 34cf4c075
OMR - 113e54219
JCL - 95bb504fbb based on jdk-11.0.8+10)
It runs successfully with the interpreter only (-Xint) and with the JIT.
For what it is worth (as this appears to be a stack size issue), it doesn't seem to be a matter of heap size. I've done additional JIT runs on the VIM3 and the Rock64, increasing -Xms and -Xmx from 2G to 3G and then 4G. They still fail the same way.
I was able to send the process a SIGABRT signal to trigger the VM to generate some dump files. I'll link the dump files from the VIM3 on my UNB OneDrive here:
Any thoughts on what I can do next to dig into this potential defect?
I reproduced the StackOverflowError with jdk-11.0.8+10/bin/java -jar renaissance-gpl-0.11.0.jar movie-lens on my local device, with both the large heap build and the compressed refs build.
The April release (v0.20.0) below seems to work fine:
$ jdk-11.0.7+10/bin/java -version
openjdk version "11.0.7-ea" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.7-ea+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.20.0, JRE 11 Linux aarch64-64-Bit Compressed References 20200416_286 (JIT enabled, AOT enabled)
OpenJ9 - 05fa2d361
OMR - d4365f371
JCL - 838028fc9d based on jdk-11.0.7+10)
Running the benchmark with -Xjit:exclude={java/io/Object*.*} seems to be OK.
This may not necessarily be a bug. We may simply be exhausting the available Java thread stack memory due to larger JIT frames and deep recursion (both of which seem to be in play here).
What does running with -verbose:sizes tell you on both the AArch64 board and the x86 machine for the Java thread stack sizes?
If you increase the Java thread stack memory limit via the -Xss option (e.g., -Xss2M or other values), will that make the benchmark pass?
If you can't get it to pass with increased -Xss settings, that might indicate a real problem. And even if you can get it to pass, that doesn't mean there isn't room for improvement in the sizes of the JIT frames. We'll have to study the methods on the stack here to understand what we're dealing with. The value that you end up setting the thread stack memory to will give us a clue as to whether this is something to look into further.
I'll also point out that escape analysis (EA) was enabled recently on AArch64, which can lead to increased stack memory usage. There may be a problem with the way AArch64 maps local objects, but there is no evidence of that yet. You can try running with -Xjit:disableEscapeAnalysis to observe the effect of running without it.
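To make the relationship between the stack limit and recursion depth concrete, here is a minimal sketch that recurses on a thread created with a requested stack size and reports how deep it got before a StackOverflowError. The class name StackDepthProbe is hypothetical, and the stackSize argument of the Thread constructor is only a hint to the VM; how it relates to the -Xss limit on a given OpenJ9 build is an assumption to verify, not something confirmed in this thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StackDepthProbe {
    // Recurse until the stack is exhausted, counting frames on the way down.
    private static void recurse(AtomicInteger depth) {
        depth.incrementAndGet();
        recurse(depth);
    }

    /** Returns the recursion depth reached on a thread created with the requested stack size. */
    static int maxDepth(long stackBytes) throws InterruptedException {
        AtomicInteger depth = new AtomicInteger();
        Runnable body = () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError expected) {
                // Expected: the thread's stack limit was reached.
            }
        };
        // The last argument is a stack size hint; the VM is free to round or ignore it.
        Thread probe = new Thread(null, body, "stack-probe", stackBytes);
        probe.start();
        probe.join();
        return depth.get();
    }

    public static void main(String[] args) throws InterruptedException {
        for (long kb : new long[] {512, 1024, 2048}) {
            int depth = maxDepth(kb * 1024);
            // Rough average only: frames may switch from interpreted to JIT-compiled mid-run.
            System.out.printf("requested=%dK depth=%d approx bytes/frame=%d%n",
                    kb, depth, kb * 1024 / depth);
        }
    }
}
```

Running this with -Xint and then with the JIT on the same machine gives a rough feel for how much larger compiled frames are than interpreted ones, although the numbers are averages and vary from run to run.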
Ok, thanks! I'll check those options and let you know what comes back for output and results. :smile:
They all seem to be using 1M stack memory. I'll try increasing it and see if that helps.
VIM3: /opt/jdk11/openj9-0.21-jdk-11.0.8+10/bin/java -verbose:sizes:
-Xmca32K RAM class segment increment
-Xmco128K ROM class segment increment
-Xmcrs200M compressed references metadata initial size
-Xmns2M initial new space size
-Xmnx240256K maximum new space size
-Xms8M initial memory size
-Xmos6M initial old space size
-Xmox959104K maximum old space size
-Xmx961152K memory maximum
-Xmr16K remembered set size
-Xlp:objectheap:pagesize=4K large page size
available large page sizes:
4K
-Xlp:codecache:pagesize=4K large page size for JIT code cache
available large page sizes for JIT code cache:
4K
-Xmso256K operating system thread stack size
-Xiss2K java thread stack initial size
-Xssi16K java thread stack increment
-Xss1M java thread stack maximum size
-XX:SharedCacheHardLimit=64M shared class cache size
-Xscmx67108256 shared class cache soft max size
-Xscdmx5212K reserved shared class cache space for class debug attributes
-Xscminaot0K min reserved shared class cache space for AOT
-Xscmaxaot59968K max allowed shared class cache space for AOT
-Xscminjitdata0K min reserved shared class cache space for JIT data
-Xscmaxjitdata59968K max allowed shared class cache space for JIT data
Rock64: /opt/jdk/openj9-0.21-jdk-11.0.8+10/bin/java -verbose:sizes:
-Xmca32K RAM class segment increment
-Xmco128K ROM class segment increment
-Xmcrs200M compressed references metadata initial size
-Xmns2M initial new space size
-Xmnx254336K maximum new space size
-Xms8M initial memory size
-Xmos6M initial old space size
-Xmox1015424K maximum old space size
-Xmx1017472K memory maximum
-Xmr16K remembered set size
-Xlp:objectheap:pagesize=4K large page size
available large page sizes:
4K
-Xlp:codecache:pagesize=4K large page size for JIT code cache
available large page sizes for JIT code cache:
4K
-Xmso256K operating system thread stack size
-Xiss2K java thread stack initial size
-Xssi16K java thread stack increment
-Xss1M java thread stack maximum size
-XX:SharedCacheHardLimit=64M shared class cache size
-Xscmx64M shared class cache soft max size
-Xscdmx5212K reserved shared class cache space for class debug attributes
-Xscminaot0K min reserved shared class cache space for AOT
-Xscmaxaot59968K max allowed shared class cache space for AOT
-Xscminjitdata0K min reserved shared class cache space for JIT data
-Xscmaxjitdata59968K max allowed shared class cache space for JIT data
x86_64: /opt/jdk/jdk-11.0.8.10_openj9_0.20.0/bin/java -verbose:sizes:
-Xmca32K RAM class segment increment
-Xmco128K ROM class segment increment
-Xmcrs200M compressed references metadata initial size
-Xmns2M initial new space size
-Xmnx2046080K maximum new space size
-Xms8M initial memory size
-Xmos6M initial old space size
-Xmox8182720K maximum old space size
-Xmx8184768K memory maximum
-Xmr16K remembered set size
-Xlp:objectheap:pagesize=4K large page size
available large page sizes:
4K
-Xlp:codecache:pagesize=4K large page size for JIT code cache
available large page sizes for JIT code cache:
4K
-Xmso256K operating system thread stack size
-Xiss2K java thread stack initial size
-Xssi16K java thread stack increment
-Xss1M java thread stack maximum size
-XX:SharedCacheHardLimit=300M shared class cache size
-Xscmx64M shared class cache soft max size
-Xscdmx24544K reserved shared class cache space for class debug attributes
-Xscminaot0K min reserved shared class cache space for AOT
-Xscmaxaot282300K max allowed shared class cache space for AOT
-Xscminjitdata0K min reserved shared class cache space for JIT data
-Xscmaxjitdata282300K max allowed shared class cache space for JIT data
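For comparing many machines, the stack-related settings can be pulled out of -verbose:sizes listings like the three above with a few lines of code. This is a minimal sketch that assumes the line format shown in this thread (option name, number, optional K/M/G suffix, then a description); the class name VerboseSizes is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VerboseSizes {
    // Matches lines such as "-Xss1M java thread stack maximum size".
    // Longer option names are listed before "ss" so "-Xssi16K" is not read as "-Xss".
    private static final Pattern LINE =
            Pattern.compile("^(-X(?:mso|iss|ssi|ss))(\\d+)([KMG]?)\\b.*$");

    /** Returns the thread stack options found in the output, with values converted to bytes. */
    static Map<String, Long> stackSettings(String output) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : output.split("\\R")) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // not one of the four stack options
            }
            long value = Long.parseLong(m.group(2));
            switch (m.group(3)) {
                case "K": value <<= 10; break;
                case "M": value <<= 20; break;
                case "G": value <<= 30; break;
                default: break; // no suffix: value is already in bytes
            }
            result.put(m.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "-Xmso256K operating system thread stack size",
                "-Xiss2K java thread stack initial size",
                "-Xssi16K java thread stack increment",
                "-Xss1M java thread stack maximum size",
                "-Xscmx64M shared class cache soft max size");
        System.out.println(stackSettings(sample));
    }
}
```

Applied to the listings above, all three machines report the same four values, which is consistent with the observation that the default Java thread stack limit is 1M everywhere.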
You were right! I increased -Xss to 2M and the runs on both the VIM3 and the Rock64 completed successfully. I'll also do a run with escape analysis turned off and -Xss kept at 1M and see what happens.
With escape analysis off (-Xjit:disableEscapeAnalysis) and -Xss kept at 1M, it throws a StackOverflowError as before on both the VIM3 and the Rock64.
As I expected, with escape analysis off (-Xjit:disableEscapeAnalysis) and -Xss set to 2M, it also runs fine on the VIM3 and Rock64 boards.
Do the results above mean the AArch64 JIT consumes more Java stack space than the x86 JIT does? If yes, is #5910 (locals compaction) related?
That may help certain methods with large frames (which may very well be the case in a benchmark), and you can disable it on x86 and see if that has any effect (-Xjit:disableCompactLocals).
AArch64 has nearly twice as many GPRs and FPRs as x86 and this may manifest itself as more spilling required for preserved and volatile registers across calls.
If you want to do a side-by-side comparison to get to the bottom of this (probably a reasonable exercise given that we haven't studied dynamic footprint of methods on AArch64 yet) I think you'd have to dump the frame size for each method on AArch64 and x86, look for glaring differences, and drill deeper into those methods (via logs). I don't think there is a way of dumping the frame size directly, but I think it is a useful statistic to add to the verbose log output. Perhaps this investigation is something @aarongraham9 could do.
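Once per-method frame sizes are available from both platforms, the "look for glaring differences" step could be sketched as follows. Everything here is hypothetical: the class name FrameSizeDiff, the input maps (method signature to frame size in bytes), and the numbers in main are made up for illustration, since no such dump exists in the verbose log yet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FrameSizeDiff {
    /** A method whose AArch64 frame is larger than its x86 frame by at least the chosen ratio. */
    static final class Outlier {
        final String method;
        final int aarch64Bytes;
        final int x86Bytes;

        Outlier(String method, int aarch64Bytes, int x86Bytes) {
            this.method = method;
            this.aarch64Bytes = aarch64Bytes;
            this.x86Bytes = x86Bytes;
        }

        double ratio() {
            return (double) aarch64Bytes / x86Bytes;
        }
    }

    /** Returns methods present in both maps whose size ratio is at least minRatio, largest first. */
    static List<Outlier> outliers(Map<String, Integer> aarch64,
                                  Map<String, Integer> x86,
                                  double minRatio) {
        List<Outlier> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : aarch64.entrySet()) {
            Integer other = x86.get(e.getKey());
            if (other == null || other <= 0) {
                continue; // no usable x86 data for this method
            }
            Outlier o = new Outlier(e.getKey(), e.getValue(), other);
            if (o.ratio() >= minRatio) {
                result.add(o);
            }
        }
        result.sort((a, b) -> Double.compare(b.ratio(), a.ratio()));
        return result;
    }

    public static void main(String[] args) {
        // Made-up frame sizes, for illustration only.
        Map<String, Integer> aarch64 = new LinkedHashMap<>();
        aarch64.put("example/Reader.readObject()Ljava/lang/Object;", 480);
        aarch64.put("example/Util.hash(I)I", 96);
        Map<String, Integer> x86 = new LinkedHashMap<>();
        x86.put("example/Reader.readObject()Ljava/lang/Object;", 160);
        x86.put("example/Util.hash(I)I", 80);

        for (Outlier o : outliers(aarch64, x86, 2.0)) {
            System.out.printf("%s: %d vs %d bytes (x%.1f)%n",
                    o.method, o.aarch64Bytes, o.x86Bytes, o.ratio());
        }
    }
}
```

Methods that show up in such a list would be the candidates for a closer look through JIT logs.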
Does AArch64 have DDR support yet? You should be able to get a lot of this information from a !stack <0xJ9Thread> or a !stackslots <0xJ9Thread>, as it walks the stack and pretty-prints it. It may need some calculation or, if worst comes to worst, some hacking on the DDR stackwalk extension.
Yeah, I'm happy to investigate doing this. Any pointers on where to begin looking to add the frame size info to the verbose log?
I haven't used DDR in about 6 years now (in closed J9), can you point me to some good "getting started" documentation for DDR with OpenJ9?
Sadly, I'm not sure it exists. If you run the test with -Xdump:java+system:events=throw+systhrow,filter=java/lang/StackOverflowError you'll get both a system core and a javacore.
You can load the system core into DDR using jdmpview -core <corefileName>. I usually run !findvm first to ensure the core is valid. If it doesn't find a VM in the core, then the rest of this won't work.
Then run !threads to list all the threads; you should be able to identify the thread throwing the StackOverflowError from the thread names. If not, you can check the javacore for the current thread; it should show the stack trace and give you the J9VMThread address.
Then run !stackslots <0xJ9VMThread> and you should have the information needed to calculate the stack sizes.
Ok, thanks! I'll give it a go.
Any pointers on where to begin looking to add the frame size info to the verbose log?
OpenJ9 0.22.0 for Windows runs the movie-lens benchmark with -Xjit:disableCompactLocals and with the default Java stack size (1M).
OpenJ9 0.21 for Linux on x86_64 with -Xjit:disableCompactLocals specified causes a StackOverflowError with the default Java stack size (1M).
Can no longer reproduce in newer OpenJ9 Early Access releases on AArch64. Closing.
Java -version output
Khadas' VIM3:
Pine64's Rock64:
Summary of problem
I may have found a defect in the JIT for AArch64 in the Eclipse OpenJ9 0.21 release. When running the Renaissance Benchmark Suite's (https://renaissance.dev/) movie-lens sub-benchmark (from apache-spark), I encounter a java.lang.StackOverflowError that causes the VM to lock up and never recover. I first encountered this issue on the VIM3 AArch64 board and then reproduced it on the Rock64 AArch64 board. I have also confirmed that this only occurs with the JIT and does not occur when running only the interpreter (-Xint). The interpreter runs on the VIM3 and Rock64 complete successfully. Commands and output for the failing runs can be found below. Could this have something to do with Java reflection support in the AArch64 JIT?
Diagnostic files
Java command:
/opt/jdk11/openj9-0.21-jdk-11.0.8+10/bin/java -Xms2G -Xmx2G -Xgcpolicy:optthruput -Xgcthreads2 -Xenableexcessivegc -Xgc:excessiveGCratio=95 -jar renaissance-gpl-0.11.0.jar movie-lens
VIM3 output:
Full benchmark output from the VIM3 can be found here: vim3.renaissance.apache-spark.movie-lens.java.lang.StackOverflowError.log.
Java command:
/opt/jdk/openj9-0.21-jdk-11.0.8+10/bin/java -Xms2G -Xmx2G -Xgcpolicy:optthruput -Xgcthreads2 -Xenableexcessivegc -Xgc:excessiveGCratio=95 -jar renaissance-gpl-0.11.0.jar movie-lens
Rock64 output:
Full benchmark output from the Rock64 can be found here: rock64.renaissance.apache-spark.movie-lens.java.lang.StackOverflowError.log.