eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

How to locate a failure using a javacore file #17812

Closed zheng-kai closed 1 year ago

zheng-kai commented 1 year ago

This is not an issue specific to OpenJ9. I am seeking advice on how to locate the cause of a defect using the information in javacore files, so that I can avoid submitting duplicate issues. For instance, the two javacore files below contain valuable information about threads and stack traces, but I have been unable to determine why they were generated.

javacore.20230716.083304.24027.0002.txt
javacore.20230716.122035.10544.0002.txt

Could you kindly advise me on how to compare the contents of the javacore files to determine whether they have the same cause?

JasonFengJ9 commented 1 year ago
1TISIGINFO     Dump Event "gpf" (00002000) received

This suggests there was a segmentation error. What was the console output, and is a system core file available?
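As a first step toward comparing two javacore files, the dump event on the 1TISIGINFO line quoted above can be extracted programmatically. The following is only a minimal sketch: the function name `dump_event` is hypothetical, and the regular expression is modelled solely on the single line shown above, so other javacore variants may need a looser pattern.

```python
import re

# Pattern modelled on the 1TISIGINFO line quoted in this thread
# (an assumption: other javacore versions may format it differently).
_DUMP_EVENT = re.compile(r'^1TISIGINFO\s+Dump Event "([^"]+)" \(([0-9A-Fa-f]+)\)')


def dump_event(javacore_text):
    """Return (event_name, event_code) from the 1TISIGINFO line, or None."""
    for line in javacore_text.splitlines():
        match = _DUMP_EVENT.match(line)
        if match:
            # The code in parentheses is hexadecimal, e.g. 00002000.
            return match.group(1), int(match.group(2), 16)
    return None


if __name__ == "__main__":
    sample = '1TISIGINFO     Dump Event "gpf" (00002000) received'
    print(dump_event(sample))
```

Two javacore files that report different dump events are unlikely to share a cause, but the same event alone does not prove it; the VM state and the current thread's stack still need to be compared.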

This is not an issue specific to OpenJ9. Could you kindly advise me on how to compare the contents of the javacore files to ascertain whether they are caused by the same reason?

NULL           ------------------------------------------------------------------------
0SECTION       ENVINFO subcomponent dump routine
NULL           =================================
1CIJAVAVERSION JRE 17 Linux amd64-64 (build 17-internal+0-adhoc..openj9-openjdk-jdk17-0.28.0)
1CIVMVERSION   20230715_000000
1CIJ9VMTAG     openj9-0.28.0-m2
1CIJ9VMVERSION 12ac773fe
1CIJITVERSION  j9jit_20230715_1946_
1CIOMRVERSION  377314cb0_CMPRSS
1CIJCLVERSION  41171df43e2 based on jdk-17+35
1CIVENDOR      Eclipse OpenJ9

openj9-0.28.0-m2 is a bit outdated; a recent release is available at https://github.com/ibmruntimes/semeru17-binaries/releases/tag/jdk-17.0.7%2B7_openj9-0.38.0

2CIENVVAR      JAVA_HOME=/home/zhengkai/jdk/jdk1.8.0_361
2CIENVVAR      CLASSPATH=.:/home/zhengkai/jdk/jdk1.8.0_361/lib/dt.jar:/home/zhengkai/jdk/jdk1.8.0_361/lib/tools.jar
2CIENVVAR      PWD=/home/zhengkai/fuzzer
2CIENVVAR      HOME=/root
2CIENVVAR      JavaLib=/home/zhengkai/jdk/jdk1.8.0_361/lib/*:/home/zhengkai/jdk/jdk1.8.0_361/jre/lib/*

Why is jdk1.8.0_361 used? Different JVMs can't be mixed.

zheng-kai commented 1 year ago

I am currently using an older version of OpenJ9 to experiment with some testing approaches. My objective is to locate the source code position where a segmentation fault occurs by analyzing log information. This will help me gather statistics on the distribution of segmentation faults across different modules, such as the JIT and GC. In the hs_err file of the HotSpot JVM, I can find location information similar to Internal Error (/home/zhengkai/jdk/debug/jdk17u-jdk-17-2/src/hotspot/share/utilities/growableArray.hpp:150). However, the javacore file does not seem to provide such direct information. Is it possible to identify the module responsible for the crash by examining the Current Thread information in the javacore file?

What's more, I noticed that the file javacore.20230714.150911.73888.0002.txt from https://github.com/eclipse-openj9/openj9/issues/17793 contains location information: ** ASSERTION FAILED ** at c:\workspace\openjdk-build\workspace\build\src\omr\port\common\omrmemtag.c:145: ((memoryCorruptionDetected)) Could you please confirm whether this functionality was introduced in a specific version? If it is a feature, I would like to know the earliest version that supports it, because I plan to experiment with an older version of the JVM. Thank you.

JasonFengJ9 commented 1 year ago

Is it possible to identify the module responsible for the crash by examining the Current Thread information in the javacore file?

OpenJ9 uses VMSTATE to indicate the module involved in a catastrophic failure: https://github.com/eclipse-openj9/openj9/blob/5ad708e02e19fee424d24ef67702c2f5babdc2a5/runtime/oti/j9nonbuilder.h#L5405-L5415 The state code can be found in the javacore file on the 1XHFLAGS VM flags: line. In addition, the JVM has a built-in utility to look up a specific code, for example:

java -Xjit:vmstate=0x00050cff -version
vmState [0x50cff]: {J9VMSTATE_JIT} {localDeadStoreElimination}
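To bucket many javacore files by module without invoking the JVM for each one, the VM state can be read from the 1XHFLAGS line and split into its major and minor parts. This is a rough sketch under stated assumptions: the helper names are hypothetical, the exact layout of the 1XHFLAGS line (hex digits directly after "VM flags:") is assumed, and only the J9VMSTATE_JIT entry is confirmed by the output above. The other table entries are recalled from j9nonbuilder.h and should be checked against the header linked above for the OpenJ9 version in use.

```python
import re

# Assumed masks: the upper 16 bits select the module, the lower 16 bits
# carry module-specific detail (e.g. the JIT optimization in progress).
J9VMSTATE_MAJOR = 0xFFFF0000
J9VMSTATE_MINOR = 0x0000FFFF

MAJOR_NAMES = {
    0x50000: "J9VMSTATE_JIT",          # confirmed by the -Xjit:vmstate output above
    0x10000: "J9VMSTATE_INTERPRETER",  # assumption, verify against j9nonbuilder.h
    0x20000: "J9VMSTATE_GC",           # assumption, verify against j9nonbuilder.h
    0x40000: "J9VMSTATE_JNI",          # assumption, verify against j9nonbuilder.h
}

# Assumed line shape: "1XHFLAGS       VM flags:0000000000050CFF"
_VM_FLAGS = re.compile(r"^1XHFLAGS\s+VM flags:\s*([0-9A-Fa-f]+)")


def vm_state(javacore_text):
    """Return the VM state as an int, or None if no 1XHFLAGS line is found."""
    for line in javacore_text.splitlines():
        match = _VM_FLAGS.match(line)
        if match:
            return int(match.group(1), 16)
    return None


def describe(state):
    """Return (major_name, minor_code) for a VM state value."""
    major = state & J9VMSTATE_MAJOR
    minor = state & J9VMSTATE_MINOR
    return MAJOR_NAMES.get(major, "unknown(0x%x)" % major), minor


if __name__ == "__main__":
    state = vm_state("1XHFLAGS       VM flags:0000000000050CFF")
    print(describe(state))
```

Two javacore files with the same dump event and the same VM state are plausible duplicates; for JIT states, `java -Xjit:vmstate=...` as shown above remains the authoritative way to decode the minor part.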

What's more, I noticed that there is a location information in this javacore file. * ASSERTION FAILED at c:\workspace\openjdk-build\workspace\build\src\omr\port\common\omrmemtag.c:145: ((memoryCorruptionDetected)) javacore.20230714.150911.73888.0002.txt from this https://github.com/eclipse-openj9/openj9/issues/17793 Could you please confirm if this functionality was introduced in a specific version? If it is a feature, I would like to know the earliest supported version. Because I plan to experiment with an older version of the JVM.

This is related to JEP 358, implemented since Java 14, but the ASSERTION failure (memoryCorruptionDetected) only appears after fixing a memory leak via https://github.com/eclipse-openj9/openj9/pull/15550. A fix is in progress.

zheng-kai commented 1 year ago

Thank you very much!