Closed JasonFengJ9 closed 2 weeks ago
This doesn't seem to be a new failure; it has probably been happening for quite some time, perhaps forever, but the history doesn't go back that far. @keithc-ca FYI, if you want to investigate.
The fix for ArrayIndexOutOfBoundsException may expose the next problem.
trace_history.pl>> 2024 Mon Jun 17 08:05:25 Comparing current thread trace in javacore.txt and snap.trc.fmt
trace_history.pl>> 2024 Mon Jun 17 08:05:25 Trace point lines did not match:j9mm.668 vs base
trace_history.pl>> 2024 Mon Jun 17 08:05:25 Javacore: 15:05:11:096712379 GMT j9mm.668 - >MSSSS::allocate type Object size 40360 subspace this 197146201bfef1d4/
trace_history.pl>> 2024 Mon Jun 17 08:05:25 Snap trace: ,x??p? base 0x1971462000000000 prev 0x0 shouldCollectOnFailure 426740000
trace_history.pl>> 2024 Mon Jun 17 08:05:25 TestJavaCoreAndSnap: failed
The test looks for 3XEHSTTYPE, which does not always exist in the javacore. In an example that fails with ArrayIndexOutOfBoundsException, the javacore contains only the header:
1XECTHTYPE Current thread history (J9VMThread:0x3D236100)
1XMCURTHDINFO Current thread
3XMTHREADINFO "main" J9VMThread:0x3D236100, omrthread_t:0x19C6E310, java/lang/Thread:0x3C722870, state:R, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x1, isDaemon:false)
3XMJAVALTHRCCL sun/misc/Launcher$AppClassLoader(0x3C701588)
3XMTHREADINFO1 (native thread ID:0x2, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00041020)
3XMCPUTIME CPU usage total: 0.165000000 secs, current category="Application"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=380096 (0x5CCC0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at com/ibm/jvm/Dump.JavaDumpImpl(Native Method)
4XESTACKTRACE at com/ibm/jvm/Dump.JavaDump(Dump.java:123)
4XESTACKTRACE at com/ibm/trace/tests/apptrace/GenerateJavaCoreAndSnap.main(GenerateJavaCoreAndSnap.java:14)
3XMTHREADINFO3 No native callstack available on this platform
trace_history.pl>> 2024 Mon Jun 17 08:05:25 Trace point lines did not match:j9mm.668 vs base
This occurred before https://github.com/eclipse-openj9/openj9/pull/19712 was merged; see https://github.com/eclipse-openj9/openj9/issues/19703#issue-2351873394
It doesn't matter. The "fix" will get rid of ArrayIndexOutOfBoundsException but not all the failures.
Here is the javacore content that matches https://github.com/eclipse-openj9/openj9/issues/19703#issuecomment-2173691862.
Seems like an EBCDIC conversion problem.
Snap: 15:05:11.096328626 0x3ceb6100 j9mm.668 Entry >MSSSS::allocate type Object size 40360 subspace this 0x197146201bfef1d4/� ,x�??�p? base 0x1971462000000000 prev 0x0 shouldCollectOnFailure 426740000
javacore: 3XEHSTTYPE 15:05:11:096712379 GMT j9mm.668 - >MSSSS::allocate type Object size 40360 subspace this 197146201bfef1d4/ ,x?? base 1971462000000000 prev 0 shouldCollectOnFailure 426740000
Also the format is different: in the formatted snap trace you get 0x0, and in the javacore you get 0.
I think the tracepoint is misused or ill-defined
TraceEntry=Trc_MM_MSSSS_allocate_entry Overhead=1 Level=1 Group=allocate Template="MSSSS::allocate type %s size %zu subspace this %llx/%s base %llx prev %llx shouldCollectOnFailure %zu"
The third argument is expected to be %llx, but is only provided a (32-bit) pointer (where it fails):
Trc_MM_MSSSS_allocate_entry(env->getLanguageVMThread(), "Object", allocDescription->getBytesRequested(), this, getName(), baseSubSpace, previousSubSpace, (uintptr_t)shouldCollectOnFailure);
The template should have used %zx instead of %llx, but rather than change the template, we can fix the use.
On the other hand, changing the template would correct the interpretation of existing trace data.
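To illustrate why the width mismatch garbles everything after `this`, here is a minimal sketch. It is a hypothetical model, not the OMR trace engine: it assumes that on a 32-bit build each pointer argument occupies one 4-byte slot, and that the formatter consumes slots according to the template. `ArgCursor`, `take_hex_arg` and the slot values are made up for illustration; the values were picked to resemble the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of an argument list on a 32-bit build:
 * every pointer-sized argument occupies exactly one 4-byte slot.
 */
typedef struct {
    const uint32_t *slots; /* arguments as passed by the caller */
    size_t count;          /* number of slots available */
    size_t next;           /* index of the next unread slot */
} ArgCursor;

/*
 * Consume widthBytes (4 for %zx on 32-bit, 8 for %llx) worth of slots
 * and return them as a single value. The first slot read becomes the
 * high word, as it would on a big-endian platform.
 */
uint64_t take_hex_arg(ArgCursor *cursor, size_t widthBytes)
{
    size_t needed = widthBytes / sizeof(uint32_t);
    uint64_t value = 0;

    assert((4 == widthBytes) || (8 == widthBytes));
    assert(cursor->next + needed <= cursor->count);

    for (size_t i = 0; i < needed; i++) {
        value = (value << 32) | cursor->slots[cursor->next];
        cursor->next += 1;
    }
    return value;
}
```

With illustrative slots `{ 0x19714620, 0x1bfef1d4, ... }` (a `this` pointer followed by the next argument), reading 8 bytes yields `197146201bfef1d4` and leaves the cursor one slot too far, so the following `%s` would be read from the wrong argument. Reading 4 bytes yields `19714620` and keeps the remaining arguments aligned with the template.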
Correct, it should be %zx (not only there, but also in a few other similar places). I can (and prefer to) fix format.
I can (and prefer to) fix format.
Should we interpret that to mean that you will create pull requests to fix them?
Passed in recent builds.
@amicic do you want to add https://github.com/eclipse/omr/pull/7381 to the 0.46 release?
Failure link
From an internal build (inec015):
Optional info
Failure output (captured from console output)
100x internal grinder - 20/100 failed
Other failures
A similar failure [Windows IA32] 80 JVM_Functionality.RAS Trace Non-Smoke.Mode110.1