Open dylanjtuttle opened 11 months ago
Annabelle @a7ehuo, may I ask you to investigate the assertion failure in OMR::Symbol::castToAutoSymbol? I think the subsequent assertion failures in Tree.cpp and J9MonitorTable.cpp can be investigated as separate issues.
So far I have not been able to reproduce the assert locally on Power. I built with PROD_ASSUMES enabled. The test passes even though JCL_TEST_Test-Annotation-Package shows TestException [1]. There is no information or dump from TR_ASSERT.
Both jitdmp logs, from JCL_Test_none_SCC_0 and JCL_Test_none_SCC_1, contain a TREE VERIFICATION ERROR for the su2i node. The jitdmp logs only contain the IL trees at the time of the crash, so there is no history showing where the node was created or modified. The trees look like they are from post-GRA.
[1]
===============================================
Running test JCL_Test_none_SCC_1 ...
===============================================
...
...
===============================================
JCL_TEST_Test-Annotation-Package
Tests run: 2, Failures: 0, Skips: 0
===============================================
org.openj9.test.java.lang.
Exception in thread "no-op thread" org.openj9.test.java.lang.Test_ThreadGroup$1UncaughtException
at org.openj9.test.java.lang.Test_ThreadGroup$13.uncaughtException(Test_ThreadGroup.java:972) from jdk.internal.loader.ClassLoaders$AppClassLoader@c3523170(file:/root/home/ahuo/src/jvmtest/functional/Java8andUp/GeneralTest.jar)
at java.base/java.lang.Thread.uncaughtException(Thread.java:1363) from jrt:/java.base
org.openj9.test.java.lang.Test_Throwable$TestException: test
PASSED: dummyReflectHelper
PASSED: testDefaultMethodInheritance
...
...
TEST TARGETS SUMMARY
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PASSED test targets:
JCL_Test_none_SCC_1
TOTAL: 1 EXECUTED: 1 PASSED: 1 FAILED: 0 DISABLED: 0 SKIPPED: 0
ALL TESTS PASSED
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[2]
TREE VERIFICATION ERROR -- node [ 0x3fff53c6c8f0] ref count is 9 and should be 8
n12982n iRegStore gr30 [ 0x3fff53c6c990] bci=[155,64,702] rc=0 vc=3484 vn=- li=-2 udi=126 nc=1
n12980n su2i (X>=0 ) [ 0x3fff53c6c8f0] bci=[155,61,702] rc=9 vc=3484 vn=- li=-7 udi=- nc=1 flg=0x100
...
n38744n PassThrough gr30 [ 0x3fff028e3e30] bci=[155,61,702] rc=1 vc=3484 vn=- li=-1 udi=- nc=1
n12980n ==>su2i
...
n13192n ificmplt --> block_2882 BBStart at n35729n () [ 0x3fff53d00b40] bci=[156,3,8178] rc=0 vc=3484 vn=- li=-2 udi=- nc=3 flg=0x20
n12980n ==>su2i
...
n38746n PassThrough gr30 [ 0x3fff028e3ed0] bci=[155,61,702] rc=1 vc=3484 vn=- li=-1 udi=- nc=1
n12980n ==>su2i
...
n32992n call unknown[#384 helper Method] [flags 0x400 0x0 ] [ 0x3fff01a63840] bci=[156,6,8178] rc=1 vc=3484 vn=- li=-1 udi=- nc=2
n12980n ==>su2i
...
n50750n treetop [ 0x3fff0390e6f0] bci=[155,61,702] rc=0 vc=0 vn=- li=- udi=- nc=1
n12980n ==>su2i
n13197n ificmpge --> block_1296 BBStart at n15584n () [ 0x3fff53d00cd0] bci=[156,9,8178] rc=0 vc=3484 vn=- li=-2 udi=- nc=3 flg=0x20
n12980n ==>su2i
...
n38750n istore <auto slot 10>[#1467 Auto] [flags 0x3 0x0 ] [ 0x3fff028e4010] bci=[155,61,702] rc=0 vc=3484 vn=- li=-2 udi=- nc=1
n12980n ==>su2i
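For readers unfamiliar with what "ref count is 9 and should be 8" means, here is a minimal standalone sketch of a reference-count check over IL trees. It is only an illustration under simplifying assumptions: the Node and Mismatch types and the verifyRefCounts function are hypothetical and are not the OMR verifier. Each node stores a recorded count, and the check recounts the parent-to-child edges that actually exist, descending into a node only the first time it is seen because later occurrences are commoned references (the "==>" lines in the dump).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified IL node; not the TR::Node class.
struct Node {
    std::string name;
    int refCount;                  // count recorded on the node
    std::vector<Node *> children;
};

struct Mismatch {
    const Node *node;
    int recorded;                  // what the node claims
    int actual;                    // what the trees actually contain
};

// Count parent->child edges. A node's children are walked only on the first
// visit, since later occurrences are commoned references to the same node.
static void countRefs(Node *n, std::unordered_set<Node *> &seen,
                      std::unordered_map<Node *, int> &actual) {
    if (!seen.insert(n).second) return;
    for (Node *c : n->children) {
        ++actual[c];
        countRefs(c, seen, actual);
    }
}

// Return every node whose recorded count differs from the recounted one.
std::vector<Mismatch> verifyRefCounts(const std::vector<Node *> &treeRoots) {
    std::unordered_set<Node *> seen;
    std::unordered_map<Node *, int> actual;
    for (Node *root : treeRoots) countRefs(root, seen, actual);

    std::vector<Mismatch> out;
    for (Node *n : seen) {
        auto it = actual.find(n);
        int found = (it == actual.end()) ? 0 : it->second;
        if (n->refCount != found) out.push_back({n, n->refCount, found});
    }
    return out;
}
```

In this model, a su2i node that records one more reference than the trees contain would be reported with recorded greater than actual, which has the same shape as the error above.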
After disabling the other two asserts, I was able to get a full jitdump log for the first assert. (There are also other errors from the IL trees, but for now I'm focusing on this first assert.)
Assertion failed at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRSymbol_inlines.hpp:53: self()->isAuto()
VMState: 0x000563ff
OMR::Symbol::castToAutoSymbo, symbol is not an automatic symbol: this 0x735af83f52e0
compiling java/lang/StringUTF16.toLowerCase(Ljava/lang/String;[BLjava/util/Locale;)Ljava/lang/String; at level: very-hot (profiling)
The backtrace [1] shows it happened in createSymRefForNode, triggered by OMR::Block::splitPostGRA during jProfilingValue. In this particular case, valueChild is aRegLoad (n37706n) and valueChildSymRef is a parm symbol instead of auto [2].
if (valueChildSymRef != NULL &&
((valueChild->getOpCode().isLoadVarDirect() && valueChildSymRef->getSymbol()->isAuto()) ||
(valueChild->getOpCode().isLoadReg() && valueChildSymRef->getSymbol()->castToAutoSymbol()->isInternalPointer())))
@r30shah I wonder if the intended condition here should have always checked valueChildSymRef->getSymbol()->isAuto(), as below:
if (valueChildSymRef != NULL &&
valueChildSymRef->getSymbol()->isAuto() &&
(valueChild->getOpCode().isLoadVarDirect() ||
(valueChild->getOpCode().isLoadReg() && valueChildSymRef->getSymbol()->castToAutoSymbol()->isInternalPointer())))
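To make the difference between the two conditions concrete, here is a minimal standalone sketch. The Sym and Load types, the castToAuto helper, and the trips function are hypothetical stand-ins, not the OMR classes, and the simulated assertion is an exception so that it can be observed. It only models the shape of the two predicates: in the original, the register-load arm casts without first checking isAuto(), while in the proposed version isAuto() guards both arms.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a compiler symbol; not OMR::Symbol.
struct Sym {
    enum Kind { Auto, Parm };
    Kind kind;
    bool internalPointer;

    bool isAuto() const { return kind == Auto; }

    // Mimics a checked downcast: throws where the real code would assert.
    const Sym *castToAuto() const {
        if (!isAuto()) throw std::logic_error("symbol is not an automatic symbol");
        return this;
    }
    bool isInternalPointer() const { return internalPointer; }
};

// Hypothetical stand-in for the value child and its symbol reference.
struct Load {
    bool loadVarDirect;
    bool loadReg;
    const Sym *sym;   // may be null
};

// Shape of the original condition: the register-load arm casts unguarded.
bool originalCheck(const Load &v) {
    return v.sym != nullptr &&
           ((v.loadVarDirect && v.sym->isAuto()) ||
            (v.loadReg && v.sym->castToAuto()->isInternalPointer()));
}

// Shape of the proposed condition: isAuto() is checked before either arm.
bool proposedCheck(const Load &v) {
    return v.sym != nullptr &&
           v.sym->isAuto() &&
           (v.loadVarDirect ||
            (v.loadReg && v.sym->castToAuto()->isInternalPointer()));
}

// Reports whether evaluating a check trips the simulated assertion.
template <typename F>
bool trips(F check, const Load &v) {
    try { check(v); return false; }
    catch (const std::logic_error &) { return true; }
}
```

In this model, a register load of a parm symbol trips the simulated assertion under originalCheck and simply evaluates to false under proposedCheck. For auto symbols the two predicates agree.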
[1]
#13 <signal handler called>
#14 0x0000735b2276d168 in raise () from /lib/powerpc64le-linux-gnu/libc.so.6
#15 0x0000735b20e39250 in TR::trap () at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/infra/Assert.cpp:67
#16 0x0000735b20e3940c in TR::va_fatal_assertion (ap=0x735b00bc2ac0 "\340R?\370Zs",
format=0x735b21415c98 "OMR::Symbol::castToAutoSymbo, symbol is not an automatic symbol: this %p",
condition=0x735b21415ce8 "self()->isAuto()", line=53,
file=0x735b214156f0 "/root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRSymbol_inlines.hpp")
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/infra/Assert.cpp:139
#17 TR::assertion (file=0x735b214156f0 "/root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRSymbol_inlines.hpp",
line=<optimized out>, condition=0x735b21415ce8 "self()->isAuto()",
format=0x735b21415c98 "OMR::Symbol::castToAutoSymbo, symbol is not an automatic symbol: this %p")
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/infra/Assert.cpp:156
#18 0x0000735b20defc90 in OMR::Symbol::castToAutoSymbol (this=0x735af83f52e0)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRSymbol_inlines.hpp:53
#19 createSymRefForNode (comp=comp@entry=0x735af83f0000, methodSymbol=methodSymbol@entry=0x735af83f4ef0,
value=value@entry=0x735af905fe60, insertBefore=0x735aa3841950)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRBlock.cpp:1183
#20 0x0000735b20dfd040 in OMR::Block::splitPostGRA (this=<optimized out>, startOfNewBlock=<optimized out>, cfg=<optimized out>,
copyExceptionSuccessors=<optimized out>, methodSymbol=<optimized out>)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRBlock.hpp:272
#21 0x0000735b20b6b0c4 in TR_JProfilingValue::addProfilingTrees (comp=0x735af83f0000, insertionPoint=0x735aa2dedf60,
value=0x735af905c710, table=0x735afb31f060, addNullCheck=<optimized out>, extendBlocks=<optimized out>, trace=<optimized out>)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRTreeTop_inlines.hpp:45
#22 0x0000735b20b6c52c in TR_JProfilingValue::lowerCalls (this=0x735aa3635f00)
at /root/home/ahuo/src/openj9-openjdk-jdk11/openj9/runtime/compiler/optimizer/JProfilingValue.cpp:364
#23 0x0000735b20b6c998 in TR_JProfilingValue::perform (this=0x735aa3635f00)
at /root/home/ahuo/src/openj9-openjdk-jdk11/openj9/runtime/compiler/optimizer/JProfilingValue.cpp:187
#24 0x0000735b2108bb5c in OMR::Optimizer::performOptimization (this=0x735af84b0b10, optimization=<optimized out>,
firstOptIndex=<optimized out>, lastOptIndex=<optimized out>, doTiming=<optimized out>)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:2064
#25 0x0000735b2108c22c in OMR::Optimizer::performOptimization (this=0x735af84b0b10, optimization=<optimized out>,
firstOptIndex=<optimized out>, lastOptIndex=<optimized out>, doTiming=<optimized out>)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:1611
#26 0x0000735b2108db14 in OMR::Optimizer::optimize (this=0x735af84b0b10)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/optimizer/OMROptimizer.cpp:1128
-Xjit:vmState=0x000563ff
vmState [0x563ff]: {J9VMSTATE_JIT} {jProfilingValue}
[2]
#19 createSymRefForNode (comp=comp@entry=0x735af83f0000, methodSymbol=methodSymbol@entry=0x735af83f4ef0,
value=value@entry=0x735af905fe60, insertBefore=0x735aa3841950)
at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRBlock.cpp:1183
1183 (valueChild->getOpCode().isLoadReg() && valueChildSymRef->getSymbol()->castToAutoSymbol()->isInternalPointer())))
(gdb) p valueChild
$4 = (TR::Node *) 0x735aa283f9b0
(gdb) p *valueChildSymRef
$6 = {<J9::SymbolReference> = {<OMR::SymbolReference> = {
_vptr.SymbolReference = 0x735b216cbb48 <vtable for TR::SymbolReference+16>, _symbol = 0x735af83f52e0, _extraInfo = 0x0,
_offset = 0, _size = 4164899568, _owningMethodIndex = {_value = 0}, _cpIndex = 1, _unresolvedIndex = 0,
_referenceNumber = 401, _flags = {_flags = 536870912}, _knownObjectIndex = -1, {_useDefAliases = 0x0,
_independentSymRefs = 0x0}}, <No data fields>}, <No data fields>}
(gdb) p *(valueChildSymRef->_symbol)
$7 = {<J9::Symbol> = {<OMR::Symbol> = {_vptr.Symbol = 0x735b216e3ce0 <vtable for TR::ParameterSymbol+16>, _size = 8, _name = 0x0,
_declaredClass = 0x0, _flags = {_flags = 1073742087}, _flags2 = {_flags = 0}, _localIndex = 229},
_recognizedField = 4294967295}, <No data fields>}
n37706n aRegLoad gr18 <parm 1 [B>[#401 Parm] [flags 0x40000107 0x0 ] (X!=0 SeenRealReference ) [ 0x735aa283f9b0] bci=[155,58,702] rc=11 vc=3343 vn=- li=-11 udi=- nc=0 flg=0x8004
n12976n iRegStore gr30 [ 0x735af905c7b0] bci=[155,64,702] rc=0 vc=3343 vn=- li=-2 udi=95 nc=1
n12974n su2i (X>=0 ) [ 0x735af905c710] bci=[155,61,702] rc=9 vc=3343 vn=- li=-7 udi=- nc=1 flg=0x100
n13152n sloadi <array-shadow>[#248 Shadow] [flags 0x80000601 0x0 ] (cannotOverflow ) [ 0x735af905feb0] bci=[155,61,702] rc=1 vc=3343 vn=- li=-1 udi=- nc=1 flg=0x1000
n13151n aladd (X>=0 internalPtr ) [ 0x735af905fe60] bci=[155,58,702] rc=2 vc=3343 vn=- li=-2 udi=- nc=2 flg=0x8100
n37706n ==>aRegLoad
I ran JCL_Test_none_SCC_1 150 times with the proposed fix. I can no longer reproduce the following two asserts in jProfilingValue.
1. Assertion failed at /root/home/ahuo/src/openj9-openjdk-jdk11/omr/compiler/il/OMRSymbol_inlines.hpp:53: self()->isAuto()
VMState: 0x000563ff
OMR::Symbol::castToAutoSymbo, symbol is not an automatic symbol: this 0x735af83f52e0
2. Assertion failed at /home/jenkins/workspace/Build_JDK11_ppc64le_linux_Personal/omr/compiler/ras/Tree.cpp:2398: debug("fixTrees")
VMState: 0x000563ff
Tree verification error
I'd like to point out that assert 2, the IL verification error, is related to assert 1. 0x735aa283f9b0 is the valueChild. Its symref, valueChildSymRef, is the one whose parm symbol is cast to auto.
TREE VERIFICATION ERROR -- node [ 0x735af905c710] ref count is 9 and should be 8
BLOCK VERIFICATION ERROR -- node [ 0x735aa283f9b0] accessed outside of its (extended) basic block: 7 time(s)
</ilOfCrashedThread>
n37706n aRegLoad gr18 <parm 1 [B>[#401 Parm] [flags 0x40000107 0x0 ] (X!=0 SeenRealReference ) [ 0x735aa283f9b0] bci=[155,58,702] rc=11 vc=3343 vn=- li=-11 udi=- nc=0 flg=0x8004
n12976n iRegStore gr30 [ 0x735af905c7b0] bci=[155,64,702] rc=0 vc=3343 vn=- li=-2 udi=95 nc=1
n12974n su2i (X>=0 ) [ 0x735af905c710] bci=[155,61,702] rc=9 vc=3343 vn=- li=-7 udi=- nc=1 flg=0x100
n13152n sloadi <array-shadow>[#248 Shadow] [flags 0x80000601 0x0 ] (cannotOverflow ) [ 0x735af905feb0] bci=[155,61,702] rc=1 vc=3343 vn=- li=-1 udi=- nc=1 flg=0x1000
n13151n aladd (X>=0 internalPtr ) [ 0x735af905fe60] bci=[155,58,702] rc=2 vc=3343 vn=- li=-2 udi=- nc=2 flg=0x8100
n37706n ==>aRegLoad
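The BLOCK VERIFICATION ERROR above can be pictured with a small standalone sketch. It is a simplified model under stated assumptions, not the OMR verifier: ILNode, Tree, and crossBlockUses are hypothetical names, and each tree is simply tagged with the id of the extended basic block it belongs to. A node's "home" is the block of the first tree that references it, and any later reference from a tree in a different extended block is counted as an access outside that block.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified IL node and tree; not the OMR classes.
struct ILNode {
    std::vector<ILNode *> children;
};

struct Tree {
    int extendedBlock;   // id of the extended basic block holding this tree
    ILNode *root;
};

static void visit(ILNode *n, int block,
                  std::unordered_map<ILNode *, int> &home,
                  std::unordered_map<ILNode *, int> &outside) {
    auto res = home.emplace(n, block);
    if (!res.second) {
        // Already seen: this is a commoned reference. It is only legal if it
        // comes from the same extended block as the first reference.
        if (res.first->second != block) ++outside[n];
        return;
    }
    for (ILNode *c : n->children) visit(c, block, home, outside);
}

// For each node, count references made from outside its home extended block.
std::unordered_map<ILNode *, int> crossBlockUses(const std::vector<Tree> &trees) {
    std::unordered_map<ILNode *, int> home;
    std::unordered_map<ILNode *, int> outside;
    for (const Tree &t : trees) visit(t.root, t.extendedBlock, home, outside);
    return outside;
}
```

In this model, a register load that is first referenced in one block and then reused by trees in another block would be reported once per outside use, analogous to the "7 time(s)" count in the log.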
With the proposed fix, I see an intermittent assert in idiomRecognition, which is reported in #17819. I don't think it is related to these two asserts from jProfilingValue.
Assertion failed at /root/home/ahuo/src/openj9-openjdk-jdk11/openj9/runtime/compiler/optimizer/IdiomRecognition.cpp:6692: false
VMState: 0x000544ff
not implemented yet
compiling java/lang/AbstractStringBuilder.append([C)Ljava/lang/AbstractStringBuilder; at level: hot
Hi @a7ehuo, do you have a JITDUMP from when you hit Assert 1 mentioned in https://github.com/eclipse-openj9/openj9/issues/17817#issuecomment-1663957089, and also a JITDUMP from when Assert 2 mentioned in the same comment is hit? I wonder whether those two are related. Looking at the intention of that condition, though, what you recommended should be the right check.
do you have a JITDUMP when you hit the Assert 1 mentioned in https://github.com/eclipse-openj9/openj9/issues/17817#issuecomment-1663957089 and also JITDUMP when Assert 2 mentioned in the same commit is hit
Here is the jitdump: jitdump.20230801.091548.19259.0004.dmp.zip
The jitdump recompilation didn't reproduce the assert. The IL trees from the crash (ilOfCrashedThread) show the issues behind assert 1 and assert 2.
0x735af905c710 is n12974n su2i
0x735aa283f9b0 is n37706n aRegLoad gr18 <parm 1 [B>[#401 Parm] <== valueChild
TREE VERIFICATION ERROR -- node [ 0x735af905c710] ref count is 9 and should be 8
BLOCK VERIFICATION ERROR -- node [ 0x735aa283f9b0] accessed outside of its (extended) basic block: 7 time(s)
</ilOfCrashedThread>
n12969n BBStart <block_1219> (freq 791) [ 0x735af905c580] bci=[156,0,8178] rc=0 vc=3343 vn=- li=-2 udi=- nc=1
n37697n GlRegDeps () [ 0x735aa283f6e0] bci=[156,0,8178] rc=1 vc=3343 vn=- li=-1 udi=- nc=9 flg=0x20
n37698n aRegLoad gr5 <parm 2 Ljava/util/Locale;>[#402 Parm] [flags 0x40000107 0x0 ] [ 0x735aa283f730] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37699n aRegLoad gr3 <parm 0 Ljava/lang/String;>[#400 Parm] [flags 0x40000107 0x0 ] [ 0x735aa283f780] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37700n iRegLoad gr26 <auto slot 6>[#1459 Auto] [flags 0x3 0x0 ] [ 0x735aa283f7d0] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37701n iRegLoad gr24 <auto slot 9>[#1461 Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow SeenRealReference ) [ 0x735aa283f820] bci=[155,59,702] rc=12 vc=3343 vn=- li=-12 udi=- nc=0 flg=0x9100
n37702n aRegLoad gr23 <temp slot 17>[#1238 Auto] [flags 0x20000007 0x0 ] [ 0x735aa283f870] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37703n iRegLoad gr22 <temp slot 113>[#2381 Auto] [flags 0x3 0x0 ] [ 0x735aa283f8c0] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37704n iRegLoad gr21 <auto slot 7>[#1460 Auto] [flags 0x3 0x0 ] [ 0x735aa283f910] bci=[156,0,8178] rc=10 vc=3343 vn=- li=-10 udi=- nc=0
n37705n iRegLoad gr19 <temp slot 107>[#2375 Auto] [flags 0x3 0x0 ] (X!=0 X>=0 cannotOverflow SeenRealReference ) [ 0x735aa283f960] bci=[156,1,8178] rc=12 vc=3343 vn=- li=-12 udi=- nc=0 flg=0x49104
n37706n aRegLoad gr18 <parm 1 [B>[#401 Parm] [flags 0x40000107 0x0 ] (X!=0 SeenRealReference ) [ 0x735aa283f9b0] bci=[155,58,702] rc=11 vc=3343 vn=- li=-11 udi=- nc=0 flg=0x8004
n12976n iRegStore gr30 [ 0x735af905c7b0] bci=[155,64,702] rc=0 vc=3343 vn=- li=-2 udi=95 nc=1
n12974n su2i (X>=0 ) [ 0x735af905c710] bci=[155,61,702] rc=9 vc=3343 vn=- li=-7 udi=- nc=1 flg=0x100
n13152n sloadi <array-shadow>[#248 Shadow] [flags 0x80000601 0x0 ] (cannotOverflow ) [ 0x735af905feb0] bci=[155,61,702] rc=1 vc=3343 vn=- li=-1 udi=- nc=1 flg=0x1000
n13151n aladd (X>=0 internalPtr ) [ 0x735af905fe60] bci=[155,58,702] rc=2 vc=3343 vn=- li=-2 udi=- nc=2 flg=0x8100
n37706n ==>aRegLoad
With https://github.com/eclipse/omr/pull/7087, I haven't been able to reproduce the second and the third asserts.
paranoidOptCheck option on java/lang/StringUTF16.toLowerCase(Ljava/lang/String;[BLjava/util/Locale;)Ljava/lang/String;
The first assertion, in OMRSymbol_inlines.hpp:53, should now be fixed by https://github.com/eclipse/omr/pull/7087
Assertion failed at /home/jenkins/workspace/Build_JDK11_ppc64le_linux_Personal/omr/compiler/il/OMRSymbol_inlines.hpp:53: self()->isAuto()
VMState: 0x000563ff
OMR::Symbol::castToAutoSymbo, symbol is not an automatic symbol
Moving investigation of the remaining assertion failures to 0.43.
The assertion at /home/jenkins/workspace/Build_JDK11_ppc64le_linux_Personal/omr/compiler/il/OMRSymbol_inlines.hpp:53: self()->isAuto() fails in sanity.functional test cases JCL_Test_none_SCC_1 and JCL_Test_none_SCC_0 on ppc64le_linux for Java 11. Link to the Jenkins job.
Stack trace: