dragonwell-project / dragonwell11

Alibaba Dragonwell11 JDK
https://www.aliyun.com/product/dragonwell
GNU General Public License v2.0
552 stars 111 forks source link

[GC] Fix & optimize arrayEquals on AArch64 (#743) #807

Closed weixlu closed 3 months ago

weixlu commented 3 months ago

[GC] Fix & optimize arrayEquals on AArch64 (#743)

Summary: fix tail comparison and simplify flow control for arrayEquals

Testing: jdk/javadoc/tool/parser/7091528/T7091528.java

Reviewers: mmyxym, linade

Issue: https://github.com/dragonwell-project/dragonwell11/issues/743

weixlu commented 3 months ago

About correctness This is to fix https://github.com/dragonwell-project/dragonwell11/issues/743

export test=test/langtools/jdk/javadoc/tool/parser/7091528/T7091528.java
function runJtreg() { jtreg -Xcomp -XX:+UseCompactObjectHeaders -ea -esa -timeoutFactor:4 -v:fail,error,time,nopass -nr -w $dir/index-$1 $test &> $dir/$1.log ; if [[ 0 -ne $? ]] ; then echo -n "$1 " ; else rm -rf $dir/index-$1 $dir/$1.log ; fi ; } ; export -f runJtreg ; export dir="tmp-jtreg-"`basename ${test##* } .java` ; rm -rf $dir ; mkdir -p $dir ; time seq 100 | xargs -i -n 1 -P `nproc` bash -c "runJtreg {}" ; echo total fail number: `ls $dir/*.log 2> /dev/null | wc | awk '{print $1}'`

with our fix, no failure is observed for five rounds of such test run.

weixlu commented 3 months ago
About performance TestCase Short Medium Long
testByteFalseBeginning 0.36% 17.58% -4.66%
testByteFalseEnd -24.83% -6.15% 27.69%
testByteFalseMid 1.28% 9.20% 11.03%
testByteTrue -29.56% -7.45% 25.98%
testCharFalseBeginning 21.01% -0.74% -3.10%
testCharFalseEnd -2.28% 30.00% 28.07%
testCharFalseMid 17.06% 13.15% 17.30%
testCharTrue -0.18% 27.48% 28.67%

8.2% improvement on average.

There seems to be regression for short arrays. This is because the current implementation blindly loads a whole word, even if array doesn't fully occupy it. This is better for performance but introduces potential crash together with lilliput.

weixlu commented 3 months ago
@mmyxym Reviewer proposed to replace the tail check to the end, and we have tested its performance. It seems the location of code doesn't have a significant impact on performance TestCase Short Medium Long
testByteFalseBeginning 0.69% 1.04% -1.74%
testByteFalseEnd 0.20% -0.45% 0.67%
testByteFalseMid 1.81% 0.36% 0.41%
testByteTrue -2.45% -0.48% 0.14%
testCharFalseBeginning 0.02% -2.05% -2.38%
testCharFalseEnd 3.41% -0.14% -0.22%
testCharFalseMid 0.91% -3.18% 0.78%
testCharTrue -2.62% 0.60% -0.51%

on avg: -0.21%