eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 721 forks source link

jdk_nio_0_FAILED sun/nio/cs/TestStringCoding.java RuntimeException: getBytes(csn) failed -> x-IBM29626C #18974

Closed JasonFengJ9 closed 4 months ago

JasonFengJ9 commented 7 months ago

Failure link

From an internal build(sles12x86-rtp-rt6-1):

22:15:22  openjdk version "22-internal" 2024-03-19
22:15:22  OpenJDK Runtime Environment (build 22-internal-adhoc.jenkins.BuildJDK22x86-64linuxPersonal)
22:15:22  Eclipse OpenJ9 VM (build master-f524f995be4, JRE 22 Linux amd64-64-Bit Compressed References 20240215_57 (JIT enabled, AOT enabled)
22:15:22  OpenJ9   - f524f995be4
22:15:22  OMR      - 0966eb360f6
22:15:22  JCL      - ff0372607f0 based on jdk-22+35)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

22:46:03  ===============================================
22:46:03  Running test jdk_nio_0 ...
22:46:03  ===============================================
22:46:03  jdk_nio_0 Start Time: Thu Feb 15 19:46:03 2024 Epoch Time (ms): 1708055163173
22:46:03  variation: Mode150
22:46:03  JVM_OPTIONS:  -XX:+UseCompressedOops -Xverbosegclog 

22:54:38  TEST: sun/nio/cs/TestStringCoding.java

22:54:38  STDERR:
22:54:38  java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C
22:54:38    at TestStringCoding.testGetBytes(TestStringCoding.java:164)
22:54:38    at TestStringCoding.main(TestStringCoding.java:64)
22:54:38    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
22:54:38    at java.base/java.lang.reflect.Method.invoke(Method.java:586)
22:54:38    at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
22:54:38    at java.base/java.lang.Thread.run(Thread.java:1582)
22:54:38  
22:54:38  JavaTest Message: Test threw exception: java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C

22:54:52  jdk_nio_0_FAILED

50x internal grinder

Also seen on JDK22 x86-64_mac

pshipton commented 6 months ago

cent8x86-svl-rt5-1, rhel8x86-svl-rt6-1 succeeded.

Failures:

rhel8x86-svl-rt7-1 1/20 sun/nio/cs/TestStringCoding.java

sles15x86-svl-rt6-1 20 failures (100%) of java/nio/file/Files/CopyMoveVariations.java 2/20 sun/nio/cs/TestStringCoding.java

ubu22x86-svl-rt4-1 3/20 sun/nio/cs/TestStringCoding.java

JasonFengJ9 commented 6 months ago

Occurred at JDK17 s390x_linux(rhel8s390x-svl-rt1-1) as well.

[2024-02-24T17:06:18.040Z] variation: Mode150
[2024-02-24T17:06:18.040Z] JVM_OPTIONS:  -XX:+UseCompressedOops -Xverbosegclog 

[2024-02-24T17:18:13.495Z] TEST: sun/nio/cs/TestStringCoding.java

[2024-02-24T17:18:13.496Z] STDERR:
[2024-02-24T17:18:13.496Z] java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C
[2024-02-24T17:18:13.496Z]  at TestStringCoding.testGetBytes(TestStringCoding.java:164)
[2024-02-24T17:18:13.496Z]  at TestStringCoding.main(TestStringCoding.java:64)
[2024-02-24T17:18:13.496Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2024-02-24T17:18:13.496Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[2024-02-24T17:18:13.496Z]  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2024-02-24T17:18:13.496Z]  at java.base/java.lang.reflect.Method.invoke(Method.java:574)
[2024-02-24T17:18:13.496Z]  at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
[2024-02-24T17:18:13.496Z]  at java.base/java.lang.Thread.run(Thread.java:857)
[2024-02-24T17:18:13.496Z] 
[2024-02-24T17:18:13.496Z] JavaTest Message: Test threw exception: java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C

[2024-02-24T17:18:21.873Z] jdk_nio_0_FAILED

JDK21 aarch64_linux(rhel9-aarch64-8)

pshipton commented 6 months ago

The jdk17 failure is repeatable when running just the single test, it fails 100%. https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38154

Weekly history 610 - failed 605 - didn't run 596 - builds failed (no testing) due to infra 589 - passed

609 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38158/ - failed 608 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38160/ - failed 607 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38161/ - failed 606 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38162/ - crash 605 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38159/ - crash https://github.com/eclipse-openj9/openj9/issues/18979 604 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38163/ - failed 602 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38168/ - failed 601 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38167/ - failed 600 https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38165/ - failed 597 (c92d58374, 410ef0d23) https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38166/ - failed 596 (8dc58cc37, cbf3bf87c) https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38164/ - passing

https://github.com/eclipse-openj9/openj9/compare/8dc58cc37...c92d58374 https://github.com/eclipse-openj9/openj9-omr/compare/cbf3bf87c...410ef0d23

@hzongaro fyi, there are possibly two problems in this issue, although maybe related. The most recent is a 100% failure on Z with a short list of possibilities in the previous compares. I'll narrow it down.

https://github.com/eclipse/omr/pull/7222 https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-Personal/495/ https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38172/ - passing

https://github.com/eclipse/omr/pull/7229 https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-Personal/496/ https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38171/ - passing

https://github.com/eclipse-openj9/openj9/pull/18819 https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-Personal/497/ https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38173/ - failing

pshipton commented 6 months ago

@knn-k @Akira1Saitoh pls take a look. Note this is a blocker for the upcoming release.

Akira1Saitoh commented 6 months ago

In #18819, transformIndexOfKnownString was enabled for StringLatin1.indexOfChar. transformIndexOfKnownString seems to be incorrectly transforming the trees.

In [sun/nio/cs/ext/IBM29626C$Encoder.encodeSingle(C[B)I](https://github.com/ibmruntimes/openj9-openjdk-jdk17/blob/c161fa2bfd2f8d8b1dc083f70fc4e7090cc12193/src/jdk.charsets/share/classes/sun/nio/cs/ext/IBM29626C.java.template#L424), String.indexOf is called against the String literal "\u00A2\u00A3\u00AC". However, this call is transformed into the following tree:

   6256 [   144] O^O VALUE PROPAGATION: Replacing indexOf call node [0000FFFF88822A60] on known string receiver with equivalent iselect tree
   6257 [   145] O^O VALUE PROPAGATION: Replacing indexOf call node [0000FFFF88822A60] on known string receiver with equivalent iselect tree
   6258 [   146] O^O VALUE PROPAGATION: Replacing indexOf call node [0000FFFF88822A60] on known string receiver with equivalent iselect tree
...
   6530 n547n     treetop                                                                             [    0xffff88822ab0] bci=[4,32,213] rc=0 vc=137 vn=- li=- udi=- nc=1
   6531 n546n       PassThrough                                                                       [    0xffff88822a60] bci=[4,32,213] rc=3 vc=137 vn=- li=- udi=- nc=1
   6532 n790n         iselect ()                                                                      [    0xffff888276a0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
   6533 n788n           icmpeq                                                                        [    0xffff88827600] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=2
   6534 n543n             ==>iload
   6535 n787n             iconst -94 (X!=0 X<=0 )                                                     [    0xffff888275b0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204
   6536 n789n           iconst 0 (X==0 X>=0 X<=0 )                                                    [    0xffff88827650] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x302
   6537 n786n           iselect ()                                                                    [    0xffff88827560] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
   6538 n784n             icmpeq                                                                      [    0xffff888274c0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=2
   6539 n543n               ==>iload
   6540 n783n               iconst -93 (X!=0 X<=0 )                                                   [    0xffff88827470] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204
   6541 n785n             iconst 1 (X!=0 X>=0 )                                                       [    0xffff88827510] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
   6542 n782n             iselect ()                                                                  [    0xffff88827420] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
   6543 n780n               icmpeq                                                                    [    0xffff88827380] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=2
   6544 n543n                 ==>iload
   6545 n779n                 iconst -84 (X!=0 X<=0 )                                                 [    0xffff88827330] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204
   6546 n781n               iconst 2 (X!=0 X>=0 )                                                     [    0xffff888273d0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
   6547 n778n               iconst -1 (X!=0 X<=0 )                                                    [    0xffff888272e0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204

The tree compares the input character with -94, -93, and -84, but those values should be 0xa2, 0xa3, and 0xac.

Akira1Saitoh commented 6 months ago

Opened https://github.com/eclipse-openj9/openj9/pull/19044.

Akira1Saitoh commented 6 months ago

Running sun/nio/cs/TestStringCoding.java test with the VM including the fix (#19044). https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/38187/ - passed (25/25)

Akira1Saitoh commented 6 months ago

Opened https://github.com/eclipse-openj9/openj9/pull/19056 for 0.44.

JasonFengJ9 commented 4 months ago

JDK11 aarch64_mac RC1 build(macaarch64rt2)

java version "11.0.23" 2024-04-16
IBM Semeru Runtime Certified Edition 11.0.23.0-rc1 (build 11.0.23+9)
Eclipse OpenJ9 VM 11.0.23.0-rc1 (build v0.44.0-release-8d018d6d4, JRE 11 Mac OS X aarch64-64-Bit 20240418_538 (JIT enabled, AOT enabled)
OpenJ9   - 8d018d6d4
OMR      - 254af5a04
JCL      - 41eb0e0ab2 based on jdk-11.0.23+9)

[2024-04-19T19:09:54.924Z] variation: Mode650
[2024-04-19T19:09:54.924Z] JVM_OPTIONS:  -XX:-UseCompressedOops -Xverbosegclog 

[2024-04-19T19:13:39.250Z] TEST: sun/nio/cs/TestStringCoding.java

[2024-04-19T19:13:39.250Z] STDERR:
[2024-04-19T19:13:39.250Z] java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C
[2024-04-19T19:13:39.250Z]  at TestStringCoding.testGetBytes(TestStringCoding.java:164)
[2024-04-19T19:13:39.250Z]  at TestStringCoding.main(TestStringCoding.java:64)
[2024-04-19T19:13:39.250Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2024-04-19T19:13:39.250Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2024-04-19T19:13:39.250Z]  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2024-04-19T19:13:39.250Z]  at java.base/java.lang.reflect.Method.invoke(Method.java:572)
[2024-04-19T19:13:39.250Z]  at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2024-04-19T19:13:39.250Z]  at java.base/java.lang.Thread.run(Thread.java:839)
[2024-04-19T19:13:39.250Z] 
[2024-04-19T19:13:39.250Z] JavaTest Message: Test threw exception: java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C

[2024-04-19T19:13:39.251Z] TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: getBytes(csn) failed -> x-IBM29626C
[2024-04-19T19:13:39.251Z] --------------------------------------------------
[2024-04-19T19:14:13.528Z] Test results: passed: 383; failed: 1
[2024-04-19T19:14:18.620Z] Report written to /Users/jenkins/workspace/Test_openjdk11_j9_extended.openjdk_aarch64_mac_testList_2/aqa-tests/TKG/output_17135536123179/jdk_nio_1/report/html/report.html
[2024-04-19T19:14:18.620Z] Results written to /Users/jenkins/workspace/Test_openjdk11_j9_extended.openjdk_aarch64_mac_testList_2/aqa-tests/TKG/output_17135536123179/jdk_nio_1/work
[2024-04-19T19:14:18.620Z] Error: Some tests failed or other problems occurred.
[2024-04-19T19:14:18.620Z] -----------------------------------
[2024-04-19T19:14:18.620Z] jdk_nio_1_FAILED

50x grinder - all failed

JDK11 ppc64le_linux RC1 JDK11 s390x_linux RC1 JDK11 x86-64_linux RC1 JDK11 x86-64_mac JDK11 x86-64_windows JDK11 ppc64_aix

pshipton commented 4 months ago

@hzongaro I can repeat the failure, on xlinux, using the following cut down testcase. It wasn't occurring for me with the default options, but occurred when using a -Xmx less than 4g (512m to 3g). Then I added -Xshareclasses:none and it occurs without setting -Xmx (or with any -Xmx value I tried). My default is -Xmx4052032K. It doesn't occur with -Xint.

import java.util.*;
import java.nio.*;
import java.nio.charset.*;

public class Coding {
    public static void main(String[] args) throws Throwable {

        // full bmp first
        char[] bmp = new char[0x10000];
        for (int i = 0; i < 0x10000; i++) {
            bmp[i] = (char)i;
        }

        Charset cs = Charset.forName("x-IBM29626C");
        testNewString(cs, testGetBytes(cs, new String(bmp)));
    }

    static byte[] getBytes(CharsetEncoder enc, String str) throws Throwable {
        ByteBuffer bf = enc.reset().encode(CharBuffer.wrap(str.toCharArray()));
        byte[] ba = new byte[bf.limit()];
        bf.get(ba, 0, ba.length);
        return ba;
    }

    static byte[] testGetBytes(Charset cs, String str) throws Throwable {
        CharsetEncoder enc = cs.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        //getBytes(csn);
        byte[] baSC = str.getBytes(cs.name());
        byte[] baNIO = getBytes(enc, str);
        if (!Arrays.equals(baSC, baNIO)) {
            throw new RuntimeException("getBytes(csn) failed  -> " + cs.name());
        }
        //getBytes(cs);
        baSC = str.getBytes(cs);
        if (!Arrays.equals(baSC, baNIO)) {
            throw new RuntimeException("getBytes(cs) failed  -> " + cs.name());
        }
        return baSC;
    }

    static String testNewString(Charset cs, byte[] ba) throws Throwable {
        CharsetDecoder dec = cs.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        //new String(csn);
        String strSC = new String(ba, cs.name());
        String strNIO = dec.reset().decode(ByteBuffer.wrap(ba)).toString();
        if(!strNIO.equals(strSC)) {
            throw new RuntimeException("new String(csn) failed  -> " + cs.name());
        }
        return strSC;
    }
}
pshipton commented 4 months ago

I suspect it's partly related to the compressedRefsShift, as with -Xmx values 512m - 3g I'm getting 0, and with -Xmx4g I'm getting 3.

pshipton commented 4 months ago

The problem didn't/doesn't occur with M2 builds. Besides the RC1 build it also occurs on the OpenJ9 open source build. https://openj9-jenkins.osuosl.org/job/Build_JDK11_x86-64_linux_Release/124/

~OpenJ9/OMR didn't change between the builds.~ The OpenJDK changes don't look like they would cause the failure.

Changes: https://github.com/ibmruntimes/openj9-openjdk-jdk11/compare/0f9c69e64d...a5dbe77

IBM Semeru Runtime Open Edition 11.0.23.0-m2 (build 11.0.23+7)
Eclipse OpenJ9 VM 11.0.23.0-m2 (build v0.44.0-release-8b079b5c1, JRE 11 Linux amd64-64-Bit Compressed References 20240411_1019 (JIT enabled, AOT enabled)
OpenJ9   - 8b079b5c1
OMR      - 254af5a04
JCL      - 0f9c69e64d based on jdk-11.0.23+7)

OpenJDK Runtime Environment (build 11.0.23-internal+0-adhoc..BuildJDK11x86-64linuxRelease)
Eclipse OpenJ9 VM (build v0.44.0-release-8d018d6, JRE 11 Linux amd64-64-Bit Compressed References 20240420_124 (JIT enabled, AOT enabled)
OpenJ9   - 8d018d6
OMR      - 254af5a
JCL      - a5dbe77 based on jdk-11.0.23+9)
pshipton commented 4 months ago

My bad, on second look, OpenJ9 did change between the builds, with a change to create LATIN1 string constants correctly as compact Strings. https://github.com/eclipse-openj9/openj9/compare/8b079b5c1...8d018d6 https://github.com/eclipse-openj9/openj9/pull/19348

hzongaro commented 4 months ago

It looks like Value Propagation is performing a transforming a JITHelpers.intrinsicIndexOfLatin1 of a known string value — "\u00A2\u00A3\u00AC" — incorrectly. Before Global Value Propagation:

n464n       icall  com/ibm/jit/JITHelpers.intrinsicIndexOfLatin1(Ljava/lang/Object;BII)I[#474  final virtual Method -552] [flags 0x20500 0x0 ] ()  [0x7fe5f66aa0c0] bci=[2,68,1998] rc=2 vc=99 vn=- li=- udi=- nc=5 flg=0x20
n457n         aload  java/lang/String.helpers Lcom/ibm/jit/JITHelpers;[#462  final Static] (obj2) [flags 0xa0307 0x0 ]  [0x7fe5f66a9e90] bci=[2,59,1998] rc=1 vc=99 vn=- li=- udi=- nc=0
n458n         aload  array<auto slot 4>[#457  Auto] [flags 0x7 0x0 ]                          [0x7fe5f66a9ee0] bci=[2,62,1998] rc=1 vc=99 vn=- li=- udi=- nc=0
n461n         b2i                                                                             [0x7fe5f66a9fd0] bci=[2,65,1998] rc=1 vc=99 vn=- li=- udi=- nc=1
n460n           i2b                                                                           [0x7fe5f66a9f80] bci=[2,65,1998] rc=1 vc=99 vn=- li=- udi=- nc=1
n459n             iload  inputChar<parm 1 C>[#423  Parm] [flags 0x40000103 0x0 ]              [0x7fe5f66a9f30] bci=[2,64,1998] rc=1 vc=99 vn=- li=- udi=- nc=0
n462n         iload  <temp slot 7>[#480  Auto] [flags 0x20000003 0x0 ]                        [0x7fe5f66aa020] bci=[2,66,1998] rc=1 vc=99 vn=- li=- udi=- nc=0
n463n         iload  len<auto slot 3>[#455  Auto] [flags 0x3 0x0 ]                            [0x7fe5f66aa070] bci=[2,67,1998] rc=1 vc=99 vn=- li=- udi=- nc=0

after:

n653n     treetop                                                                             [0x7fe5f66adbd0] bci=[2,65,1998] rc=0 vc=0 vn=- li=- udi=- nc=1
n461n       b2i                                                                               [0x7fe5f66a9fd0] bci=[2,65,1998] rc=4 vc=111 vn=- li=- udi=- nc=1
n460n         i2b                                                                             [0x7fe5f66a9f80] bci=[2,65,1998] rc=1 vc=111 vn=- li=- udi=- nc=1
n459n           iload  inputChar<parm 1 C>[#423  Parm] [flags 0x40000103 0x0 ] (cannotOverflow )  [0x7fe5f66a9f30] bci=[2,64,1998] rc=1 vc=111 vn=- li=- udi=57 nc=0 flg=0x1000
n654n     treetop                                                                             [0x7fe5f66adc20] bci=[2,66,1998] rc=0 vc=0 vn=- li=- udi=- nc=1
n462n       iconst 0 (X==0 X>=0 X<=0 )                                                        [0x7fe5f66aa020] bci=[2,66,1998] rc=1 vc=111 vn=- li=37 udi=- nc=0 flg=0x302
n655n     treetop                                                                             [0x7fe5f66adc70] bci=[2,67,1998] rc=0 vc=0 vn=- li=- udi=- nc=1
n463n       iconst 3 (X!=0 X>=0 )                                                             [0x7fe5f66aa070] bci=[2,67,1998] rc=1 vc=111 vn=- li=- udi=- nc=0 flg=0x104
n465n     treetop                                                                             [0x7fe5f66aa110] bci=[2,68,1998] rc=0 vc=111 vn=- li=- udi=- nc=1
n464n       PassThrough                                                                       [0x7fe5f66aa0c0] bci=[2,68,1998] rc=2 vc=111 vn=- li=- udi=- nc=1
n646n         iselect ()                                                                      [0x7fe5f66ad9a0] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n645n           icmpeq                                                                        [0x7fe5f66ad950] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=2
n461n             ==>b2i
n644n             iconst 162 (X!=0 X>=0 )                                                     [0x7fe5f66ad900] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
n643n           iconst 0 (X==0 X>=0 X<=0 )                                                    [0x7fe5f66ad8b0] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x302
n642n           iselect ()                                                                    [0x7fe5f66ad860] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n641n             icmpeq                                                                      [0x7fe5f66ad810] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=2
n461n               ==>b2i
n640n               iconst 163 (X!=0 X>=0 )                                                   [0x7fe5f66ad7c0] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
n639n             iconst 1 (X!=0 X>=0 )                                                       [0x7fe5f66ad770] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
n638n             iselect ()                                                                  [0x7fe5f66ad720] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n637n               icmpeq                                                                    [0x7fe5f66ad6d0] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=2
n461n                 ==>b2i
n636n                 iconst 172 (X!=0 X>=0 )                                                 [0x7fe5f66ad680] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
n635n               iconst 2 (X!=0 X>=0 )                                                     [0x7fe5f66ad630] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x104
n634n               iconst -1 (X!=0 X<=0 )                                                    [0x7fe5f66ad5e0] bci=[2,68,1998] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204

The iconst values used for the comparisons should be -94, -93 and -58, I believe.

hzongaro commented 4 months ago

In the earlier instance of this failure described in the comment above, the trees looked like this:

   6530 n547n     treetop                                                                             [    0xffff88822ab0] bci=[4,32,213] rc=0 vc=137 vn=- li=- udi=- nc=1
   6531 n546n       PassThrough                                                                       [    0xffff88822a60] bci=[4,32,213] rc=3 vc=137 vn=- li=- udi=- nc=1
   6532 n790n         iselect ()                                                                      [    0xffff888276a0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
   6533 n788n           icmpeq                                                                        [    0xffff88827600] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=2
   6534 n543n             ==>iload
   6535 n787n             iconst -94 (X!=0 X<=0 )                                                     [    0xffff888275b0] bci=[4,32,213] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x204

If the string is encoded in a char[], the values involved will be unsigned short values; if it is encoded in a byte[], the values involved will be signed byte values.

pshipton commented 4 months ago

This new failure is not necessarily related to the previous ones. However this issue for the test got reopened and reused. This latest failure occurs on all platforms and is exposed due to https://github.com/eclipse-openj9/openj9/pull/19348

hzongaro commented 4 months ago

This new failure is not necessarily related to the previous ones.

It actually is related. In JDK 11, OpenJ9 uses its own implementation of String.indexOf(int,int), which calls JITHelpers.intrinsicIndexOfLatin1 for Latin1-encoded strings. The character for which it is searching is cast to type byte on that call.

In JDK 17, the implementation of String.indexOf(int,int) calls StringLatin1.indexOf(byte[],int,int) for Latin1-encoded strings. In turn that calls StringLatin1.indexOfChar(byte[],int,int). The argument for which those methods are searching is of type int.

Value Propagation's transformation of StringLatin1.indexOfChar and JITHelpers.intrinsicIndexOfLatin1 is identical, and doesn't take into account whether the character value has been cast to a byte. The change that was made in pull request #19044 adjusted the transformation so that it would work correctly if the value was left as an int, which was correct for JDK 17, but no longer correct for Latin1-encoded Strings in JDK 11, where the value is cast as a byte. Pull request #19348 exposes that problem.

pshipton commented 4 months ago

@hzongaro is there any update on this? The 0.44 release is waiting on this.

hzongaro commented 4 months ago

is there any update on this?

Yes, I'm testing a fix. Sorry for the delay.

hzongaro commented 4 months ago

Reopening until fix is merged onto 0.44 release branch.

JasonFengJ9 commented 4 months ago

JDK11 ppc64le_linux(p10sles021)

java version "11.0.24-beta" 2024-07-16
IBM Semeru Runtime Certified Edition 11.0.24+1-202405082315 (build 11.0.24-beta+1-202405082315)
Eclipse OpenJ9 VM 11.0.24+1-202405082315 (build master-d6b16f591e, JRE 11 Linux ppc64le-64-Bit Compressed References 20240508_846 (JIT enabled, AOT enabled)
OpenJ9   - d6b16f591e
OMR      - 1bcf96982
JCL      - b17cf9f10c based on jdk-11.0.24+1)

[2024-05-09T00:56:53.948Z] variation: Mode150
[2024-05-09T00:56:53.948Z] JVM_OPTIONS:  -XX:+UseCompressedOops -Xverbosegclog

[2024-05-09T01:01:47.998Z] TEST: sun/nio/cs/TestStringCoding.java

[2024-05-09T01:01:47.999Z] Testing(sm=false) x-IBM29626C....
[2024-05-09T01:01:47.999Z] STDERR:
[2024-05-09T01:01:47.999Z] java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C
[2024-05-09T01:01:47.999Z]  at TestStringCoding.testGetBytes(TestStringCoding.java:164)
[2024-05-09T01:01:47.999Z]  at TestStringCoding.main(TestStringCoding.java:64)
[2024-05-09T01:01:47.999Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2024-05-09T01:01:47.999Z]  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2024-05-09T01:01:47.999Z]  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2024-05-09T01:01:47.999Z]  at java.base/java.lang.reflect.Method.invoke(Method.java:572)
[2024-05-09T01:01:47.999Z]  at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
[2024-05-09T01:01:47.999Z]  at java.base/java.lang.Thread.run(Thread.java:839)
[2024-05-09T01:01:47.999Z] 
[2024-05-09T01:01:47.999Z] JavaTest Message: Test threw exception: java.lang.RuntimeException: getBytes(csn) failed  -> x-IBM29626C

[2024-05-09T01:01:53.710Z] jdk_nio_0_FAILED

50x grinder - the failure not reproduced.

The JVM in question contains

This appears to be the same failure but seems less frequent and not across platforms.

hzongaro commented 4 months ago

I suspect the problem is specific to the Power 10 inline code generation for indexOf. I'm able to reproduce the failure consistently in a 10x internal grinder run on a Power 10 machine (still underway).

@zl-wang, could someone on your team take a look at this?

pshipton commented 4 months ago

I assume it was introduced via https://github.com/eclipse-openj9/openj9/pull/18819 although it could be a long standing problem that was exposed via https://github.com/eclipse-openj9/openj9/pull/19337

zl-wang commented 4 months ago

@IBMJimmyk could you take a look at this new failure in priority? it is a blocker to 0.44

IBMJimmyk commented 4 months ago

Yes, I can take a look at this.

IBMJimmyk commented 4 months ago

One problem I found is here: https://github.com/eclipse-openj9/openj9/blob/fa4755ac5d6b2eeb29597883d682e08befc27096/runtime/compiler/p/codegen/J9TreeEvaluator.cpp#L10865-L10871

ch is a signed byte when coming from intrinsicIndexOfLatin1 and a signed int when coming from indexOfChar. This check seems to check if the second lowest 8 bits are non-zero and returns -1 if this is the case. I'm not sure why it is doing this. Neither intrinsicIndexOfLatin1 nor indexOfChar have such a check. They both assume that ch is valid Latin1 and its value fits inside a signed byte. Negative values of ch can fail this check when it shouldn't.

zl-wang commented 4 months ago

@hzongaro has made it into an unsigned byte, such that it shouldn't fail there? on the other hand, we may eliminate that check for performance reason (if it is not really needed).

IBMJimmyk commented 4 months ago

I'm currently in the middle of writing a small test so I can better control what the inputs are and what they actually look like just before the call. I'll be able to answer that question after I finish that.

But, this check is definitely wrong so it will need to be removed one way or another.

JasonFengJ9 commented 4 months ago

This failure was NOT reproduced by JDK 17/21/22 at a P10 machine p10sles021 in a single run. Note: JDK11 failed consistently at this machine as per https://github.com/eclipse-openj9/openj9/issues/18974#issuecomment-2108000024

JDK17 jdk_nio_0 p10sles021 - passed JDK21 jdk_nio_0 p10sles021

14:42:45  Testing(sm=false) x-IBM29626C....done!  <-- JDK11 failed at this test

14:42:45  Testing(sm=false) x-windows-iso2022jp....done!
14:42:45  STDERR:
14:42:45  java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release

JDK22 jdk_nio_0 p10sles021

14:31:10  Testing(sm=false) x-IBM29626C....done!  <-- JDK11 failed at this test

14:31:10  Testing(sm=false) x-windows-iso2022jp....done!
14:31:10  STDERR:
14:31:10  java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release

FYI @pshipton @tajila

IBMJimmyk commented 4 months ago

for JDK11, the relevant Pre Instruction Selection Trees look like this:

n163n     BBStart <block_16> (freq 2226)                                                      [    0x7fffb30c22b0] bci=[0,52,1997] rc=0 vc=863 vn=- li=16 udi=- nc=1
n361n       GlRegDeps ()                                                                      [    0x7fffb30c6090] bci=[0,52,1997] rc=1 vc=863 vn=- li=16 udi=- nc=4 flg=0x20
n362n         aRegLoad gr31   array<auto slot 4>[#424  Auto] [flags 0x4007 0x0 ] (X!=0 SeenRealReference )  [    0x7fffb30c60e0] bci=[0,62,1998] rc=2 vc=863 vn=- li=16 udi=- nc=0 flg=0x8004
n363n         iRegLoad gr29   <parm 1 I>[#404  Parm] [flags 0xc0000103 0x0 ] (X>=0 cannotOverflow SeenRealReference )  [    0x7fffb30c6130] bci=[0,52,1997] rc=3 vc=863 vn=- li=16 udi=- nc=>
n364n         iRegLoad gr28   len<auto slot 3>[#422  Auto] [flags 0x3 0x0 ] (cannotOverflow SeenRealReference )  [    0x7fffb30c6180] bci=[0,67,1998] rc=2 vc=863 vn=- li=16 udi=- nc=0 flg=>
n365n         iRegLoad gr27   <temp slot 8>[#447  Auto] [flags 0x20000003 0x0 ] (cannotOverflow SeenRealReference )  [    0x7fffb30c61d0] bci=[0,66,1998] rc=2 vc=863 vn=- li=16 udi=- nc=0 >
n231n     ificmpgt --> block_4 BBStart at n155n ()                                            [    0x7fffb30c37f0] bci=[0,56,1997] rc=0 vc=863 vn=- li=16 udi=- nc=2 flg=0x20
n363n       ==>iRegLoad
n228n       iconst 255 (X!=0 X>=0 )                                                           [    0x7fffb30c3700] bci=[0,53,1997] rc=1 vc=863 vn=- li=16 udi=- nc=0 flg=0x104
n164n     BBEnd </block_16>                                                                   [    0x7fffb30c2300] bci=[0,56,1997] rc=0 vc=863 vn=- li=16 udi=- nc=0
n229n     BBStart <block_15> (freq 1855) (extension of previous block)                        [    0x7fffb30c3750] bci=[0,59,1998] rc=0 vc=863 vn=- li=15 udi=- nc=0
n255n     treetop                                                                             [    0x7fffb30c3f70] bci=[0,68,1998] rc=0 vc=863 vn=- li=15 udi=- nc=1
n254n       icall  com/ibm/jit/JITHelpers.intrinsicIndexOfLatin1(Ljava/lang/Object;BII)I[#441  final virtual Method -552] [flags 0x20500 0x0 ] ()  [    0x7fffb30c3f20] bci=[0,68,1998] rc=3>
n247n         aload  java/lang/String.helpers Lcom/ibm/jit/JITHelpers;[#429  final Static] (obj1) [flags 0xa0307 0x0 ]  [    0x7fffb30c3cf0] bci=[0,59,1998] rc=1 vc=863 vn=- li=15 udi=- nc>
n362n         ==>aRegLoad
n251n         b2i                                                                             [    0x7fffb30c3e30] bci=[0,65,1998] rc=1 vc=863 vn=- li=15 udi=- nc=1
n250n           i2b                                                                           [    0x7fffb30c3de0] bci=[0,65,1998] rc=1 vc=863 vn=- li=15 udi=- nc=1
n363n             ==>iRegLoad
n365n         ==>iRegLoad
n364n         ==>iRegLoad
n307n     iRegStore gr2                                                                       [    0x7fffb30c4fb0] bci=[-1,9,45] rc=0 vc=863 vn=- li=15 udi=- nc=1
n254n       ==>icall
n230n     BBEnd </block_15> =====                                                             [    0x7fffb30c37a0] bci=[0,71,1998] rc=0 vc=863 vn=- li=15 udi=- nc=1
n375n       GlRegDeps                                                                         [    0x7fffb30c64f0] bci=[0,71,1998] rc=1 vc=863 vn=- li=15 udi=- nc=1
n374n         PassThrough gr2                                                                 [    0x7fffb30c64a0] bci=[0,68,1998] rc=1 vc=863 vn=- li=15 udi=- nc=1
n254n           ==>icall

Inlining is intentionall suppressed for JITHelpers.intrinsicIndexOfLatin1 so that inlineDirectCall can recognize the call and generate the custom assembly.

ch is the byte paramter. It comes from parm 1 (n363n), gets converted to a signed byte (n250n), then sign extended back to a signed int (n251n).

The fast path assembly for intrinsicIndexOfLatin1 starts like this:

    0x7fffb8900138 000000c4 [    0x7fffb31e1080] 3840ffff          68   li      gr2, FFFFFFFFFFFFFFFF
    0x7fffb890013c 000000c8 [    0x7fffb31e1120] 7f1be000          68   cmpw    cr6, gr27, gr28
    0x7fffb8900140 000000cc [    0x7fffb31e11c0] 40980094          68   bge     cr6, Label L0117
    0x7fffb8900144 000000d0 [    0x7fffb31e1260] 7f6307b4          68   extsw   gr3, gr27
    0x7fffb8900148 000000d4 [    0x7fffb31e1300] 7f8507b4          68   extsw   gr5, gr28
    0x7fffb890014c 000000d8 [    0x7fffb31e13a0] 5406c63e          68   rlwinm  gr6, gr0, 0000000000000018, 00000000000000FF  //r0 is ch
    0x7fffb8900150 000000dc [    0x7fffb31e1450] 2f060000          68   cmpwi   cr6, gr6, 0
    0x7fffb8900154 000000e0 [    0x7fffb31e14f0] 409a0080          68   bne     cr6, Label L0117

An input of 0x80 into String.indexOf becomes a negative value:

0x00007fffb890014c in ?? ()
(gdb) p/x $r0
$12 = 0xffffffffffffff80

That fails the check at 0x7fffb8900154 and jumps when it should not producing the wrong result.

I will check JDK17 next.

IBMJimmyk commented 4 months ago

For JDK17, the tree looks like this:

n150n     BBStart <block_8> (freq 8725)                                                       [    0x7fff51eb1ea0] bci=[2,8,204] rc=0 vc=902 vn=- li=8 udi=- nc=1
n283n       GlRegDeps ()                                                                      [    0x7fff51eb4830] bci=[2,8,204] rc=1 vc=902 vn=- li=8 udi=- nc=4 flg=0x20
n284n         iRegLoad gr31   max<auto slot 3>[#438  Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow SeenRealReference )  [    0x7fff51eb4880] bci=[2,31,213] rc=2 vc=902 vn=- li=8 udi=- nc=0 f>
n285n         iRegLoad gr30   <temp slot 6>[#446  Auto] [flags 0x20000003 0x0 ] (cannotOverflow SeenRealReference )  [    0x7fff51eb48d0] bci=[2,30,213] rc=2 vc=902 vn=- li=8 udi=- nc=0 fl>
n286n         aRegLoad gr29   <temp slot 5>[#445  Auto] [flags 0x20004007 0x0 ] (X!=0 SeenRealReference )  [    0x7fff51eb4920] bci=[2,28,213] rc=2 vc=902 vn=- li=8 udi=- nc=0 flg=0x8004
n287n         iRegLoad gr28   <parm 1 I>[#404  Parm] [flags 0xc0000103 0x0 ] (cannotOverflow SeenRealReference )  [    0x7fff51eb4970] bci=[2,29,213] rc=2 vc=902 vn=- li=8 udi=- nc=0 flg=0>
n185n     treetop                                                                             [    0x7fff51eb2990] bci=[2,32,213] rc=0 vc=902 vn=- li=8 udi=- nc=1
n184n       icall  java/lang/StringLatin1.indexOfChar([BIII)I[#443  final static Method] [flags 0x20500 0x0 ] ()  [    0x7fff51eb2940] bci=[2,32,213] rc=3 vc=902 vn=- li=8 udi=- nc=4 flg=0>
n286n         ==>aRegLoad
n287n         ==>iRegLoad
n285n         ==>iRegLoad
n284n         ==>iRegLoad
n221n     iRegStore gr2                                                                       [    0x7fff51eb34d0] bci=[0,16,3191] rc=0 vc=902 vn=- li=8 udi=- nc=1
n184n       ==>icall
n98n      BBEnd </block_8> =====                                                              [    0x7fff51eb0e60] bci=[0,16,3191] rc=0 vc=902 vn=- li=8 udi=- nc=1
n297n       GlRegDeps                                                                         [    0x7fff51eb4c90] bci=[0,16,3191] rc=1 vc=902 vn=- li=8 udi=- nc=1
n292n         PassThrough gr2                                                                 [    0x7fff51eb4b00] bci=[2,32,213] rc=1 vc=902 vn=- li=8 udi=- nc=1
n184n           ==>icall

Inlining is intentionally suppressed for StringLatin1.indexOfChar so that inlineDirectCall can recognize the call and generate the custom assembly.

ch is the first int parameter. It comes from parm 1 (n287n). No conversion to byte or anything like that so the original value is directly passed into the custom assembly.

Assembly still looks like this:

    0x7fffb86a0118 000000a4 [    0x7fffb31a5840] 3840ffff          32   li      gr2, FFFFFFFFFFFFFFFF
    0x7fffb86a011c 000000a8 [    0x7fffb31a58e0] 7f1ef800          32   cmpw    cr6, gr30, gr31
    0x7fffb86a0120 000000ac [    0x7fffb31a5980] 40980094          32   bge     cr6, Label L0132
    0x7fffb86a0124 000000b0 [    0x7fffb31a5a20] 7fc307b4          32   extsw   gr3, gr30
    0x7fffb86a0128 000000b4 [    0x7fffb31a5ac0] 7fe007b4          32   extsw   gr0, gr31
    0x7fffb86a012c 000000b8 [    0x7fffb31a5b60] 5785c63e          32   rlwinm  gr5, gr28, 0000000000000018, 00000000000000FF
    0x7fffb86a0130 000000bc [    0x7fffb31a5c10] 2f050000          32   cmpwi   cr6, gr5, 0
    0x7fffb86a0134 000000c0 [    0x7fffb31a5cb0] 409a0080          32   bne     cr6, Label L0132

An input of 0x80 stays as 0x80:

0x00007fffb86a012c in ?? ()
(gdb) p/x $r28
$7 = 0x80

This will pass the check at 0x7fffb86a0134. From what I can tell, it should always pass the check so those instructions are redundant here even if there correct result is generated.

So that's why it failed under JDK11 but not JDK17. Next step is to look over the generated code to make sure it still works after this check is removed.

zl-wang commented 4 months ago

jdk11 IL looks wrong? before you remove the codegen ch value check, let's make sure there is never a case where StringLatin1 indexOf with a UTF16 ch character.

IBMJimmyk commented 4 months ago

What's wrong with the JDK11 IL? I'm not seeing the problem.

In JDK11, com/ibm/jit/JITHelpers.intrinsicIndexOfLatin1(Ljava/lang/Object;BII)I the second parameter is ch and it must be a byte. So it can't be UTF16. All callers to intrinsicIndexOfLatin1 explicitly cast the parameter to byte before the call. For bytes 0x80 to 0xFF, the upper bits of the register are set to 1 via sign extension before the call. Since the upper bits are set to 1, the check will return the wrong result and return -1 early.

In JDK17, java/lang/StringLatin1.indexOfChar([BIII)I is only called from one place. That place checks that ch can be encoded in Latin1 before calling StringLatin1.indexOfChar. If it can't, the call is not made. The second parameter of indexOfChar is an int but it holds an unsigned byte value. So the upper bits at all 0 and only the lowest 8 bits can be non-zero. Since the upper bits are always 0, the check will always pass and continue on to the rest of the code.

zl-wang commented 4 months ago

if ch hadn't gone through i2b-then-b2i transformation but kept an unsigned value, it would work even with ch value check in place, wouldn't it?

i would take what you said: StringLatin1 ch argument is strictly a byte value, such that the checking code can be removed.

IBMJimmyk commented 4 months ago

Yes. ch is a signed integer but the data it holds is always between 0x00-0xFF and the upper bits are 0. If this was directly passed into JITHelpers.intrinsicIndexOfLatin1 as the byte parameter (so no i2b, b2i), it would work since the bits other than the lowest 8 bits are 0.

But, I think the first i2b is because inside indexOf, the int c is cast to a byte when used as the ch parameter: https://github.com/eclipse-openj9/openj9/blob/8d018d6d4d511004935ab37cb2a18e661606d867/jcl/src/java.base/share/classes/java/lang/String.java#L2078 Then the b2i is inserted to widen the byte to an int to pass it into the call to intrinsicIndexOfLatin1.

IBMJimmyk commented 4 months ago

I've created a WIP PR for my proposed fix: https://github.com/eclipse-openj9/openj9/pull/19503

While I expect this to work, I haven't fully tested the change yet so it is still marked WIP.

The change remove the check to see if the upper bits of ch are 0 since it is possible for ch to be valid even if they are not 0. Instead, it zeroes out the upper bits (excluding the lowest 8 bits). This is done since ch is eventually compared against a byte loaded via lbzx and a valid match will be overlooked unless byte data in ch is zero extended in advance.

IBMJimmyk commented 4 months ago

I ran a grinder with my fix. It was JDK11 on a Power 10 machine. All the tests passed: https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/40623/

So my proposed fix does fix this issue.