eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

JDK 17/18 aarch64_mac Segmentation error vmState=0x00000000 #15518

Closed JasonFengJ9 closed 2 years ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build job/Test_openjdk18_j9_sanity.openjdk_aarch64_mac_testList_0/54/ (macaarch64rt8):

openjdk version "18.0.1.1-ea" 2022-04-22
IBM Semeru Runtime Open Edition 18.0.1.1+2-202207080535 (build 18.0.1.1-ea+2-202207080535)
Eclipse OpenJ9 VM 18.0.1.1+2-202207080535 (build master-d8e08932f, JRE 18 Mac OS X aarch64-64-Bit 20220708_18 (JIT enabled, AOT enabled)
OpenJ9   - d8e08932f
OMR      - ff6a49823
JCL      - 8c019e1aa5 based on jdk-18.0.1.1+2)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2022-07-08T06:42:37.529Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2022-07-08T06:42:37.529Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops 

[2022-07-08T07:02:00.689Z] TEST: java/lang/String/CompactString/ToLowerCase.java

[2022-07-08T07:02:00.690Z] STDERR:
[2022-07-08T07:02:00.690Z] STATUS:Passed.
[2022-07-08T07:02:00.690Z] Unhandled exception
[2022-07-08T07:02:00.690Z] Type=Segmentation error vmState=0x00000000
[2022-07-08T07:02:00.690Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2022-07-08T07:02:00.690Z] Handler1=00000001026AA4AC Handler2=0000000102569EBC InaccessibleAddress=0000000104FFFFF8
[2022-07-08T07:02:00.690Z] x0=000000010D878700 x1=000000010D878700 x2=0000000157FB9A88 x3=0000000157FB9AA8
[2022-07-08T07:02:00.690Z] x4=0000000176832A28 x5=0000000176832A18 x6=0000000176832A10 x7=0000000176832A08
[2022-07-08T07:02:00.690Z] x8=00000002800B44C5 x9=0000000105000020 x10=0000000105000020 x11=0000000000000000
[2022-07-08T07:02:00.690Z] x12=0000000176832A28 x13=0000000176832A00 x14=00000001768DE660 x15=0000000176832DE0
[2022-07-08T07:02:00.690Z] x16=0000000176832A60 x17=00000000000008FD x18=0000000157FB9A68 x19=000000010D878700
[2022-07-08T07:02:00.690Z] x20=0000000105000020 x21=000000010D808220 x22=0000000176832C50 x23=0000000102E4BBE0
[2022-07-08T07:02:00.690Z] x24=0000000176832C50 x25=0000000102DCFA0C x26=0000000176832D50 x27=0000000176832D50
[2022-07-08T07:02:00.690Z] x28=000000010D87A668 x29(FP)=0000000176832A70 x30(LR)=0000000103BAD218 x31(SP)=0000000176832A60
[2022-07-08T07:02:00.690Z] PC=0000000103B98A74 SP=0000000176832A60
[2022-07-08T07:02:00.690Z] v0 0000000000009fe0 (f: 40928.000000, d: 2.022112e-319)
[2022-07-08T07:02:00.690Z] v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
[2022-07-08T07:02:00.690Z] v3 3faf0a32c01163a6 (f: 3222365184.000000, d: 6.062468e-02)
[2022-07-08T07:02:00.690Z] v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
[2022-07-08T07:02:00.690Z] v17 3fd56d73835b5555 (f: 2203800832.000000, d: 3.348054e-01)
[2022-07-08T07:02:00.690Z] v18 bf78306dc0633e7a (f: 3227729408.000000, d: -5.905560e-03)
[2022-07-08T07:02:00.690Z] v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
[2022-07-08T07:02:00.690Z] v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-07-08T07:02:00.690Z] v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-07-08T07:02:00.690Z] v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-07-08T07:02:00.690Z] v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-07-08T07:02:00.690Z] v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-07-08T07:02:00.690Z] Module=/Users/jenkins/workspace/Test_openjdk18_j9_sanity.openjdk_aarch64_mac_testList_0/openjdkbinary/j2sdk-image/Contents/Home/lib/default/libj9jit29.dylib
[2022-07-08T07:02:00.690Z] Module_base_address=0000000103548000 Symbol=old_slow_jitHandleInternalErrorTrap
[2022-07-08T07:02:00.690Z] Symbol_address=0000000103B98A58
[2022-07-08T07:02:00.690Z] Target=2_90_20220708_18 (Mac OS X 12.3)
[2022-07-08T07:02:00.690Z] CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
[2022-07-08T07:02:00.690Z] ----------- Stack Backtrace -----------
[2022-07-08T07:02:00.690Z] ---------------------------------------
[2022-07-08T07:02:00.690Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2022/07/08 02:45:56 - please wait.

[2022-07-08T07:02:00.691Z] TEST RESULT: Error. Program `/Users/jenkins/workspace/Test_openjdk18_j9_sanity.openjdk_aarch64_mac_testList_0/openjdkbinary/j2sdk-image/Contents/Home/bin/../bin/java' timed out (timeout set to 960000ms, elapsed time including timeout handling was 960961ms).
[2022-07-08T07:02:00.691Z] --------------------------------------------------
[2022-07-08T07:02:00.691Z] Test results: passed: 811; error: 1
[2022-07-08T07:02:00.691Z] Report written to /Users/jenkins/workspace/Test_openjdk18_j9_sanity.openjdk_aarch64_mac_testList_0/jvmtest/openjdk/report/html/report.html
[2022-07-08T07:02:00.691Z] Results written to /Users/jenkins/workspace/Test_openjdk18_j9_sanity.openjdk_aarch64_mac_testList_0/aqa-tests/TKG/output_16572625569199/jdk_lang_1/work
[2022-07-08T07:02:00.691Z] Error: Some tests failed or other problems occurred.
[2022-07-08T07:02:00.691Z] 
[2022-07-08T07:02:00.691Z] jdk_lang_1_FAILED

0xdaryl commented 2 years ago

@knn-k FYI

knn-k commented 2 years ago

x10 has the currentThread->sp value, and the code in old_slow_jitHandleInternalErrorTrap() tries to store 0 to the address x10 - 40 (= InaccessibleAddress 0x104FFFFF8).

 650a74: str     xzr, [x10, #-40]!

x10 - 40 is the address of resolveFrame in VM_VMHelpers::buildJITResolveFrameWithPC(). The size of J9SFJITResolveFrame is 40 bytes. https://github.com/eclipse-openj9/openj9/blob/22b3d875003ecc5d93960003d0a107447a780d2b/runtime/oti/VMHelpers.hpp#L1939-L1940
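
The address arithmetic above can be checked against the values in the crash report. A minimal sketch in Python: the constant and variable names are illustrative, and every number is copied from the register dump or the disassembly line in this thread.

```python
# Values copied from the first crash report in this issue.
X10 = 0x105000020                   # currentThread->sp, per the analysis above
INACCESSIBLE_ADDRESS = 0x104FFFFF8
PC = 0x103B98A74
MODULE_BASE = 0x103548000           # libj9jit29.dylib
SYMBOL_ADDRESS = 0x103B98A58        # old_slow_jitHandleInternalErrorTrap
RESOLVE_FRAME_SIZE = 40             # size of J9SFJITResolveFrame, as stated above

# "str xzr, [x10, #-40]!" is a pre-indexed store: it stores to x10 - 40.
store_address = X10 - RESOLVE_FRAME_SIZE

# Offset of the faulting instruction within the module and within the function.
module_offset = PC - MODULE_BASE
function_offset = PC - SYMBOL_ADDRESS

print(hex(store_address))    # 0x104fffff8
print(hex(module_offset))    # 0x650a74
print(hex(function_offset))  # 0x1c
```

The computed store address equals the reported InaccessibleAddress, and the module offset equals the 650a74 offset shown in the disassembly.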

knn-k commented 2 years ago

No memory is allocated at 0x104FFFFF8. There is an 8 MB memory block starting at 0x105000000, just above it.

> info mmap 0x105000020
Start Address           End Address             Size                    Size                            Read/Write/Execute
0x0000000105000000      0x00000001057fffff      0x0000000000800000      (8,388,608)
Name:   Image section @ 105000000 (8388608 bytes)
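
A small sketch of the same check in Python, using the start and end addresses from the info mmap output above. The Region type and its contains method are illustrative names, not part of jdmpview.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A mapped memory range with an inclusive end address."""
    start: int
    end: int

    def contains(self, address: int) -> bool:
        return self.start <= address <= self.end

    @property
    def size(self) -> int:
        return self.end - self.start + 1


# Range reported by "info mmap 0x105000020".
region = Region(start=0x105000000, end=0x1057FFFFF)

x10 = 0x105000020               # value of x10 in the register dump
fault_address = 0x104FFFFF8     # InaccessibleAddress from the crash report

x10_mapped = region.contains(x10)
fault_mapped = region.contains(fault_address)
bytes_below_start = region.start - fault_address

print(region.size, x10_mapped, fault_mapped, bytes_below_start)
```

The value in x10 lies 0x20 bytes into the block, while the faulting address falls 8 bytes below its start.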

knn-k commented 2 years ago

https://github.com/eclipse-openj9/openj9/issues/15352#issuecomment-1190403943 shows a similar failure with Symbol=old_slow_jitHandleInternalErrorTrap from a different testcase.

knn-k commented 2 years ago

Running the "!stack" command against "JIT Compilation Thread-000" shows the following error:

> !threads
        !stack 0x10d863900      !j9vmthread 0x10d863900 !j9thread 0x12d012850   tid 0x109ae6f (17411695) // (main)
        !stack 0x10d878700      !j9vmthread 0x10d878700 !j9thread 0x12d013260   tid 0x109ae91 (17411729) // (JIT Compilation Thread-000)
        !stack 0x12e023b00      !j9vmthread 0x12e023b00 !j9thread 0x10500e250   tid 0x109ae98 (17411736) // (JIT Diagnostic Compilation Thread-007 Suspended)
        !stack 0x10d01a100      !j9vmthread 0x10d01a100 !j9thread 0x10500e758   tid 0x109ae99 (17411737) // (JIT-SamplerThread)
        (... omitting some other threads ...)
>  !stack 0x10d878700
<10d878700>                             JNI call-in frame
<10d878700>                             known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT

<10d878700>     !j9method 0x0000000178AD9DE0   java/lang/ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
Aug 05, 2022 6:21:47 PM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x000000010D878700
com.ibm.j9ddr.AddressedCorruptDataException: Invalid JIT return address
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkStackFrames(JITStackWalker.java:287)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker.jitWalkStackFrames(JITStackWalker.java:101)
        (... snip ...)
        at openj9.dtfjview/com.ibm.jvm.dtfjview.DTFJView.main(DTFJView.java:46)

Stack walk result: STACK_CORRUPT

JasonFengJ9 commented 2 years ago

The same failure in another internal build (macaarch64rt8):

[2022-08-28T06:15:29.059Z] variation: Mode110
[2022-08-28T06:15:29.059Z] JVM_OPTIONS:  -Xjit -Xgcpolicy:gencon -Xnocompressedrefs 

[2022-08-28T06:21:17.670Z] MCL3 stderr Unhandled exception
[2022-08-28T06:21:17.670Z] MCL3 stderr Type=Segmentation error vmState=0x00000000
[2022-08-28T06:21:17.670Z] MCL3 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2022-08-28T06:21:17.670Z] MCL3 stderr Handler1=0000000104C39984 Handler2=0000000104A6DEBC InaccessibleAddress=0000000114FFFFF8
[2022-08-28T06:21:17.670Z] MCL3 stderr x0=0000000143823B00 x1=0000000143823B00 x2=000000037F755178 x3=000000016BC56A60
[2022-08-28T06:21:17.670Z] MCL3 stderr x4=000000016BC56A28 x5=000000016BC56A18 x6=000000016BC56A10 x7=000000016BC56A08
[2022-08-28T06:21:17.670Z] MCL3 stderr x8=000000015005BDFD x9=0000000115000020 x10=0000000115000020 x11=0000000000000000
[2022-08-28T06:21:17.670Z] MCL3 stderr x12=000000016BC56A28 x13=000000016BC56A00 x14=000000037F7551DC x15=000000016BC56DE0
[2022-08-28T06:21:17.670Z] MCL3 stderr x16=0000000104C7F25C x17=00000001F92D76D8 x18=000000037F755150 x19=0000000143823B00
[2022-08-28T06:21:17.670Z] MCL3 stderr x20=0000000115000020 x21=0000000122009420 x22=000000016BC56C50 x23=00000001053DB6A0
[2022-08-28T06:21:17.670Z] MCL3 stderr x24=000000016BC56C50 x25=000000010535F484 x26=000000016BC56D50 x27=000000016BC56D50
[2022-08-28T06:21:17.670Z] MCL3 stderr x28=0000000143825A68 x29(FP)=000000016BC56A70 x30(LR)=0000000106147B38 x31(SP)=000000016BC56A60
[2022-08-28T06:21:17.670Z] MCL3 stderr PC=0000000106133394 SP=000000016BC56A60
[2022-08-28T06:21:17.670Z] MCL3 stderr v0 000000000000cfa0 (f: 53152.000000, d: 2.626058e-319)
[2022-08-28T06:21:17.670Z] MCL3 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
[2022-08-28T06:21:17.670Z] MCL3 stderr v3 3fd89a3406c142db (f: 113328856.000000, d: 3.844118e-01)
[2022-08-28T06:21:17.670Z] MCL3 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
[2022-08-28T06:21:17.670Z] MCL3 stderr v7 0000000000000001 (f: 1.000000, d: 4.940656e-324)
[2022-08-28T06:21:17.670Z] MCL3 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
[2022-08-28T06:21:17.670Z] MCL3 stderr v17 3fd5580ea6555555 (f: 2790610176.000000, d: 3.334996e-01)
[2022-08-28T06:21:17.670Z] MCL3 stderr v18 bf45cc6310769755 (f: 276207456.000000, d: -6.652340e-04)
[2022-08-28T06:21:17.670Z] MCL3 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
[2022-08-28T06:21:17.670Z] MCL3 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-08-28T06:21:17.670Z] MCL3 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-08-28T06:21:17.670Z] MCL3 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-08-28T06:21:17.670Z] MCL3 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-08-28T06:21:17.670Z] MCL3 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-08-28T06:21:17.670Z] MCL3 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_testList_0/openjdkbinary/j2sdk-image/Contents/Home/lib/default/libj9jit29.dylib
[2022-08-28T06:21:17.670Z] MCL3 stderr Module_base_address=0000000105AE0000 Symbol=old_slow_jitHandleInternalErrorTrap
[2022-08-28T06:21:17.670Z] MCL3 stderr Symbol_address=0000000106133378
[2022-08-28T06:21:17.670Z] MCL3 stderr Target=2_90_20220827_42 (Mac OS X 12.3)
[2022-08-28T06:21:17.670Z] MCL3 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
[2022-08-28T06:21:17.670Z] MCL3 stderr ----------- Stack Backtrace -----------
[2022-08-28T06:21:17.670Z] MCL3 stderr ---------------------------------------
[2022-08-28T06:21:17.670Z] MCL3 stderr JVMDUMP039I Processing dump event "gpf", detail "" at 2022/08/28 02:21:16 - please wait.

[2022-08-28T08:20:53.367Z] STF 04:20:53.128 - Overall result: **FAILED**
[2022-08-28T08:20:53.367Z] 
[2022-08-28T08:20:53.367Z] SharedClasses.SCM23.MultiCL_0_FAILED

knn-k commented 2 years ago

InaccessibleAddress=0000000114FFFFF8

It tried to access an address where no memory is mapped. Same as before.
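
To confirm it is the same pattern, the relevant fields can be pulled out of the dump mechanically. A rough sketch in Python: the two log lines are copied from the report above, parse_fields is a hypothetical helper, and reading x10 as currentThread->sp is carried over from the analysis of the first crash rather than verified for this one.

```python
import re

# Two lines copied from the second crash report in this issue.
LOG = """\
[2022-08-28T06:21:17.670Z] MCL3 stderr Handler1=0000000104C39984 Handler2=0000000104A6DEBC InaccessibleAddress=0000000114FFFFF8
[2022-08-28T06:21:17.670Z] MCL3 stderr x8=000000015005BDFD x9=0000000115000020 x10=0000000115000020 x11=0000000000000000
"""


def parse_fields(log: str) -> dict:
    """Collect every NAME=HEX pair in the dump into a name -> int mapping."""
    pairs = re.findall(r"\b(\w+)=([0-9A-Fa-f]+)", log)
    return {name: int(value, 16) for name, value in pairs}


fields = parse_fields(LOG)
distance = fields["x10"] - fields["InaccessibleAddress"]

print(distance)  # 40
```

The faulting address is again 40 bytes below the value in x10, matching the first crash.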

I see the following output for one of the JIT Compilation Threads with jdmpview:

> !threads
        !stack 0x143823b00      !j9vmthread 0x143823b00 !j9thread 0x13201c460   tid 0x78625c6 (126232006) // (JIT Compilation Thread-003)
        !stack 0x12208b700      !j9vmthread 0x12208b700 !j9thread 0x14382ae50   tid 0x78625c7 (126232007) // (JIT Compilation Thread-004)
        !stack 0x143836100      !j9vmthread 0x143836100 !j9thread 0x14200b850   tid 0x78625cb (126232011) // (JIT Diagnostic Compilation Thread-007 Suspended)
        !stack 0x144010900      !j9vmthread 0x144010900 !j9thread 0x14200bd58   tid 0x78625ea (126232042) // (JIT-SamplerThread)
        !stack 0x12209ed00      !j9vmthread 0x12209ed00 !j9thread 0x14200c260   tid 0x78625eb (126232043) // (IProfiler)
        !stack 0x14394cf00      !j9vmthread 0x14394cf00 !j9thread 0x143949650   tid 0x78625f3 (126232051) // (Common-Cleaner)
        !stack 0x1220cd700      !j9vmthread 0x1220cd700 !j9thread 0x14380b650   tid 0x78625b3 (126231987) // (DestroyJavaVM helper thread)
> !stack 0x143823b00
<143823b00>                             JNI call-in frame
<143823b00>                             known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT

<143823b00>     !j9method 0x0000000118A9F788   java/lang/ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
Aug 31, 2022 10:18:45 AM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x0000000143823B00
com.ibm.j9ddr.AddressedCorruptDataException: Invalid JIT return address
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkStackFrames(JITStackWalker.java:287)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker.jitWalkStackFrames(JITStackWalker.java:101)
        (... snip ...)
        at openj9.dtfjview/com.ibm.jvm.dtfjview.DTFJView.main(DTFJView.java:46)

Stack walk result: STACK_CORRUPT

knn-k commented 2 years ago

I need ideas on how to debug this. What is the JIT compilation thread doing when it crashes? A native stack trace is unavailable on AArch64 macOS.

pshipton commented 2 years ago

Can the system core be opened in a native debugger to get a native stack trace?

knn-k commented 2 years ago

That does not work as expected, unfortunately. Running "thread backtrace all" in lldb against the core file shows only the top frame of each thread:

(lldb) thread backtrace all
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x000000019f42c8d0 libsystem_kernel.dylib`mach_msg_trap + 8
  thread #2, stop reason = signal SIGSTOP
    frame #0: 0x000000019f42e854 libsystem_kernel.dylib`__ulock_wait + 8
  thread #3, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8
  thread #4, stop reason = signal SIGSTOP
    frame #0: 0x000000019f45b584 libsystem_kernel.dylib`sem_wait + 8
  thread #5, stop reason = signal SIGSTOP
    frame #0: 0x000000019f433cbc libsystem_kernel.dylib`__wait4 + 8
  thread #6, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8
  thread #7, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8
  thread #8, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8
  thread #9, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8
  thread #10, stop reason = signal SIGSTOP
    frame #0: 0x000000019f430290 libsystem_kernel.dylib`__psynch_cvwait + 8

knn-k commented 2 years ago

I ran a 30x Grinder job with the -Xrs option, but it did not reproduce the failure (internal job/Grinder/27171/).

knn-k commented 2 years ago

Another Grinder job didn't crash either (job/Grinder/27189/).

knn-k commented 2 years ago

I should have looked at the javacore file earlier.

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "JIT Compilation Thread-003" J9VMThread:0x0000000143823B00, omrthread_t:0x000000013201C460, java/lang/Thread:0x000000028021D518, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x6, isDaemon:true)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x00000002802251B8)
3XMTHREADINFO1            (native thread ID:0x78625C6, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000060)
3XMTHREADINFO2            (native stack address range from:0x000000016BC00000, to:0x000000016BD03000, size:0x103000)
3XMCPUTIME               CPU usage total: 14.978065000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=65176 (0xFE98)
1INTERNAL                    Unable to obtain lock context information
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
    (... more than 300 recursive calls omitted ...)
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/Thread.interrupted(Thread.java(Compiled Code))
4XESTACKTRACE                at java/lang/Thread.exit(Thread.java:1392)
4XESTACKTRACE                at java/lang/J9VMInternals.threadCleanup(J9VMInternals.java:305)
3XMTHREADINFO3           No native callstack available for this thread

There is no information on the memory block starting from 0x115000020 in the javacore file.
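
The javacore values can be cross-checked against the register dump of the same crash. A minimal sketch in Python: the addresses are copied from the 3XMTHREADINFO2 line and from the MCL3 register dump, the variable names are illustrative, and treating x10 as the Java stack pointer is an assumption based on the earlier analysis.

```python
# Native stack range from the 3XMTHREADINFO2 line of the javacore.
STACK_FROM = 0x16BC00000
STACK_TO = 0x16BD03000
STACK_SIZE = 0x103000

# Registers from the crash report for the same thread.
SP = 0x16BC56A60
FP = 0x16BC56A70
X10 = 0x115000020   # assumed to be currentThread->sp, as in the first crash


def in_native_stack(address: int) -> bool:
    """True if the address lies inside the reported native stack range."""
    return STACK_FROM <= address < STACK_TO


size_matches = (STACK_TO - STACK_FROM) == STACK_SIZE
sp_in_stack = in_native_stack(SP)
fp_in_stack = in_native_stack(FP)
x10_in_stack = in_native_stack(X10)

print(size_matches, sp_in_stack, fp_in_stack, x10_in_stack)
```

SP and FP both fall inside the native stack of the compilation thread, while the value in x10 lies in a different region.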

knn-k commented 2 years ago

Issue #14621 looks the same.

ymanton commented 2 years ago

It doesn't make sense to me that a dedicated JIT compilation thread would be executing such Java code.

If writing to the code cache without calling pthread_jit_write_protect_np() causes a SIGBUS in compiled code, could the signal handler be confused and think that the compilation thread was trying to execute the compiled method?

The JIT signal handler for aarch64 is here:

https://github.com/eclipse-openj9/openj9/blob/27129748bb67a36edcae7ce612a491ab19176433/runtime/compiler/runtime/SignalHandler.c#L1928-L1941

This code appears to me to assume that it will only be called on application threads. It calls jitGetExceptionTableFromPC() and if it finds an exception table it calls jitHandleNullPointerExceptionTrap for SIGSEGV and jitHandleInternalErrorTrap for SIGBUS. I don't see any check for whether the current thread is an application or compilation thread.

The question is whether jitGetExceptionTableFromPC() returns an exception table, perhaps one for a previously compiled body that occupied the code cache space the current method is being written to.
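
To make the concern concrete, here is a toy model in Python of the dispatch as described in this comment. It is not the code in SignalHandler.c: the function dispatch, the check_thread_kind guard, the stale_lookup stand-in, and the example PC are all hypothetical, and only the two trap handler names come from the comment.

```python
from typing import Callable, Optional

NOT_HANDLED = "not handled"


def dispatch(sig: str,
             pc: int,
             lookup: Callable[[int], Optional[object]],
             is_compilation_thread: bool,
             check_thread_kind: bool = False) -> str:
    """Return the name of the trap handler a signal would be routed to."""
    # Hypothetical guard: refuse to treat faults on compilation threads
    # as faults in compiled Java code.
    if check_thread_kind and is_compilation_thread:
        return NOT_HANDLED
    # Stand-in for jitGetExceptionTableFromPC(): no table means the PC is
    # not considered to be inside a compiled method.
    if lookup(pc) is None:
        return NOT_HANDLED
    if sig == "SIGSEGV":
        return "jitHandleNullPointerExceptionTrap"
    if sig == "SIGBUS":
        return "jitHandleInternalErrorTrap"
    return NOT_HANDLED


def stale_lookup(pc: int) -> Optional[object]:
    """Models a lookup that still finds metadata for an old compiled body."""
    return object()


EXAMPLE_PC = 0x1000  # arbitrary placeholder, not taken from the crash report

without_guard = dispatch("SIGBUS", EXAMPLE_PC, stale_lookup,
                         is_compilation_thread=True)
with_guard = dispatch("SIGBUS", EXAMPLE_PC, stale_lookup,
                      is_compilation_thread=True, check_thread_kind=True)

print(without_guard)  # jitHandleInternalErrorTrap
print(with_guard)     # not handled
```

In this model, a SIGBUS on a compilation thread is routed to jitHandleInternalErrorTrap whenever the lookup succeeds, unless the thread kind is checked first.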

knn-k commented 2 years ago

"It doesn't make sense to me that a dedicated JIT compilation thread would be executing such Java code."

I agree it is confusing. What I know is the javacore file says "JIT Compilation Thread-003" is the current thread, and the thread has a long Java call stack.

"If writing to the code cache without calling pthread_jit_write_protect_np() causes a SIGBUS in compiled code, could the signal handler be confused and think that the compilation thread was trying to execute the compiled method?"

Writing to the code cache without permission is just one of the possible causes of a SIGBUS. I don't think we should assume that jitHandleInternalErrorTrap was called for a SIGBUS on the code cache. If the SIGBUS was caused by an access to the code cache, though, that would explain why this crash is reported only on macOS.

Something happened (SIGBUS) -> jitHandleInternalErrorTrap() was called -> SEGV by accessing currentThread->sp - 40 in jitHandleInternalErrorTrap()

knn-k commented 2 years ago

It is likely these macOS issues share the same problem: #14621, #15352, #15518 (this issue)

0xdaryl commented 2 years ago

Agreed, this is strange. Since SIGBUS signals should not normally occur, can you add some instrumentation to the SIGBUS case of the signal handler that Younes referenced above? It should confirm that a SIGBUS was indeed received and record the faulting address before we attempt to "handle" it.

JasonFengJ9 commented 2 years ago

A similar error occurred in an internal JDK17 build (macaarch64rt8). Putting it here unless advised otherwise.

[2022-09-08T07:20:08.683Z] variation: Mode110
[2022-09-08T07:20:08.683Z] JVM_OPTIONS:  -Xjit -Xgcpolicy:gencon -Xnocompressedrefs 

[2022-09-08T07:20:10.798Z] openjdk version "17.0.5-ea" 2022-10-18
[2022-09-08T07:20:10.798Z] IBM Semeru Runtime Open Edition 17.0.5.0-m1 (build 17.0.5-ea+5)
[2022-09-08T07:20:10.798Z] Eclipse OpenJ9 VM 17.0.5.0-m1 (build v0.35.0-release-1de4f14ba, JRE 17 Mac OS X aarch64-64-Bit 20221017_140 (JIT enabled, AOT enabled)
[2022-09-08T07:20:10.798Z] OpenJ9   - 1de4f14ba
[2022-09-08T07:20:10.798Z] OMR      - 938f0686f
[2022-09-08T07:20:10.798Z] JCL      - 37e17cdb684 based on jdk-17.0.5+5)

[2022-09-08T07:27:30.997Z] MCL2 03:27:30 >> Total classes loaded = 20001
[2022-09-08T07:27:30.997Z] MCL2 stderr Unhandled exception
[2022-09-08T07:27:30.997Z] MCL2 stderr Type=Segmentation error vmState=0x00000000
[2022-09-08T07:27:30.997Z] MCL2 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2022-09-08T07:27:30.997Z] MCL2 stderr Handler1=0000000102921AC8 Handler2=00000001025F1EBC InaccessibleAddress=0000000000000030
[2022-09-08T07:27:30.997Z] MCL2 stderr x0=0000600001524140 x1=0000000116013A68 x2=0000000000000000 x3=0000000000000000
[2022-09-08T07:27:30.997Z] MCL2 stderr x4=0000000000000000 x5=0000000000000000 x6=000000016DE4A670 x7=000000016DE4A668
[2022-09-08T07:27:30.997Z] MCL2 stderr x8=0000000000080000 x9=0000000000000000 x10=0000000000000000 x11=0000000000000000
[2022-09-08T07:27:30.997Z] MCL2 stderr x12=0000000000000000 x13=0000150000001500 x14=0000000000000001 x15=0000000000000000
[2022-09-08T07:27:30.997Z] MCL2 stderr x16=000000019F47E550 x17=00000001F92D76D8 x18=00000001028208C9 x19=0000000116013A68
[2022-09-08T07:27:30.997Z] MCL2 stderr x20=0000000116013A68 x21=0000000116011B00 x22=000000010252DE28 x23=0000000000000001
[2022-09-08T07:27:30.997Z] MCL2 stderr x24=0000000116011B00 x25=0000000102625944 x26=000000011600EE50 x27=000000010263A000
[2022-09-08T07:27:30.998Z] MCL2 stderr x28=000000010252DE28 x29(FP)=000000016DE4AAD0 x30(LR)=0000000103051B48 x31(SP)=000000016DE4AAC0
[2022-09-08T07:27:30.998Z] MCL2 stderr PC=0000000103051B78 SP=000000016DE4AAC0
[2022-09-08T07:27:30.998Z] MCL2 stderr v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v1 000000000000000a (f: 10.000000, d: 4.940656e-323)
[2022-09-08T07:27:30.998Z] MCL2 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
[2022-09-08T07:27:30.998Z] MCL2 stderr v3 0000000116011b00 (f: 369171200.000000, d: 2.304391e-314)
[2022-09-08T07:27:30.998Z] MCL2 stderr v4 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
[2022-09-08T07:27:30.998Z] MCL2 stderr v5 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
[2022-09-08T07:27:30.998Z] MCL2 stderr v6 000003b8000003f0 (f: 1008.000000, d: 2.020140e-311)
[2022-09-08T07:27:30.998Z] MCL2 stderr v7 00000370000003b8 (f: 952.000000, d: 1.867356e-311)
[2022-09-08T07:27:30.998Z] MCL2 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v16 0000001800000018 (f: 24.000000, d: 5.092790e-313)
[2022-09-08T07:27:30.998Z] MCL2 stderr v17 0000001800000018 (f: 24.000000, d: 5.092790e-313)
[2022-09-08T07:27:30.998Z] MCL2 stderr v18 0000001800000018 (f: 24.000000, d: 5.092790e-313)
[2022-09-08T07:27:30.998Z] MCL2 stderr v19 0000001800000018 (f: 24.000000, d: 5.092790e-313)
[2022-09-08T07:27:30.998Z] MCL2 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.998Z] MCL2 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.998Z] MCL2 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.998Z] MCL2 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.998Z] MCL2 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.998Z] MCL2 stderr Unhandled exception
[2022-09-08T07:27:30.998Z] MCL2 stderr Type=Segmentation error vmState=0xffffffff
[2022-09-08T07:27:30.998Z] MCL2 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2022-09-08T07:27:30.998Z] MCL2 stderr Handler1=0000000102921A7C Handler2=00000001025F1EBC InaccessibleAddress=0000000000000029
[2022-09-08T07:27:30.998Z] MCL2 stderr x0=000000011600F860 x1=000000012500A860 x2=0000000000000000 x3=000060000123C3E0
[2022-09-08T07:27:30.998Z] MCL2 stderr x4=0000000000000000 x5=00000000000001FA x6=000000016DFC5B70 x7=000000016DFC6608
[2022-09-08T07:27:30.998Z] MCL2 stderr x8=000000011600EE50 x9=0000000000000002 x10=0000000000000000 x11=00000000000000AB
[2022-09-08T07:27:30.998Z] MCL2 stderr x12=0000000000000001 x13=0000000090A3F7FB x14=0000000090C3F800 x15=000000000000007F
[2022-09-08T07:27:30.998Z] MCL2 stderr x16=000000019F4650FC x17=0000000010C00000 x18=000000037F6B09C8 x19=000000012500A860
[2022-09-08T07:27:30.999Z] MCL2 stderr x20=0000000000000000 x21=0000000000000000 x22=000000016DFC5B78 x23=000000000000000A
[2022-09-08T07:27:30.999Z] MCL2 stderr x24=0000000103228280 x25=000000010252DE28 x26=0000000000000000 x27=0000000080000002
[2022-09-08T07:27:30.999Z] MCL2 stderr x28=000060000123C3E0 x29(FP)=000000016DFC5840 x30(LR)=0000000102557C58 x31(SP)=000000016DFC57F0
[2022-09-08T07:27:30.999Z] MCL2 stderr PC=000000010255BFA0 SP=000000016DFC57F0
[2022-09-08T07:27:30.999Z] MCL2 stderr v0 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.999Z] MCL2 stderr v1 000000000000000c (f: 12.000000, d: 5.928788e-323)
[2022-09-08T07:27:30.999Z] MCL2 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
[2022-09-08T07:27:30.999Z] MCL2 stderr v3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v4 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
[2022-09-08T07:27:30.999Z] MCL2 stderr v5 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
[2022-09-08T07:27:30.999Z] MCL2 stderr v6 000003b8000003f0 (f: 1008.000000, d: 2.020140e-311)
[2022-09-08T07:27:30.999Z] MCL2 stderr v7 00000370000003b8 (f: 952.000000, d: 1.867356e-311)
[2022-09-08T07:27:30.999Z] MCL2 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
[2022-09-08T07:27:30.999Z] MCL2 stderr v17 3fd55a3117603555 (f: 392181088.000000, d: 3.336299e-01)
[2022-09-08T07:27:30.999Z] MCL2 stderr v18 bf5371fc1b480544 (f: 457704768.000000, d: -1.186844e-03)
[2022-09-08T07:27:30.999Z] MCL2 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
[2022-09-08T07:27:30.999Z] MCL2 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.999Z] MCL2 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.999Z] MCL2 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.999Z] MCL2 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
[2022-09-08T07:27:30.999Z] MCL2 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2022-09-08T07:27:30.999Z] MCL2 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_testList_0/openjdkbinary/j2sdk-image/Contents/Home/lib/default/libj9thr29.dylib
[2022-09-08T07:27:30.999Z] MCL2 stderr Module_base_address=0000000102550000 Symbol=omrthread_spinlock_acquire
[2022-09-08T07:27:30.999Z] MCL2 stderr Symbol_address=000000010255BF7C
[2022-09-08T07:27:30.999Z] MCL2 stderr Target=2_90_20221017_140 (Mac OS X 12.3)
[2022-09-08T07:27:30.999Z] MCL2 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
[2022-09-08T07:27:31.000Z] MCL2 stderr 
[2022-09-08T07:27:31.000Z] MCL2 stderr Unhandled exception in signal handler. Protected function: generateDiagnosticFiles (0x0)
[2022-09-08T07:27:31.000Z] MCL2 stderr 
[2022-09-08T07:27:31.000Z] MCL2 stderr 
[2022-09-08T07:27:31.000Z] MCL2 stderr Unhandled exception in signal handler. Protected function: reportThreadCrash (0x0)
[2022-09-08T07:27:31.000Z] MCL2 stderr 
[2022-09-08T07:27:31.000Z] STF 03:27:30.234 - **FAILED** Process MCL2 ended with exit code (255) and not the expected exit code/s (0)

[2022-09-08T07:27:33.523Z] SharedClasses.SCM23.MultiCL_0_FAILED
knn-k commented 2 years ago

I ran Grinder jobs for SCM23.MultiCL_0, but the crash did not reproduce in 80 runs in total (internal job/Grinder/27331, 27351).

knn-k commented 2 years ago

I tried two more Grinder jobs (40x each) for SharedClasses.SCM23.MultiCL_0. No crashes (internal job/Grinder/27363, 27454).

pshipton commented 2 years ago

Tentatively tagging this as a blocker for removing amac (aarch64_mac) from EA.

knn-k commented 2 years ago

Yet another Grinder job (40x) with no crash: job/Grinder/27459

knn-k commented 2 years ago

My local testing with debug prints (2 samples) shows the following:

It seems a thread tries to execute the jitted J9VMInternals::threadCleanup() without the appropriate permission, which results in the first call to jitARM64Handler(SIGBUS). I don't know how, but that seems to trigger the subsequent calls to the jitted ClassLoader::loadClass(), which also fail for lack of execute permission.

The next question would be: Who calls J9VMInternals::threadCleanup() without the permission in this case?

Akira1Saitoh commented 2 years ago

I suspect that there are some cases where the thread fails to restore write protection because a C/C++ exception is thrown between pthread_jit_write_protect_np() calls. In such a situation, the thread will not have execute permission for the code cache memory region after the exception is caught. I do not have any evidence that this is the root cause of this issue, though.

For example, some code in OMR::ARM64::CodeGenerator::doBinaryEncoding() allocates memory. If one of those allocations throws std::bad_alloc, the compilation thread would fail to call pthread_jit_write_protect_np(1). One way to prevent this would be to use RAII for the pthread_jit_write_protect_np() calls. https://github.com/eclipse/omr/blob/38e24339e79f9959b6ab8996f6b2ce10ed437141/compiler/aarch64/codegen/OMRCodeGenerator.cpp#L291-L336
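The RAII idea could be sketched roughly as below. This is only an illustration, not the code in OMR or in any PR: the names `JitWriteScope`, `setJitWriteProtect`, `lastProtectState`, `encodeIntoCodeCache`, and `compileOnce` are all hypothetical. The only real API used is `pthread_jit_write_protect_np()`, which is called only on aarch64 macOS; elsewhere the wrapper just records the requested state so the sketch can be compiled and checked on other platforms.

```cpp
#include <cassert>
#include <new>

#if defined(__APPLE__) && defined(__aarch64__)
#include <pthread.h>
#endif

// Hypothetical bookkeeping: the last state this thread requested (1 = write-protected).
static thread_local int lastProtectState = 1;

// Hypothetical wrapper around the real macOS call; a no-op toggle elsewhere.
static void setJitWriteProtect(int enabled) {
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(enabled);
#endif
    lastProtectState = enabled;
}

// RAII guard: the code cache is writable for this thread while the guard is
// alive, and write protection (execute permission) is restored in the
// destructor, which also runs during stack unwinding after an exception.
class JitWriteScope {
public:
    JitWriteScope() {
        assert(lastProtectState == 1 && "nested scopes are not handled in this sketch");
        setJitWriteProtect(0);
    }
    ~JitWriteScope() { setJitWriteProtect(1); }
    JitWriteScope(const JitWriteScope &) = delete;
    JitWriteScope &operator=(const JitWriteScope &) = delete;
};

// Hypothetical stand-in for an encoding step that may fail to allocate memory.
static void encodeIntoCodeCache(bool failAllocation) {
    JitWriteScope scope;
    if (failAllocation) {
        throw std::bad_alloc(); // simulated allocation failure mid-encoding
    }
    // ... instructions would be written into the code cache here ...
}

// Returns true if the (simulated) compilation succeeded.
static bool compileOnce(bool failAllocation) {
    try {
        encodeIntoCodeCache(failAllocation);
        return true;
    } catch (const std::bad_alloc &) {
        return false; // write protection was already restored by ~JitWriteScope
    }
}
```

With this shape, `compileOnce(true)` returns `false` and the thread still ends up with write protection re-enabled, because the guard's destructor runs before the `catch` block is entered.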

knn-k commented 2 years ago

I reproduced the crash with repeated calls to jitARM64Handler() in an internal Grinder job 27566.

The following is from the debug print added in jitARM64Handler().

[2022-09-15T00:49:38.395Z] MTM4 stderr @@1 sigType = 40, pcPtr = 1416320c0, excTable=0x110ef6a28
[2022-09-15T00:49:38.395Z] MTM4 stderr @@1 sigType = 40, pcPtr = 140a5a0fc, excTable=0x12022fe28
[2022-09-15T00:49:38.395Z] MTM4 stderr @@1 sigType = 40, pcPtr = 140a5a0fc, excTable=0x12022fe28

sigType is 40, which is J9PORT_SIG_FLAG_SIGBUS. The exception table is non-NULL. The pcPtr values in the debug print are equal to the start addresses of the jitted code for threadCleanup() and loadClass(), as the "info jitm" output from the core file shows:

start=0x1416320c0  end=0x141632bf4   java/lang/J9VMInternals::threadCleanup(Ljava/lang/Thread;)V
start=0x140a5a0fc  end=0x140a5a3e4   java/lang/ClassLoader::loadClass(Ljava/lang/String;)Ljava/lang/Class;
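
The matching of a pcPtr value against the "info jitm" ranges can be sketched as a simple range lookup. This is only an illustration: `JittedMethodRange`, `sampleRanges`, and `findMethod` are hypothetical names, the two ranges are copied from the output above, and treating `end` as exclusive is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One row of "info jitm" output: [start, end) is assumed to be half-open.
struct JittedMethodRange {
    std::uint64_t start;
    std::uint64_t end;
    std::string name;
};

// The two ranges quoted in this comment.
static std::vector<JittedMethodRange> sampleRanges() {
    return {
        {0x1416320c0ULL, 0x141632bf4ULL,
         "java/lang/J9VMInternals::threadCleanup(Ljava/lang/Thread;)V"},
        {0x140a5a0fcULL, 0x140a5a3e4ULL,
         "java/lang/ClassLoader::loadClass(Ljava/lang/String;)Ljava/lang/Class;"},
    };
}

// Returns the range containing pc, or nullptr if pc is not in any listed method.
static const JittedMethodRange *findMethod(const std::vector<JittedMethodRange> &ranges,
                                           std::uint64_t pc) {
    for (const JittedMethodRange &r : ranges) {
        if (pc >= r.start && pc < r.end) {
            return &r;
        }
    }
    return nullptr;
}
```

Looking up 0x1416320c0 finds threadCleanup() at offset 0 from its start address, and 0x140a5a0fc finds loadClass() the same way, which is consistent with the handler being entered at the very first instruction of each jitted method.
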
knn-k commented 2 years ago

I found an "Unable to locate JIT stack map" case in my local testing, as shown below.

    MTM4 15:41:20 >> Loaded 20000 classes...
    MTM4 stderr @@1 sigType = 40, pcPtr = 11823ab40, excTable=0x1702fc1e8 <- Debug print added in jitARM64Handler()
    MTM4 stderr JVMCDRT000E Unable to locate JIT stack map - aborting VM
    MTM4 stderr JVMCDRT001E Method: java/lang/J9VMInternals.threadCleanup(Ljava/lang/Thread;)V (0000000388009100)
    MTM4 stderr JVMCDRT002E Failing PC: 000000011823AB41 (offset 0000000000000001), metaData = 00000001702FC1E8
    STF 15:41:21.205 - **FAILED** Process MTM4 ended with exit code (255) and not the expected exit code/s (0)

The process for MTM4 was exiting, and jitARM64Handler() was called with SIGBUS for J9VMInternals.threadCleanup(). The VM terminated without calls to jitARM64Handler() for ClassLoader.loadClass() in this case.

knn-k commented 2 years ago

I opened PR #15907 as a fix for this issue. I think the PR also fixes some other intermittent issues on macOS.