eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Crash in isAnyClassLoadedFromPackage #20224

Open tajila opened 3 weeks ago

tajila commented 3 weeks ago

From @tam512

I saw the following crash when performing a checkpoint with the Open Liberty container image stg.icr.io/cp/olc/open-liberty-daily:kernel-slim-java17-openj9-ubi

podman build -t dt10-jms:ol-kernel-java17-x86 --cap-add=CHECKPOINT_RESTORE --cap-add=SYS_PTRACE --cap-add=SETPCAP --security-opt seccomp=unconfined -f Containerfile --no-cache --volume /root/libertyrepo:/opt/libertyrepo --build-arg FULL_IMAGE="false" --build-arg BASE_IMAGE="stg.icr.io/cp/olc/open-liberty-daily:kernel-slim-java17-openj9-ubi" .

................
.........................
STEP 14/14: RUN checkpoint.sh afterAppStart
Performing checkpoint --at=afterAppStart

Launching defaultServer (Open Liberty 24.0.0.10/wlp-1.0.94.cl241020240923-1902) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT   ] CWWKE0001I: The server defaultServer has been launched.
[AUDIT   ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/keystore.xml
[AUDIT   ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/open-default-port.xml
[ERROR   ] CWWKG0075E: The value ${env.dbPort} is not valid for attribute portNumber of configuration element dataSource. The validation message was: Value "${env.dbPort}" is not a number..
[AUDIT   ] CWWKZ0058I: Monitoring dropins for applications.
[ERROR   ] CWWKE0702E: Could not resolve module: io.openliberty.connectors.security.internal.inbound [357]
  Unresolved requirement: Import-Package: jakarta.security.auth.message.callback; version="[2.0.0,4.0.0)"

Unhandled exception
Type=Segmentation error vmState=0x00000000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000080
Handler1=00007F01EE553620 Handler2=00007F01EFB86740 InaccessibleAddress=0000000000000000
RDI=136A3F40BADBAD20 RSI=00007F0192044710 RAX=136A3F40BADBAD20 RBX=136A3F40BADBAD20
RCX=0000000000000001 RDX=00007F0192044710 R8=0000000000000000 R9=00000000000000C8
R10=FFFFFFFFFFFFF346 R11=00007F01EF0DD8C0 R12=FFFFFFFFFFFFFFFF R13=00007F016C0190E0
R14=00007F01920447B0 R15=00007F016C050D10
RIP=00007F01EE6B68C0 GS=0000 FS=0000 RSP=00007F01920446B8
EFlags=0000000000010202 CS=0033 RBP=00007F0192044710 ERR=0000000000000000
TRAPNO=000000000000000D OLDMASK=0000000000000000 CR2=0000000000000000
xmm0=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1=2f617472616b616a (f: 1634427264.000000, d: 1.840127e-80)
xmm2=00000000ff000000 (f: 4278190080.000000, d: 2.113707e-314)
xmm3=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm4=ff00000000000000 (f: 0.000000, d: -5.486124e+303)
xmm5=61614c2800390043 (f: 3735619.000000, d: 1.215936e+161)
xmm6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm8=00000000ff00ff00 (f: 4278255360.000000, d: 2.113739e-314)
xmm9=69617274736e6f43 (f: 1936617344.000000, d: 4.173401e+199)
xmm10=3b3d2c723d282e3d (f: 1026043456.000000, d: 2.413185e-23)
xmm11=0000015200000151 (f: 337.000000, d: 7.172346e-312)
xmm12=0000013d00000140 (f: 320.000000, d: 6.726727e-312)
xmm13=000001380000013f (f: 319.000000, d: 6.620627e-312)
xmm14=0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
xmm15=000001420000013b (f: 315.000000, d: 6.832826e-312)
Module=/opt/java/openjdk/lib/default/libj9vm29.so
Module_base_address=00007F01EE50F000
Target=2_90_20240716_840 (Linux 5.15.0-122-generic)
CPU=amd64 (4 logical CPUs) (0x1f016c000 RAM)
----------- Stack Backtrace -----------
packageNameLength+0x0 (0x00007F01EE6B68C0 [libj9vm29.so+0x1a78c0])
getPackageName+0x1f (0x00007F01EE6B693F [libj9vm29.so+0x1a793f])
classHashGetName.constprop.0+0x107 (0x00007F01EE575BB7 [libj9vm29.so+0x66bb7])
classHashEqualFn+0x71 (0x00007F01EE576841 [libj9vm29.so+0x67841])
hashTableFind+0xd8 (0x00007F01EE6BD5F8 [libj9vm29.so+0x1ae5f8])
isAnyClassLoadedFromPackage+0x22 (0x00007F01EE577492 [libj9vm29.so+0x68492])
JVM_DefineModule+0x1840 (0x00007F01EFC0BE20 [libjvm.so+0x14e20])
 (0x00007F01718A67E9 [<unknown>+0x0])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2024/09/24 19:13:38 - please wait.
JVMDUMP032I JVM requested System dump using '/opt/ol/wlp/output/defaultServer/core.20240924.191338.1025.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" specifies that the core dump is to be piped to an external program.  Attempting to rename either core or core.1107.  Review the manual for the external program to find where the core dump is written and ensure the program does not truncate it.

JVMPORT049I The core file created by child process with pid = 1107 was not found. Review the documentation for the /proc/sys/kernel/core_pattern program "|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" to find where the core file is written and ensure that program does not truncate it.

JVMDUMP012E Error in System dump: /opt/ol/wlp/output/defaultServer/core.20240924.191338.1025.0001.dmp
JVMDUMP027W The requested heapdump has not been produced because another component is holding the VM exclusive lock.
JVMDUMP032I JVM requested Java dump using '/opt/ol/wlp/output/defaultServer/javacore.20240924.191338.1025.0003.txt' in response to an event
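
For anyone triaging without the core file, a minimal sketch of three sanity checks that can be made from the text of the report alone. All values are copied from the log above; the helper names (`is_canonical`, `as_ascii`, `frame_offsets`) are hypothetical, and the canonical-address check assumes 48-bit (4-level paging) x86-64 virtual addresses. On x86, TRAPNO=0xD is the general-protection fault vector and Signal_Code=0x80 is SI_KERNEL on Linux, which is consistent with a dereference of a non-canonical pointer such as the value in RDI/RAX/RBX; that would also explain why InaccessibleAddress and CR2 are reported as zero.

```python
import re

# Values copied from the crash report above.
MODULE_BASE = 0x00007F01EE50F000  # Module_base_address of libj9vm29.so
RDI = 0x136A3F40BADBAD20          # also seen in RAX and RBX at the fault
XMM1 = 0x2F617472616B616A         # value printed for xmm1

# Three frames from the stack backtrace, verbatim.
BACKTRACE = """\
packageNameLength+0x0 (0x00007F01EE6B68C0 [libj9vm29.so+0x1a78c0])
getPackageName+0x1f (0x00007F01EE6B693F [libj9vm29.so+0x1a793f])
isAnyClassLoadedFromPackage+0x22 (0x00007F01EE577492 [libj9vm29.so+0x68492])
"""


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


def as_ascii(value: int, width: int = 8) -> str:
    """Interpret a register value as little-endian ASCII bytes."""
    return value.to_bytes(width, "little").decode("ascii", errors="replace")


FRAME = re.compile(
    r"^(\S+)\+0x[0-9a-f]+ \(0x([0-9A-F]+) \[(\S+)\+0x([0-9a-f]+)\]\)$"
)


def frame_offsets(text: str, base: int):
    """Return (symbol, address - base, offset printed in the log) per frame."""
    rows = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            symbol, addr, _module, printed = match.groups()
            rows.append((symbol, int(addr, 16) - base, int(printed, 16)))
    return rows


if __name__ == "__main__":
    print("RDI canonical:", is_canonical(RDI))
    print("xmm1 as text:", repr(as_ascii(XMM1)))
    for symbol, computed, printed in frame_offsets(BACKTRACE, MODULE_BASE):
        print(f"{symbol}: computed 0x{computed:x}, printed 0x{printed:x}")
```

The offsets computed from Module_base_address agree with the `libj9vm29.so+0x...` values in the backtrace, so those offsets can be used directly against a matching build's symbols. The xmm1 bytes reading as a package-name prefix is only an observation about register contents at the time of the fault, not evidence of which package lookup triggered the crash.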

tajila commented 3 weeks ago

Looks very similar to https://github.com/eclipse-openj9/openj9/issues/18907

tajila commented 3 weeks ago

If this is reproducible, it will become a blocker. So far it has only been seen once in Liberty testing. Waiting for more diagnostics.

tajila commented 2 weeks ago

The crash has not been reproduced yet, so I'll move this to the next milestone for the time being.