OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

Open Liberty JIT compiler crashes with segmentation error while trying to get a ConnectionRequestInfo from PoolManager #29486

Open timmalich opened 3 months ago

timmalich commented 3 months ago

Describe the bug
Our Open Liberty instance crashes about 3 minutes after startup with the dump below. We use the latest full version from Docker (library/open-liberty:full), which resolves to Open Liberty 24.0.0.8.

Investigation status:

  1. The issue suddenly appeared last week. We first assumed that a change in our code base had caused it, but that does not seem to be the case: we went back through about two years of history step by step, and the crash remained.
  2. As a workaround, we temporarily added the -Xnojit parameter to jvm.options. This obviously slows performance significantly, but Open Liberty otherwise ran just fine.
  3. We then did the same for the underlying Open Liberty image and found that the issue starts with Open Liberty 24.0.0.8; Open Liberty 24.0.0.7 runs fine.

Exception / Dump:

Unhandled exception
Type=Segmentation error vmState=0x000501ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=000076B4DFE37F50 Handler2=000076B4E40D1740 InaccessibleAddress=0000000099669966
RDI=0000000001F64278 RSI=0000000001F64278 RAX=0000000099669966 RBX=000000000220D500
RCX=0000000001F64700 RDX=0000000000000200 R8=0000000000000200 R9=000076B3F8650A00
R10=0000000001F64700 R11=0000000001F64700 R12=0000000000000156 R13=000076B4DF9C5AC0
R14=0000000001F6F040 R15=000000000220D500
RIP=000076B4DF6EFBF7 GS=0000 FS=0000 RSP=000076B4D3FF4C20
EFlags=0000000000010206 CS=0033 RBP=000076B4CC008050 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000099669966
xmm0=0000000000005800 (f: 22528.000000, d: 1.113031e-319)
xmm1=0000000001f64700 (f: 32917248.000000, d: 1.626328e-316)
xmm2=0000000000000156 (f: 342.000000, d: 1.689705e-321)
xmm3=6176616a00136f66 (f: 1273702.000000, d: 3.146501e+161)
xmm4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm5=0000000000000017 (f: 23.000000, d: 1.136351e-322)
xmm6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm7=000076b4b316e430 (f: 3004621824.000000, d: 6.448469e-310)
xmm8=ffffff0000000000 (f: 0.000000, d: -nan)
xmm9=6c6c6c686c6c6c63 (f: 1819044992.000000, d: 1.913745e+214)
xmm10=6c6c6c254557030a (f: 1163330304.000000, d: 1.913676e+214)
xmm11=0000015200000151 (f: 337.000000, d: 7.172346e-312)
xmm12=0000013d00000140 (f: 320.000000, d: 6.726727e-312)
xmm13=000001380000013f (f: 319.000000, d: 6.620627e-312)
xmm14=0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
xmm15=000001420000013b (f: 315.000000, d: 6.832826e-312)
Module=/opt/java/openjdk/lib/amd64/default/libj9jit29.so
Module_base_address=000076B4DECF2000

Method_being_compiled=com/ibm/ejs/j2c/PoolManager.reserve(Ljavax/resource/spi/ManagedConnectionFactory;Ljavax/security/auth/Subject;Ljavax/resource/spi/ConnectionRequestInfo;Ljava/lang/Object;ZZII)Lcom/ibm/ws/j2c/MCWrapper;
Target=2_90_20240802_1004 (Linux 6.8.0-38-generic)
CPU=amd64 (8 logical CPUs) (0x5dca6e000 RAM)
----------- Stack Backtrace -----------
jitGetInterfaceMethodFromCP+0x17 (0x000076B4DF6EFBF7 [libj9jit29.so+0x9fdbf7])
_ZN11TR_J9VMBase26getResolvedInterfaceMethodEP14J9ConstantPoolP19TR_OpaqueClassBlocki+0x8b (0x000076B4DEF1021B [libj9jit29.so+0x21e21b])
_ZN19TR_ResolvedJ9Method26getResolvedInterfaceMethodEPN2TR11CompilationEP19TR_OpaqueClassBlocki+0x4c (0x000076B4DEF0735C [libj9jit29.so+0x21535c])
_ZN19CollectImplementors13visitSubclassEP22TR_PersistentClassInfo+0xd9 (0x000076B4DEEF3AD9 [libj9jit29.so+0x201ad9])
_ZN18TR_SubclassVisitor15visitSubclassesEP22TR_PersistentClassInfoRN10TR_CHTable12VisitTrackerE.localalias+0x5e (0x000076B4DEEF4F6E [libj9jit29.so+0x202f6e])
_ZN18TR_SubclassVisitor5visitEP19TR_OpaqueClassBlockb+0x118 (0x000076B4DEEF5148 [libj9jit29.so+0x203148])
_ZN15TR_ClassQueries25collectImplementorsCappedEP22TR_PersistentClassInfoPP17TR_ResolvedMethodiiS3_PN2TR11CompilationEb13TR_YesNoMaybe+0x146 (0x000076B4DEEF5336 [libj9jit29.so+0x203336])
_ZN20TR_PersistentCHTable21findSingleImplementerEP19TR_OpaqueClassBlockiP17TR_ResolvedMethodPN2TR11CompilationEb13TR_YesNoMaybeb+0x91 (0x000076B4DEF2CDD1 [libj9jit29.so+0x23add1])
_ZN22TR_J9InterfaceCallSite22findCallSiteTargetImplEP12TR_CallStackP14TR_InlinerBaseP19TR_OpaqueClassBlock+0x146 (0x000076B4DF0B36B6 [libj9jit29.so+0x3c16b6])
_ZN22TR_J9InterfaceCallSite18findCallSiteTargetEP12TR_CallStackP14TR_InlinerBase+0x3a (0x000076B4DF0B3DFA [libj9jit29.so+0x3c1dfa])
_ZN14TR_InlinerBase29getSymbolAndFindInlineTargetsEP12TR_CallStackP11TR_CallSiteb+0x46d (0x000076B4DF37251D [libj9jit29.so+0x68051d])
_ZN28TR_MultipleCallTargetInliner17inlineCallTargetsEPN2TR20ResolvedMethodSymbolEP12TR_CallStackP24TR_InnerPreexistenceInfo+0x4dd (0x000076B4DF09381D [libj9jit29.so+0x3a181d])
_ZN14TR_InlinerBase15performInliningEPN2TR20ResolvedMethodSymbolE+0xae (0x000076B4DF377DDE [libj9jit29.so+0x685dde])
_ZN10TR_Inliner7performEv+0x142 (0x000076B4DF08BAF2 [libj9jit29.so+0x399af2])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii.localalias+0x855 (0x000076B4DF48E195 [libj9jit29.so+0x79c195])
_ZN3OMR9Optimizer8optimizeEv+0x1b3 (0x000076B4DF48FEB3 [libj9jit29.so+0x79deb3])
_ZN3OMR11Compilation7compileEv+0xa25 (0x000076B4DF27FE45 [libj9jit29.so+0x58de45])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4bf (0x000076B4DEE697AF [libj9jit29.so+0x1777af])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x381 (0x000076B4DEE6A7E1 [libj9jit29.so+0x1787e1])
omrsig_protect+0x239 (0x000076B4E40D23C9 [libj9prt29.so+0x2a3c9])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x372 (0x000076B4DEE68322 [libj9jit29.so+0x176322])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x128 (0x000076B4DEE68668 [libj9jit29.so+0x176668])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x35b (0x000076B4DEE6758B [libj9jit29.so+0x17558b])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x000076B4DEE678F2 [libj9jit29.so+0x1758f2])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x000076B4DEE679A2 [libj9jit29.so+0x1759a2])
omrsig_protect+0x239 (0x000076B4E40D23C9 [libj9prt29.so+0x2a3c9])
_Z21compilationThreadProcPv+0x17b (0x000076B4DEE67D6B [libj9jit29.so+0x175d6b])
thread_wrapper+0x163 (0x000076B4E409A3A3 [libj9thr29.so+0xb3a3])
(0x000076B4E45A6AC3 [libc.so.6+0x94ac3])
clone+0x44 (0x000076B4E4637A04 [libc.so.6+0x125a04])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2024/08/19 13:03:53 - please wait.
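
For triage, the key fields of the dump above can be pulled out with a short script. The following is a minimal Python sketch and not part of the original report: the helper names `field` and `summarize_dump` and the abridged `DUMP` excerpt are illustrative assumptions, while the values themselves are copied from the dump.

```python
import re

# Abridged excerpt from the dump above: three header fields plus the first two frames.
DUMP = """\
Type=Segmentation error vmState=0x000501ff
Module=/opt/java/openjdk/lib/amd64/default/libj9jit29.so
Method_being_compiled=com/ibm/ejs/j2c/PoolManager.reserve(Ljavax/resource/spi/ManagedConnectionFactory;Ljavax/security/auth/Subject;Ljavax/resource/spi/ConnectionRequestInfo;Ljava/lang/Object;ZZII)Lcom/ibm/ws/j2c/MCWrapper;
----------- Stack Backtrace -----------
jitGetInterfaceMethodFromCP+0x17 (0x000076B4DF6EFBF7 [libj9jit29.so+0x9fdbf7])
_ZN11TR_J9VMBase26getResolvedInterfaceMethodEP14J9ConstantPoolP19TR_OpaqueClassBlocki+0x8b (0x000076B4DEF1021B [libj9jit29.so+0x21e21b])
---------------------------------------
"""

# One backtrace frame has the shape: symbol+offset (address [library+offset])
FRAME_RE = re.compile(
    r"^(\S+)\+0x[0-9a-fA-F]+ \(0x[0-9a-fA-F]+ \[([^+\]]+)\+0x[0-9a-fA-F]+\]\)$",
    re.MULTILINE,
)


def field(text, name):
    """Return the value of a 'name=value' header line, or None if absent."""
    match = re.search(rf"^{re.escape(name)}=(.*)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None


def summarize_dump(text):
    """Collect crash type, faulting module, method being compiled, and top frame."""
    crash_type = field(text, "Type")
    module = field(text, "Module")
    method = field(text, "Method_being_compiled")
    frames = FRAME_RE.findall(text)  # list of (symbol, library) tuples
    return {
        # Drop the trailing vmState=... part of the Type line.
        "type": crash_type.split(" vmState=")[0] if crash_type else None,
        # Keep only the file name of the faulting shared library.
        "module": module.rsplit("/", 1)[-1] if module else None,
        # Keep class and method name, drop the JVM type signature.
        "method": method.split("(", 1)[0] if method else None,
        "top_frame": frames[0] if frames else None,
    }


summary = summarize_dump(DUMP)
for key, value in summary.items():
    print(f"{key}: {value}")
```

For this dump, the summary points at libj9jit29.so as the faulting module while PoolManager.reserve was being compiled, which is consistent with the observation that -Xnojit avoids the crash.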

Steps to Reproduce
Currently not that easy: the code base is quite large, and it is hard to extract a small reproducer.

Expected behavior
Open Liberty does not crash.

Diagnostic information:

Additional context
Add any other context about the problem knödeln here.

DevBoxFanBoy commented 3 months ago

What does "knödeln" mean?

timmalich commented 3 months ago

DO NOT DOWNLOAD, EXTRACT, AND RUN THE STUPID fix.zip FROM THE MALICIOUS COMMENTS ABOVE! I hope they will be deleted soon. All the users have already been reported. I downloaded the file and decrypted it with the given password on a spare PC. It contains an x86_64-w64-ranlib.exe and an msvcp140.dll.

dgeissl commented 3 months ago

We narrowed down very similar issues to an update of the official image openliberty/open-liberty:24.0.0.6-full-java8-openj9-ubi on 21 August at 7:40 PM CEST. With that update, the OpenJ9 version changed in our case from 1.8.0_412-b08 to 1.8.0_422-b05 (see https://github.com/eclipse-openj9/openj9/milestone/53).

We have tried many other compositions since:

without success. For now, we hope that the fix for https://github.com/eclipse-openj9/openj9/issues/20012 will be released soon, so we can double-check whether the issue is resolved for us.

evia-blank commented 3 months ago

The problem seems to affect all images that have been updated recently. 24.0.0.7 has not been updated, so fortunately it still works.