eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Segmentation-Error during code change in debug session #20229

Open c-koell opened 3 weeks ago

c-koell commented 3 weeks ago

Java -version output

openjdk version "21.0.4" 2024-07-16 LTS
IBM Semeru Runtime Open Edition 21.0.4.1 (build 21.0.4+7-LTS)
Eclipse OpenJ9 VM 21.0.4.1 (build openj9-0.46.1, JRE 21 Windows 10 amd64-64-Bit Compressed References 20240716_260 (JIT enabled, AOT enabled)
OpenJ9 - 4760d5d320
OMR - 840a9adba
JCL - db3fffb417c based on jdk-21.0.4+7)

Summary of problem

If I try to change the code during a debug session, the JVM crashes with a segmentation error.

Diagnostic files

Unhandled exception
Type=Segmentation error vmState=0x00000000
Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFBCF7ADB69 ContextFlags=0010005f
Handler1=00007FFBCF81E340 Handler2=00007FFBFEC3AC60 InaccessibleReadAddress=0000000000000818
RDI=0000004EFC3CF3C8 RSI=0000000000000090 RAX=0000000000000808 RBX=0000004EFC3CF3D0
RCX=0000000000000000 RDX=000001B06D7D76A2 R8=0000000000000000 R9=00000000000F4240
R10=0000000000989680 R11=000001B0776A9F8A R12=0000000000133778 R13=000000060543EBA0
R14=0000004EFC3CF3C8 R15=0000000000133770
RIP=00007FFBCF7ADB69 RSP=0000004EFC3CEF00 RBP=0000004EFC3CF3D8 EFLAGS=0000000000010206
FS=0053 ES=002B DS=002B
XMM0=43e0000000000000 (f: 0.000000, d: 9.223372e+18)
XMM1=4098dccccccccccd (f: 3435973888.000000, d: 1.591200e+03)
XMM2=4024000000000000 (f: 0.000000, d: 1.000000e+01)
XMM3=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM5=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM8=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM9=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM11=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM12=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM13=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
XMM15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=C:\Program Files\openjdk-21\bin\default\j9vm29.dll
Module_base_address=00007FFBCF750000 Offset_in_DLL=000000000005db69
Target=2_90_20240716_260 (Windows 10 10.0 build 19045)
CPU=amd64 (12 logical CPUs) (0x7ec12d000 RAM)
----------- Stack Backtrace -----------
(0x00007FFBCF7ADB69 [j9vm29+0x5db69])
(0x0000000000494580)
---------------------------------------

I also have dump files if needed.
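As a quick sanity check on the report above, the `Offset_in_DLL` value and the `j9vm29+0x5db69` frame in the backtrace can be reproduced by subtracting `Module_base_address` from `ExceptionAddress`. The following is only a minimal sketch of that arithmetic; the class and method names are hypothetical and are not part of OpenJ9.

```java
public class CrashOffset {
    // Hypothetical helper: offset of a faulting address inside its module.
    static long offsetInModule(long exceptionAddress, long moduleBase) {
        return exceptionAddress - moduleBase;
    }

    public static void main(String[] args) {
        long exceptionAddress = 0x00007FFBCF7ADB69L; // ExceptionAddress / RIP from the report
        long moduleBase = 0x00007FFBCF750000L;       // Module_base_address of j9vm29.dll
        // prints "j9vm29+0x5db69"
        System.out.printf("j9vm29+0x%x%n", offsetInModule(exceptionAddress, moduleBase));
    }
}
```

The result matches the `Offset_in_DLL=000000000005db69` line, so the faulting instruction is inside j9vm29.dll itself rather than in JIT-compiled code.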

pshipton commented 3 weeks ago

@tajila fyi

pshipton commented 3 weeks ago

Please provide the diagnostic files, the most important being the "core" file (not to be confused with the javacore).

c-koell commented 3 weeks ago

You can find the dump in the following cloud storage: https://tbox.tirol.gv.at/index.php/s/4GrR9CpkZrawBaJ with "7RDEMYbTGt" as the password.

pshipton commented 3 weeks ago

For some reason I can't open that link (or even https://tbox.tirol.gv.at); I get a timeout. I don't think it's my work network connection, because I can't open it from my phone (different network) either.

c-koell commented 3 weeks ago

@pshipton Oh, sorry. We are having some problems with a DDoS attack at the moment. I think we have activated some geo-blocking. I will check it and come back to you soon.

c-koell commented 3 weeks ago

@pshipton Where are you trying to connect from? EU and UK should work.

pshipton commented 3 weeks ago

We are in Canada.

c-koell commented 3 weeks ago

@pshipton I have checked the geo-blocking. You should be able to access the site now.

pshipton commented 3 weeks ago

Stack trace:

[0x0]   ntdll!ZwWaitForSingleObject+0x14   0x4efc3cc068   0x7ffc3154920e   
[0x1]   KERNELBASE!WaitForSingleObjectEx+0x8e   0x4efc3cc070   0x7ffbfec39096   
[0x2]   j9prt29!omrdump_create+0x306   0x4efc3cc110   0x7ffc1f0f5e33   
[0x3]   j9dmp29!doSystemDump+0xa3   0x4efc3cc1b0   0x7ffc1f0f5925   
[0x4]   j9dmp29!protectedDumpFunction+0x15   0x4efc3cc210   0x7ffbfec3b6f6   
[0x5]   j9prt29!runInTryExcept+0x16   0x4efc3cc240   0x7ffbfec3ce30   
[0x6]   j9prt29!omrsig_protect+0x210   0x4efc3cc280   0x7ffc1f0f38cc   
[0x7]   j9dmp29!runDumpFunction+0x6d   (Inline Function)   (Inline Function)   
[0x8]   j9dmp29!runDumpAgent+0x33c   0x4efc3cc460   0x7ffc1f10cec2   
[0x9]   j9dmp29!triggerDumpAgents+0x532   0x4efc3cc940   0x7ffbcf81f50f   
[0xa]   j9vm29!generateDiagnosticFiles+0x1ef   0x4efc3ccdc0   0x7ffbfec3b6f6   
[0xb]   j9prt29!runInTryExcept+0x16   0x4efc3cd280   0x7ffbfec3ce30   
[0xc]   j9prt29!omrsig_protect+0x210   0x4efc3cd2c0   0x7ffbcf81f712   
[0xd]   j9vm29!vmSignalHandler+0x1d2   0x4efc3cd4a0   0x7ffbfec3adb4   
[0xe]   j9prt29!mainVectoredExceptionHandler+0x154   0x4efc3ce390   0x7ffc33ee9b5c   
[0xf]   ntdll!RtlDeleteAce+0x3cc   0x4efc3ce550   0x7ffc33ec2376   
[0x10]   ntdll!RtlRaiseException+0x2a6   0x4efc3ce5f0   0x7ffc33f1143e   
[0x11]   ntdll!KiUserExceptionDispatcher+0x2e   0x4efc3ce800   0x7ffbcf7adb69   
[0x12]   j9vm29!VM_DebugBytecodeInterpreterCompressed::run+0x14a49   0x4efc3cef00   0x494580   
[0x13]   0x494580!+   0x4efc3cef08   0x0   

pshipton commented 3 weeks ago

@TobiAjila please take a look.

tajila commented 3 weeks ago

Looking at the core, I can see that we do a decompile of all thread stacks due to a SINGLE_STEP event. Afterwards it looks like we attempt to call a method handle and fail in linkToVirtual in the interpreter. The crash is caused by the memberName object being corrupt: instead of a memberName, it's a user object, at/gv/tirol/csb/sap/elko/ejb/NeuePersonalMassnahmeIT.

We've seen issues like this in the past, when we attempt to recreate the interpreter stack frames after OSR but run into problems because of corruption in the OSR buffer.
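For readers unfamiliar with the call path mentioned above, here is a minimal sketch of application code that invokes a virtual method through a method handle. In OpenJDK's `java.lang.invoke` implementation, a direct handle for a virtual method is dispatched via the internal `MethodHandle.linkToVirtual`, which receives a `MemberName` as its trailing argument. Whether the failing application reaches `linkToVirtual` this way is an assumption; the class names below are hypothetical, and this is an illustration, not a reproducer for the crash.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class VirtualHandleSketch {
    // Hypothetical user class; stands in for any application type.
    public static class Greeter {
        public String greet(String name) {
            return "hello " + name;
        }
    }

    // Looks up Greeter.greet as a virtual method handle and invokes it.
    static String callThroughHandle(Greeter greeter, String name) {
        try {
            MethodHandle handle = MethodHandles.lookup().findVirtual(
                    Greeter.class, "greet",
                    MethodType.methodType(String.class, String.class));
            // The receiver is the first argument of a virtual method handle.
            return (String) handle.invokeExact(greeter, name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughHandle(new Greeter(), "world"));
    }
}
```

If the trailing `MemberName` slot on the rebuilt interpreter stack holds some other object, as described for this core, the interpreter would read fields at offsets that are meaningless for that object, which is consistent with an inaccessible read at a small address.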

tajila commented 3 weeks ago

@hzongaro @nbhuiyan Can you please take a look?

hzongaro commented 2 weeks ago

@jdmpapin, is there any chance this is another instance of the problem fixed by your pull request #20232?

jdmpapin commented 2 weeks ago

I don't think so. SINGLE_STEP sounds like debug mode, i.e. FSD with pre-execution OSR, but argument stashing happens only for post-execution OSR.

c-koell commented 6 days ago

I recently got a segmentation error again with the latest version. I don't know if it is the same problem, but I have uploaded the dmp file to our cloud storage (issue-20229-2.zip): https://tbox.tirol.gv.at/index.php/s/4GrR9CpkZrawBaJ with "7RDEMYbTGt" as the password.