eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

Assertion reading fieldID during Watched Fields #6849

Open AlenBadel opened 5 years ago

AlenBadel commented 5 years ago

Java -version output

openjdk version "1.8.0_222" OpenJDK Runtime Environment (build 1.8.0_222-b10) Eclipse OpenJ9 VM (build openj9-0.15.1, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190717_368 (JIT enabled, AOT enabled) OpenJ9 - 0f66c6431 OMR - ec782f26 JCL - f147086df1e based on jdk8u222-b10)

Summary of problem

Running the attached test case that uses a JVMTI agent causes an assertion. https://github.com/eclipse/openj9/blob/master/runtime/jvmti/jvmtiHook.c#L2473

This problem occurs only using the option -XX:-JITInlineWatches which disables JIT compilation of methods that contain watched fields. Passes adding -Xint.

I've been able to reproduce this assertion on Linux X86, and PPC64LE Linux. My concern is this might be an issue on aarch64, and arm where JITInlineWatches is disabled by default.

Steps to reproduce:

g++ -I<path to sdk>/include/ -I<path to sdk>/include/linux/ -fPIC -c test.cpp
g++ -shared -fPIC -o libtest.so test.o -lc

<path to sdk>/bin/javac Test.java
<path to sdk>/jre/bin/java -agentpath:./libtest.so -XX:-JITInlineWatches -Djava.library.path=<workspace path> Test

Diagnostic files

15:32:58.662 0x2361600 j9jvmti.424 * ASSERTION FAILED at jvmtiHook.c:2473: ((fieldID) != NULL) JVMDUMP039I Processing dump event "traceassert", detail "" at 2019/08/26 08:32:58 - please wait. JVMDUMP032I JVM requested System dump using '/root/ws/testfromgac/core.20190826.083258.23261.0001.dmp' in response to an event JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/share/apport/apport %p %s %c %d %P" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.23290.

JVMDUMP010I System dump written to /root/ws/testfromgac/core.20190826.083258.23261.0001.dmp JVMDUMP032I JVM requested Java dump using '/root/ws/testfromgac/javacore.20190826.083258.23261.0002.txt' in response to an event JVMDUMP010I Java dump written to /root/ws/testfromgac/javacore.20190826.083258.23261.0002.txt JVMDUMP032I JVM requested Snap dump using '/root/ws/testfromgac/Snap.20190826.083258.23261.0003.trc' in response to an event JVMDUMP010I Snap dump written to /root/ws/testfromgac/Snap.20190826.083258.23261.0003.trc JVMDUMP013I Processed dump event "traceassert", detail "".

Test case: benchmark.zip

Diagnostic files: diag.zip

gacholio commented 5 years ago

This may be an instance of the code that I want to delete - the old field lookup code is not 100% reliable. Does jitCheckForDataBreakpoint appear in the stack?

AlenBadel commented 5 years ago

This may be an instance of the code that I want to delete - the old field lookup code is not 100% reliable. Does jitCheckForDataBreakpoint appear in the stack?

Yes it does.

On PPC the backtrace looks like this.

0x00007f40123e3fac {libj9jvmti29.so}{jvmtiHookCheckForDataBreakpoint} [0x7f40115d4130] 0x00007f4013b917d0 {libj9hookable29.so}{J9HookDispatch} [0x7f40115d42a0] 0x00007f401327e618 {libj9jit29.so}{_ZN22TR_PPCTableOfConstants7initTOCEP11TR_FrontEndPN2TR14PersistentInfoEm} [0x7f40115d4390] 0x00007f4012ac784c {libj9jit29.so}{_ZN11TR_J9VMBase26getResolvedInterfaceMethodEP14J9ConstantPoolP19TR_OpaqueClassBlocki} [0x7f40115d4420] 0x00007f4012a4648c {libj9jit29.so}{_ZN2J920SymbolReferenceTable14checkUserFieldEPN2TR15SymbolReferenceE} [0x7f40115d4520] 0x00007f4012b01f04 {libj9jit29.so}{_ZN24TR_J9ByteCodeIlGenerator28calculateArrayElementAddressEN2TR8DataTypeEb} [0x7f40115d4630] 0x00007f4012b15580 {libj9jit29.so}{_ZN24TR_J9ByteCodeIlGenerator6walkerEPN2TR5BlockE} [0x7f40115d4710] 0x00007f4012aecd54 {libj9jit29.so}{_ZN24TR_J9ByteCodeIlGenerator20genDFPGetHWAvailableEv} [0x7f40115d58b0] 0x00007f4012aee704 {libj9jit29.so}{_ZN24TR_J9ByteCodeIlGenerator18genILFromByteCodesEv} [0x7f40115d5b20] 0x00007f4012aeed84 {libj9jit29.so}{_ZN24TR_J9ByteCodeIlGenerator18genILFromByteCodesEv} [0x7f40115d5bc0] 0x00007f4012e5f24c {libj9jit29.so}{_ZN3OMR20ResolvedMethodSymbol29insertStoresForDeadStackSlotsEPN2TR11CompilationER15TR_ByteCodeInfoPNS1_7Tre

gacholio commented 5 years ago

Then this is almost certainly the bad code I'm referring to. We will not be fixing it, as inline watches is the way forward.

AlenBadel commented 5 years ago

Then in some cases -XX:-JITInlineWatches will never be stable. Should the capabilities to even use that option be removed?

gacholio commented 5 years ago

That's the idea, but it's not going to happen until the ARM platforms support it.

0xdaryl commented 5 years ago

I can't say definitively, but I don't expect we'll be doing any FW work on AArch32/64 for at least 6 months. We'll get to it eventually, but it's been queued in priority sequence. :-)

AlenBadel commented 5 years ago

That's the idea, but it's not going to happen until the ARM platforms support it.

-XX:-JITInlineWatches Disables inline watches.

It's documented as a usable option. If disabling is unstable, then why don't we just remove this option from docs completely. On X/P/Z it is enabled by default, and we shouldn't give the user an option to disable it. I propose we remove this option from the docs completely so no one runs into this issue. https://eclipse.github.io/openj9-docs/xxjitinlinewatches/

I understand that watched fields will still be unstable for AArch32/64. From what I see these platforms are not officially supported. If need be we can create an issue to raise awareness that this will still be a problem until FW is supported on these platforms.

DanHeidinga commented 5 years ago

We don't expect any user to have to fiddle this option. The default (enabled) should give good performance and valid results to everyone.

What the option gives us is a safety hatch to allow users to fall back to the old (buggy) behaviour they are used to if they find a problem with the new field watch code.