eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

No symbol names in javacore native callstack (backtrace) on Linux #7995

Closed zheddie closed 2 years ago

zheddie commented 4 years ago

Java -version output

openjdk version "11.0.4" 2019-07-16 OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.4+11) Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.15.1, JRE 11 Linux amd64-64-Bit Compressed References 20190717_286 (JIT enabled, AOT enabled) OpenJ9 - 0f66c6431 OMR - ec782f26 JCL - fa49279450 based on jdk-11.0.4+11)

OS

$ uname -a
Linux oc8176481510.ibm.com 3.10.0-1062.4.1.el7.x86_64 #1 SMP Wed Sep 25 09:42:57 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux

Summary of problem

java -Xdump:java+system:events=vmstop zHello The stack in javacore file is as below. Most of the frame do not have symbol name . While we can get them from the system core dump as below:

3XMTHREADINFO      "DestroyJavaVM helper thread" J9VMThread:0x0000000001DC2800, omrthread_t:0x00007F1ECC006BD0, java/lang/Thread:0x00000007FFFA0188, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x14, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x1796, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00001060)
3XMTHREADINFO2            (native stack address range from:0x00007F1ED0798000, to:0x00007F1ED0F99000, size:0x801000)
3XMCPUTIME               CPU usage total: 0.256747226 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=4272 (0x10B0)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0x00007F1ECB8309D2 [libj9prt29.so+0x3a9d2])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1ECB830A4E [libj9prt29.so+0x3aa4e])
4XENATIVESTACK               (0x00007F1ECB830B44 [libj9prt29.so+0x3ab44])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1ECB8308AB [libj9prt29.so+0x3a8ab])
4XENATIVESTACK               (0x00007F1ECB82CF32 [libj9prt29.so+0x36f32])
4XENATIVESTACK               (0x00007F1ECB82DCD4 [libj9prt29.so+0x37cd4])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1EC4E9CFF6 [libj9dmp29.so+0x19ff6])
4XENATIVESTACK               (0x00007F1EC4E9D18D [libj9dmp29.so+0x1a18d])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1EC4E998DD [libj9dmp29.so+0x168dd])
4XENATIVESTACK               (0x00007F1EC4E953BD [libj9dmp29.so+0x123bd])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1EC4E961DE [libj9dmp29.so+0x131de])
4XENATIVESTACK               (0x00007F1EC4E9F53C [libj9dmp29.so+0x1c53c])
4XENATIVESTACK               (0x00007F1EC4E87C8D [libj9dmp29.so+0x4c8d])
4XENATIVESTACK               (0x00007F1EC4E872A5 [libj9dmp29.so+0x42a5])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1EC4E8A89B [libj9dmp29.so+0x789b])
4XENATIVESTACK               (0x00007F1EC4E8AA1C [libj9dmp29.so+0x7a1c])
4XENATIVESTACK               (0x00007F1EC4EA115B [libj9dmp29.so+0x1e15b])
4XENATIVESTACK               (0x00007F1EC4E9F606 [libj9dmp29.so+0x1c606])
4XENATIVESTACK               (0x00007F1ECBA5B318 [libj9hookable29.so+0x1318])
4XENATIVESTACK               (0x00007F1ECBD0B558 [libj9vm29.so+0xab558])
4XENATIVESTACK               (0x00007F1ECB815283 [libj9prt29.so+0x1f283])
4XENATIVESTACK               (0x00007F1ECBD0A9D9 [libj9vm29.so+0xaa9d9])
4XENATIVESTACK               (0x00007F1ED0559EFA [libjvm.so+0x17efa])
4XENATIVESTACK               (0x00007F1ED1A87E14 [libjli.so+0x4e14])
4XENATIVESTACK               (0x00007F1ED1C9BEA5 [libpthread.so.0+0x7ea5])
4XENATIVESTACK               clone+0x6d (0x00007F1ED15AF8CD [libc.so.6+0xfe8cd])
(gdb) where
#0  0x00007f1ed1ca0aa1 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00007f1ecb8311bd in omrdump_create () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9prt29.so
#2  0x00007f1ec4e8b412 in doSystemDump () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#3  0x00007f1ec4e872a5 in protectedDumpFunction () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#4  0x00007f1ecb815283 in omrsig_protect () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9prt29.so
#5  0x00007f1ec4e8a89b in runDumpFunction () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#6  0x00007f1ec4e8aa1c in runDumpAgent () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#7  0x00007f1ec4ea115b in triggerDumpAgents () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#8  0x00007f1ec4e9f606 in rasDumpHookVmShutdown () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9dmp29.so
#9  0x00007f1ecba5b318 in J9HookDispatch () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9hookable29.so
#10 0x00007f1ecbd0b558 in protectedDestroyJavaVM () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9vm29.so
#11 0x00007f1ecb815283 in omrsig_protect () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9prt29.so
#12 0x00007f1ecbd0a9d9 in DestroyJavaVM () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libj9vm29.so
#13 0x00007f1ed0559efa in DestroyJavaVM () from /mnt/disk2/localinstall/jdk-11.0.4+11/lib/compressedrefs/libjvm.so
#14 0x00007f1ed1a87e14 in JavaMain () from /mnt/disk2/localinstall/jdk-11.0.4+11/bin/../lib/jli/libjli.so
#15 0x00007f1ed1c9bea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f1ed15af8cd in clone () from /lib64/libc.so.6

Diagnostic files

javacore.20191206.134206.6037.0002.txt

OutOfMemoryError: Java Heap Space

ninja- commented 3 years ago

hmm I am experiencing the same...thought this was an AdoptOpenJDK bug...

ninja- commented 3 years ago

https://github.com/AdoptOpenJDK/openjdk-build/issues/2245#issuecomment-731505186

ninja- commented 3 years ago

this doesn't seem that complicated to fix...the relevant file is here: https://github.com/eclipse/omr/blob/4c5184800ef8c4b005c9611ab87181db5dc036b4/port/linux/omrosbacktrace_impl.c#L238

and it would need to find and open .debuginfo file and find the data there

ninja- commented 3 years ago

ok after digging around it seems like we shouldn't need debuginfo files and normal binary should be enough to resolve symbol names without source code filenames and lines... I think the proper fix is to pass -rdynamic to linker like it's currently done when running the tests...

pshipton commented 3 years ago

Seems worth trying, assigning to the next milestone.

pshipton commented 3 years ago

@tajila @keithc-ca @andrewcraik @fjeremic @gita-omr fyi

pshipton commented 3 years ago

@dnakamura fyi

dnakamura commented 3 years ago

Should be a fairly simple change, however IIRC specifying -rdynamic does have performance implications, although I don't recall if they were significant or not

ninja- commented 3 years ago

@dnakamura based on this https://stackoverflow.com/a/12636790 I think we should be fine. btw, do you have any idea how the prelink tool mentioned there could play along with JVM build? probably wouldn't work?

dnakamura commented 3 years ago

I'll run some perf tests with the changes.

dnakamura commented 3 years ago

After running the perf benchmarks it looks like startup time is unchanged however, I'm seeing approx. a 1% increase in footprint (+/- 0.5%). I'm not sure if this is significant enough for us to care

ninja- commented 3 years ago

@dnakamura would you mind pushing the change somewhere? Would be great if I could make my build with adoptopenjdk scripts and already ship it on production, these symbols would be helpful

ninja- commented 3 years ago

did you verify the symbols are there after creating a dump? also, are they mangled or unmangled? we could add c++ unmangling to make it nicer :)

pshipton commented 3 years ago

I'm seeing approx. a 1% increase in footprint (+/- 0.5%). I'm not sure if this is significant enough for us to care

Seems a worthwhile tradeoff. @vijaysun-omr fyi

vijaysun-omr commented 3 years ago

I don't have performance concerns if the overhead is indeed as small as you mentioned. If you wanted to be doubly certain, perhaps @mpirvu can run a quick test.

mpirvu commented 3 years ago

I'll run start-up tests provided I get two sdks, with and without the change.

ninja- commented 3 years ago

where are we with this issue?

fjeremic commented 3 years ago

where are we with this issue?

I'd also like to see @dnakamura's prototype land. The footprint gain is minimal if not in the noise range while the benefit of seeing native backtraces will save quite a bit of developer time when looking at problems.

dnakamura commented 3 years ago

My changes can be seen here https://github.com/dnakamura/openj9/tree/rdynamic_test . By default the vm is still built without the -rdynamic flag but it can be enabled by adding -DJ9VM_USE_RDYNAMIC=ON to EXTRA_CMAKE_OPTIONS

ninja- commented 3 years ago

thanks. I already have some good scripts to compile from source to production so I might try your branch next month :) but still, I think we should just ship it and enable by default.

pshipton commented 3 years ago

Creating some Linux builds with https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/14/ and without https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/13 the -rdynamic option.

pshipton commented 3 years ago

@mpirvu the builds in the previous comment are ready for you.

mpirvu commented 3 years ago

I ran two Liberty start-up benchmarks, AcmeAir on an ivybridge machine and DT7 on a skylake machine and found no footprint difference between the two builds provided by @pshipton. Hence, we should enable the -rdynamic option, at least on x86-64.

pshipton commented 3 years ago

Sorry, I just realized I messed up by building jdk8 instead of jdk11. The change only affects jdk11 since it's cmake related. Started new builds.

with https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/16/ without https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/15/

pshipton commented 3 years ago

@mpirvu the jdk11 builds are completed.

mpirvu commented 3 years ago

I only see a 0.4% footprint regression which is as big as the 95% confidence interval, so it's safe to go ahead with the change.

Results for JDK=/home/mpirvu/sdks/rdynamic/on jvmOpts=-Xcompressedrefs -Xmx256m StartupTime avg=1707 min=1555 max=1913 stdDev=61.2 maxVar=23.0% confInt=0.47% samples=160 Footprint avg=131569 min=122824 max=141580 stdDev=3172.0 maxVar=15.3% confInt=0.31% samples=160 CThreadTime avg=1399 min=1097 max=1912 stdDev=188.9 maxVar=74.3% confInt=1.76% samples=160 ProcessTime avg=3949 min=3580 max=4620 stdDev=197.9 maxVar=29.1% confInt=0.65% samples=160

Results for JDK=/home/mpirvu/sdks/rdynamic/off jvmOpts=-Xcompressedrefs -Xmx256m StartupTime avg=1701 min=1564 max=1932 stdDev=65.1 maxVar=23.5% confInt=0.50% samples=160 Footprint avg=131055 min=124072 max=139252 stdDev=3144.9 maxVar=12.2% confInt=0.31% samples=160 CThreadTime avg=1387 min=1111 max=1969 stdDev=201.3 maxVar=77.2% confInt=1.89% samples=160 ProcessTime avg=3933 min=3540 max=4760 stdDev=225.8 maxVar=34.5% confInt=0.75% samples=160

pshipton commented 3 years ago

@jdmpapin do you want to create the PR. we want -rdynamic enabled by default.

jdmpapin commented 3 years ago

@jdmpapin do you want to create the PR. we want -rdynamic enabled by default.

Did you mean @dnakamura?

pshipton commented 3 years ago

Yes, sorry.

keithc-ca commented 2 years ago

The change doesn't appear to be working. Available symbols seem to be limited to those exported by a shared library, and, because we're explicitly naming the exported symbols, -rdynamic doesn't appear to have any effect.

In the latest release,

IBM Semeru Runtime Open Edition 11.0.14.0 (build 11.0.14+9)
Eclipse OpenJ9 VM 11.0.14.0 (build openj9-0.30.0, JRE 11 Linux amd64-64-Bit Compressed References 20220128_369 (JIT enabled, AOT enabled)
OpenJ9   - 9dccbe076
OMR      - dac962a28
JCL      - 01f86654ee based on jdk-11.0.14+9)

the VM exports 284 symbols (as reported via nm -D lib/default/libjvm.so | grep -w T | wc -l) which is the same as when -rdynamic is not used.

I plan to explore using dl_iterate_phdr(3) after discovering that neither dladdr1(..., RTLD_DL_SYMENT) nor backtrace_symbols(3) appeared to fare any better.

keithc-ca commented 2 years ago

I have it working on (at least X) Linux. Here's an excerpt for -Xdump:java:events=vmstop, which seems to agree well with gdb examing a core file triggered at the same time:

1XMTHDINFO     Thread Details
NULL
3XMTHREADINFO      "DestroyJavaVM helper thread" J9VMThread:0x0000000000016F00, omrthread_t:0x00007FFFF0007740, java/lang/Thread:0x00000000FFF8F1A8, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x13, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x567F, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00001060)
3XMTHREADINFO2            (native stack address range from:0x00007FFFF7F99000, to:0x00007FFFF7FDA000, size:0x41000)
3XMCPUTIME               CPU usage total: 0.182497549 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=4096 (0x1000)
3XMTHREADINFO3           No Java callstack associated with this thread
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               protectedBacktrace+0x12 (0x00007FFFF5D0B082 [libj9prt29.so+0x26082])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               omrintrospect_backtrace_thread_raw+0xbf (0x00007FFFF5D0B48F [libj9prt29.so+0x2648f])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               omrintrospect_backtrace_thread+0x70 (0x00007FFFF5D0AF40 [libj9prt29.so+0x25f40])
4XENATIVESTACK               setup_native_thread+0x1d2 (0x00007FFFF5D0C2C2 [libj9prt29.so+0x272c2])
4XENATIVESTACK               omrintrospect_threads_startDo_with_signal+0x474 (0x00007FFFF5D0D064 [libj9prt29.so+0x28064])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               _ZN18JavaCoreDumpWriter28writeThreadsWithNativeStacksEv+0xa46 (0x00007FFFF5761126 [libj9dmp29.so+0x1b126])
4XENATIVESTACK               protectedWriteThreadsWithNativeStacks+0xd (0x00007FFFF57612BD [libj9dmp29.so+0x1b2bd])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               _ZN18JavaCoreDumpWriter18writeThreadSectionEv+0x14d (0x00007FFFF575D8CD [libj9dmp29.so+0x178cd])
4XENATIVESTACK               protectedWriteSection+0x1d (0x00007FFFF575898D [libj9dmp29.so+0x1298d])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               _ZN18JavaCoreDumpWriterC2EPKcP16J9RASdumpContextP14J9RASdumpAgent+0x41f (0x00007FFFF5759D6F [libj9dmp29.so+0x13d6f])
4XENATIVESTACK               runJavadump+0x1c (0x00007FFFF5763BEC [libj9dmp29.so+0x1dbec])
4XENATIVESTACK               doJavaDump+0x62 (0x00007FFFF574AA02 [libj9dmp29.so+0x4a02])
4XENATIVESTACK               protectedDumpFunction+0x15 (0x00007FFFF574A0B5 [libj9dmp29.so+0x40b5])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               runDumpFunction+0x6b (0x00007FFFF574D6DB [libj9dmp29.so+0x76db])
4XENATIVESTACK               runDumpAgent+0x15f (0x00007FFFF574D86F [libj9dmp29.so+0x786f])
4XENATIVESTACK               triggerDumpAgents+0x34e (0x00007FFFF576603E [libj9dmp29.so+0x2003e])
4XENATIVESTACK               rasDumpHookVmShutdown+0x76 (0x00007FFFF5764296 [libj9dmp29.so+0x1e296])
4XENATIVESTACK               J9HookDispatch+0x12e (0x00007FFFF630F36E [libj9hookable29.so+0x136e])
4XENATIVESTACK               protectedDestroyJavaVM+0x21b (0x00007FFFF677B3DB [libj9vm29.so+0x4c3db])
4XENATIVESTACK               omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK               DestroyJavaVM+0x1b9 (0x00007FFFF677A7D9 [libj9vm29.so+0x4b7d9])
4XENATIVESTACK               DestroyJavaVM+0x1a (0x00007FFFF6D507FA [libjvm.so+0x147fa])
4XENATIVESTACK               JavaMain+0x3f4 (0x00007FFFF7BC7234 [libjli.so+0x5234])
4XENATIVESTACK               ThreadJavaMain+0x9 (0x00007FFFF7BCB749 [libjli.so+0x9749])
4XENATIVESTACK               start_thread+0xdb (0x00007FFFF71986DB [libpthread.so.0+0x76db])
4XENATIVESTACK               clone+0x3f (0x00007FFFF78F271F [libc.so.6+0x12171f])