Closed zheddie closed 2 years ago
hmm I am experiencing the same...thought this was an AdoptOpenJDK bug...
this doesn't seem that complicated to fix...the relevant file is here: https://github.com/eclipse/omr/blob/4c5184800ef8c4b005c9611ab87181db5dc036b4/port/linux/omrosbacktrace_impl.c#L238
and it would need to find and open .debuginfo file and find the data there
ok after digging around it seems like we shouldn't need debuginfo files and normal binary should be enough to resolve symbol names without source code filenames and lines...
I think the proper fix is to pass -rdynamic
to linker like it's currently done when running the tests...
Seems worth trying, assigning to the next milestone.
@tajila @keithc-ca @andrewcraik @fjeremic @gita-omr fyi
@dnakamura fyi
Should be a fairly simple change, however IIRC specifying -rdynamic
does have performance implications, although I don't recall if they were significant or not
@dnakamura based on this https://stackoverflow.com/a/12636790 I think we should be fine. btw, do you have any idea how the prelink
tool mentioned there could play along with JVM build? probably wouldn't work?
I'll run some perf tests with the changes.
After running the perf benchmarks it looks like startup time is unchanged however, I'm seeing approx. a 1% increase in footprint (+/- 0.5%). I'm not sure if this is significant enough for us to care
@dnakamura would you mind pushing the change somewhere? Would be great if I could make my build with adoptopenjdk scripts and already ship it on production, these symbols would be helpful
did you verify the symbols are there after creating a dump? also, are they mangled or unmangled? we could add c++ unmangling to make it nicer :)
I'm seeing approx. a 1% increase in footprint (+/- 0.5%). I'm not sure if this is significant enough for us to care
Seems a worthwhile tradeoff. @vijaysun-omr fyi
I don't have performance concerns if the overhead is indeed as small as you mentioned. If you wanted to be doubly certain, perhaps @mpirvu can run a quick test.
I'll run start-up tests provided I get two sdks, with and without the change.
where are we with this issue?
where are we with this issue?
I'd also like to see @dnakamura's prototype land. The footprint gain is minimal if not in the noise range while the benefit of seeing native backtraces will save quite a bit of developer time when looking at problems.
My changes can be seen here https://github.com/dnakamura/openj9/tree/rdynamic_test . By default the vm is still built without the -rdynamic
flag but it can be enabled by adding -DJ9VM_USE_RDYNAMIC=ON
to EXTRA_CMAKE_OPTIONS
thanks. I already have some good scripts to compile from source to production so I might try your branch next month :) but still, I think we should just ship it and enable by default.
Creating some Linux builds with https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/14/
and without https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/13
the -rdynamic
option.
@mpirvu the builds in the previous comment are ready for you.
I ran two Liberty start-up benchmarks, AcmeAir on an ivybridge machine and DT7 on a skylake machine and found no footprint difference between the two builds provided by @pshipton. Hence, we should enable the -rdynamic option, at least on x86-64.
Sorry, I just realized I messed up by building jdk8 instead of jdk11. The change only affects jdk11 since it's cmake related. Started new builds.
with https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/16/ without https://ci.eclipse.org/openj9/job/Pipeline-Build-Test-Personal/15/
@mpirvu the jdk11 builds are completed.
I only see a 0.4% footprint regression which is as big as the 95% confidence interval, so it's safe to go ahead with the change.
Results for JDK=/home/mpirvu/sdks/rdynamic/on jvmOpts=-Xcompressedrefs -Xmx256m StartupTime avg=1707 min=1555 max=1913 stdDev=61.2 maxVar=23.0% confInt=0.47% samples=160 Footprint avg=131569 min=122824 max=141580 stdDev=3172.0 maxVar=15.3% confInt=0.31% samples=160 CThreadTime avg=1399 min=1097 max=1912 stdDev=188.9 maxVar=74.3% confInt=1.76% samples=160 ProcessTime avg=3949 min=3580 max=4620 stdDev=197.9 maxVar=29.1% confInt=0.65% samples=160
Results for JDK=/home/mpirvu/sdks/rdynamic/off jvmOpts=-Xcompressedrefs -Xmx256m StartupTime avg=1701 min=1564 max=1932 stdDev=65.1 maxVar=23.5% confInt=0.50% samples=160 Footprint avg=131055 min=124072 max=139252 stdDev=3144.9 maxVar=12.2% confInt=0.31% samples=160 CThreadTime avg=1387 min=1111 max=1969 stdDev=201.3 maxVar=77.2% confInt=1.89% samples=160 ProcessTime avg=3933 min=3540 max=4760 stdDev=225.8 maxVar=34.5% confInt=0.75% samples=160
@jdmpapin do you want to create the PR. we want -rdynamic
enabled by default.
@jdmpapin do you want to create the PR. we want
-rdynamic
enabled by default.
Did you mean @dnakamura?
Yes, sorry.
The change doesn't appear to be working. Available symbols seem to be limited to those exported by a shared library, and, because we're explicitly naming the exported symbols, -rdynamic
doesn't appear to have any effect.
In the latest release,
IBM Semeru Runtime Open Edition 11.0.14.0 (build 11.0.14+9)
Eclipse OpenJ9 VM 11.0.14.0 (build openj9-0.30.0, JRE 11 Linux amd64-64-Bit Compressed References 20220128_369 (JIT enabled, AOT enabled)
OpenJ9 - 9dccbe076
OMR - dac962a28
JCL - 01f86654ee based on jdk-11.0.14+9)
the VM exports 284 symbols (as reported via nm -D lib/default/libjvm.so | grep -w T | wc -l
) which is the same as when -rdynamic
is not used.
I plan to explore using dl_iterate_phdr(3)
after discovering that neither dladdr1(..., RTLD_DL_SYMENT)
nor backtrace_symbols(3)
appeared to fare any better.
I have it working on (at least X) Linux. Here's an excerpt for -Xdump:java:events=vmstop
, which seems to agree well with gdb examing a core file triggered at the same time:
1XMTHDINFO Thread Details
NULL
3XMTHREADINFO "DestroyJavaVM helper thread" J9VMThread:0x0000000000016F00, omrthread_t:0x00007FFFF0007740, java/lang/Thread:0x00000000FFF8F1A8, state:R, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x13, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0x567F, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00001060)
3XMTHREADINFO2 (native stack address range from:0x00007FFFF7F99000, to:0x00007FFFF7FDA000, size:0x41000)
3XMCPUTIME CPU usage total: 0.182497549 secs, current category="Application"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=4096 (0x1000)
3XMTHREADINFO3 No Java callstack associated with this thread
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK protectedBacktrace+0x12 (0x00007FFFF5D0B082 [libj9prt29.so+0x26082])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK omrintrospect_backtrace_thread_raw+0xbf (0x00007FFFF5D0B48F [libj9prt29.so+0x2648f])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK omrintrospect_backtrace_thread+0x70 (0x00007FFFF5D0AF40 [libj9prt29.so+0x25f40])
4XENATIVESTACK setup_native_thread+0x1d2 (0x00007FFFF5D0C2C2 [libj9prt29.so+0x272c2])
4XENATIVESTACK omrintrospect_threads_startDo_with_signal+0x474 (0x00007FFFF5D0D064 [libj9prt29.so+0x28064])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK _ZN18JavaCoreDumpWriter28writeThreadsWithNativeStacksEv+0xa46 (0x00007FFFF5761126 [libj9dmp29.so+0x1b126])
4XENATIVESTACK protectedWriteThreadsWithNativeStacks+0xd (0x00007FFFF57612BD [libj9dmp29.so+0x1b2bd])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK _ZN18JavaCoreDumpWriter18writeThreadSectionEv+0x14d (0x00007FFFF575D8CD [libj9dmp29.so+0x178cd])
4XENATIVESTACK protectedWriteSection+0x1d (0x00007FFFF575898D [libj9dmp29.so+0x1298d])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK _ZN18JavaCoreDumpWriterC2EPKcP16J9RASdumpContextP14J9RASdumpAgent+0x41f (0x00007FFFF5759D6F [libj9dmp29.so+0x13d6f])
4XENATIVESTACK runJavadump+0x1c (0x00007FFFF5763BEC [libj9dmp29.so+0x1dbec])
4XENATIVESTACK doJavaDump+0x62 (0x00007FFFF574AA02 [libj9dmp29.so+0x4a02])
4XENATIVESTACK protectedDumpFunction+0x15 (0x00007FFFF574A0B5 [libj9dmp29.so+0x40b5])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK runDumpFunction+0x6b (0x00007FFFF574D6DB [libj9dmp29.so+0x76db])
4XENATIVESTACK runDumpAgent+0x15f (0x00007FFFF574D86F [libj9dmp29.so+0x786f])
4XENATIVESTACK triggerDumpAgents+0x34e (0x00007FFFF576603E [libj9dmp29.so+0x2003e])
4XENATIVESTACK rasDumpHookVmShutdown+0x76 (0x00007FFFF5764296 [libj9dmp29.so+0x1e296])
4XENATIVESTACK J9HookDispatch+0x12e (0x00007FFFF630F36E [libj9hookable29.so+0x136e])
4XENATIVESTACK protectedDestroyJavaVM+0x21b (0x00007FFFF677B3DB [libj9vm29.so+0x4c3db])
4XENATIVESTACK omrsig_protect+0x1e3 (0x00007FFFF5D0F743 [libj9prt29.so+0x2a743])
4XENATIVESTACK DestroyJavaVM+0x1b9 (0x00007FFFF677A7D9 [libj9vm29.so+0x4b7d9])
4XENATIVESTACK DestroyJavaVM+0x1a (0x00007FFFF6D507FA [libjvm.so+0x147fa])
4XENATIVESTACK JavaMain+0x3f4 (0x00007FFFF7BC7234 [libjli.so+0x5234])
4XENATIVESTACK ThreadJavaMain+0x9 (0x00007FFFF7BCB749 [libjli.so+0x9749])
4XENATIVESTACK start_thread+0xdb (0x00007FFFF71986DB [libpthread.so.0+0x76db])
4XENATIVESTACK clone+0x3f (0x00007FFFF78F271F [libc.so.6+0x12171f])
Java -version output
openjdk version "11.0.4" 2019-07-16 OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.4+11) Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.15.1, JRE 11 Linux amd64-64-Bit Compressed References 20190717_286 (JIT enabled, AOT enabled) OpenJ9 - 0f66c6431 OMR - ec782f26 JCL - fa49279450 based on jdk-11.0.4+11)
OS
Summary of problem
java -Xdump:java+system:events=vmstop zHello The stack in javacore file is as below. Most of the frame do not have symbol name . While we can get them from the system core dump as below:
Diagnostic files
javacore.20191206.134206.6037.0002.txt
OutOfMemoryError: Java Heap Space