Open tajila opened 1 month ago
Issue Number: 20310 Status: Open Recommended Components: comp:vm, comp:gc, comp:build Recommended Assignees: dmitripivkine, babsingh, tajila
Use a standard tool like valgrind to get an understanding of what contributes to the memory footprint.
I have already collected data with smaps, so if using valgrind is unsuccessful or too complex I can give you this data.
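For reference, here is a minimal sketch of how smaps data could be reduced to a single RSS figure. It assumes the usual /proc/<pid>/smaps layout, where each mapping is followed by lines such as "Rss:   12 kB". The function name sum_smaps_field_kb is made up for illustration and is not part of OpenJ9 or OMR.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum every "<field>: <n> kB" line found in smaps-formatted text.
 * Returns the total in kB. The field must start a line and be followed
 * directly by ':' so that "Rss" does not match "Pss" or "SwapPss".
 * Hypothetical helper, for illustration only.
 */
static unsigned long sum_smaps_field_kb(const char *text, const char *field)
{
    unsigned long total = 0;
    size_t field_len = strlen(field);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, field, field_len) == 0 && line[field_len] == ':') {
            unsigned long kb = 0;
            /* %lu skips the padding spaces before the number */
            if (sscanf(line + field_len + 1, "%lu", &kb) == 1) {
                total += kb;
            }
        }
        /* Advance to the character after the next newline, if any */
        line = strchr(line, '\n');
        if (line != NULL) {
            line++;
        }
    }
    return total;
}
```

To use it, read the contents of /proc/<pid>/smaps into a buffer at first response and pass "Rss" as the field. Passing "Pss" instead gives the proportional figure if shared mappings should be discounted.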
Attempt to install a custom malloc/free to track allocations by adding an allocator lib to the LD_PRELOAD path.
I'm not sure how effective this would be. Our memory allocator APIs (j9mem_allocate_memory and the like) use malloc internally, so any custom tracking with an LD_PRELOADed library would be a catch-all for those allocations as well.
I think the way to do this is to set up a wrapper (I guess similar to this) that forwards all malloc and free calls (and other allocation functions if there are any) to the corresponding j9mem function, then forward a specifying flag (something like J9MEM_CATEGORY_VM_JCL_NATIVES) that we can match on in our j9mem (omrmem) functions to set up the list mechanism.
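A rough sketch of that wrapper idea, assuming a header placed in front of each block that carries a category tag and links the block into a list. Every name here (tracked_malloc, tracked_free, TRACK_CATEGORY_JCL_NATIVES) is hypothetical and only stands in for the real j9mem/omrmem functions and category constants. The sketch is not thread-safe; a real version would need a lock around the list.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag, standing in for something like J9MEM_CATEGORY_VM_JCL_NATIVES */
enum { TRACK_CATEGORY_JCL_NATIVES = 1 };

/* Header stored in front of every tracked block. The union with
 * max_align_t keeps the payload that follows suitably aligned. */
typedef union TrackedHeader TrackedHeader;
union TrackedHeader {
    struct {
        TrackedHeader *prev;
        TrackedHeader *next;
        size_t size;      /* payload size requested by the caller */
        int category;     /* tag the allocator can match on */
    } info;
    max_align_t align;
};

static TrackedHeader *tracked_head = NULL;

/* malloc replacement: allocates header + payload and links it at the list head */
void *tracked_malloc(size_t size, int category)
{
    TrackedHeader *hdr;

    if (size > SIZE_MAX - sizeof(TrackedHeader)) {
        return NULL;
    }
    hdr = malloc(sizeof(TrackedHeader) + size);
    if (hdr == NULL) {
        return NULL;
    }
    hdr->info.size = size;
    hdr->info.category = category;
    hdr->info.prev = NULL;
    hdr->info.next = tracked_head;
    if (tracked_head != NULL) {
        tracked_head->info.prev = hdr;
    }
    tracked_head = hdr;
    return hdr + 1; /* payload starts right after the header */
}

/* free replacement: unlinks the block from the list, then releases it */
void tracked_free(void *ptr)
{
    TrackedHeader *hdr;

    if (ptr == NULL) {
        return;
    }
    hdr = (TrackedHeader *)ptr - 1;
    if (hdr->info.prev != NULL) {
        hdr->info.prev->info.next = hdr->info.next;
    } else {
        tracked_head = hdr->info.next;
    }
    if (hdr->info.next != NULL) {
        hdr->info.next->info.prev = hdr->info.prev;
    }
    free(hdr);
}

/* Walk the list and total the live payload bytes for one category */
size_t tracked_bytes_in_category(int category)
{
    size_t total = 0;
    const TrackedHeader *hdr;

    for (hdr = tracked_head; hdr != NULL; hdr = hdr->info.next) {
        if (hdr->info.category == category) {
            total += hdr->info.size;
        }
    }
    return total;
}
```

The same list walk could later drive the disclaim pass, since each header records where its block starts and how large it is.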
Are we still wanting to push this forward too? i.e. disclaiming the ClassLoader segments on checkpoint? Are there any other memory segment lists that we could be traversing and madviseing out?
I think the way to do this is to set up a wrapper (I guess similar to this) that forwards all malloc and free calls (and other allocation functions if there are any) to the corresponding j9mem function, then forward a specifying flag (something like J9MEM_CATEGORY_VM_JCL_NATIVES) that we can match on in our j9mem (omrmem) functions to set up the list mechanism.
Would this wrapper mechanism work for calls to malloc/free that are made outside J9 code? possibly even before the j9vm29.so is linked?
Are we still wanting to push this forward too? i.e. disclaiming the ClassLoader segments on checkpoint?
@ymanton has been working on this.
Are there any other memory segment lists that we could be traversing and madviseing out?
The most obvious candidates have already been addressed.
@ymanton has been working on this.
I'm currently doing performance testing to figure out how the rate of periodic disclaim affects throughput on various Liberty-based benchmarks. In addition to JCL I hacked up a quick prototype to disclaim SCC memory.
Would this wrapper mechanism work for calls to malloc/free that are made outside J9 code? possibly even before the j9vm29.so is linked?
It would work for whatever we link it in with. If we are targeting the JCL native allocations, we would just need to link to the overrides when compiling the JCL native code. We'd obviously need to have the object file (and .so module containing the j9mem/omrmem functions) compiled before linking the JCL native code.
made outside J9 code?
Do you mean outside the extension repo + OpenJ9 + OMR code? If so then no, it would not by default pick up all malloc requests. Are we wanting to target mallocs outside of the JCL native code (or OpenJ9) too?
It would work for whatever we link it in with. If we are targeting the JCL native allocations, we would just need to link to the overrides when compiling the JCL native code. We'd obviously need to have the object file (and .so module containing the j9mem/omrmem functions) compiled before linking the JCL native code.
Sounds good. I'm fairly certain we build OpenJ9 before building the JCL natives.
Do you mean outside the extension repo + OpenJ9 + OMR code? If so then no, it would not by default pick up all malloc requests. Are we wanting to target mallocs outside of the JCL native code (or OpenJ9) too?
I just mean outside OpenJ9 + OMR. I'm mostly concerned about tracking/disclaiming allocations in the extensions repo since this is the biggest unknown at the moment. It would be nice to eventually track/disclaim allocations done with libc if they are significant (which is why I suggested the preload approach), but we can revisit that after taking a look at allocations in the extensions repo.
The goal of this item is to capture and understand what contributes to JVM memory footprint, then assess the impact of periodically disclaiming this memory from a footprint and throughput perspective. Much of this work has already been accomplished for JIT code cache, JIT data cache and VM class memory.
There is still a substantial amount of memory that is unaccounted for, predominantly driven by memory allocations done in JCL natives. The main challenge is that JCL natives use APIs like malloc/free directly, so there is no way for us to track the usage like we do with j9mem_allocate_memory, which adds callsite info that can be tracked. Furthermore, because we can't track these allocations, we also can't disclaim this memory.
1) Use a standard tool like valgrind to get an understanding of what contributes to the memory footprint. We are really only concerned with RSS, and we would like to know what this looks like at first response for a Liberty restCrud/pingperf application (https://github.ibm.com/wasperf/restCrud, https://github.ibm.com/wasperf/pingperf). Ideally, this would be done with checkpoint/restore, but if there are issues getting this to work with valgrind then a normal run will suffice.
2) Attempt to install a custom malloc/free to track allocations by adding an allocator lib to the LD_PRELOAD path.
3) Once we can track allocations, we can attempt to disclaim malloc'ed memory periodically. We will need a mechanism to register all malloc'ed callsites and add them to a list. Then we will need a function that runs through the list and attempts to disclaim (i.e. madvise(PAGE_OUT,...)), similar to https://github.com/eclipse-openj9/openj9/pull/19609/files#diff-8b0a325c36e4697805c60ce54cd6e06ea8ce46d3cb26659b4b84e509ab2a6e68R463
The JIT already has a mechanism for periodic disclaim, so we can piggy-back on this (HookedByTheJIT::memoryDisclaimLogic, around where disclaimDataCaches(crtElapsedTime); is called).