Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
A customer running under FSD (Full Speed Debug) on Power with an agent reported 'hangs'. Investigation found that the system was not hung but running very slowly: after the agent late-attaches and performs a few passes of class redefinition, all methods run in the interpreter. On redefinition under FSD the code cache is cleared, and every segment allocated at that time is subsequently marked as full due to lack of trampoline space. After the cache is emptied the segments are supposed to be empty, but we incorrectly reset the trampoline pointers to the end position instead of the start position. The problem was exacerbated because documentation online recommends 32MB segments, and the customer had followed that advice. This made it more likely that all of the segments had already been allocated by the time the agent attached and redefined classes. As a result we emptied the code cache, marked all segments full, and were subsequently unable to recompile any code.
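To illustrate the failure mode, here is a minimal toy model in Python. It is not the OpenJ9/OMR implementation: the class, the field names, and the assumption that trampoline space is handed out from the start position towards the end position are all hypothetical simplifications. It only shows why resetting the pointer to the end leaves a freshly emptied segment with no trampoline space.

```python
class Segment:
    """Toy model of one code cache segment's trampoline area (hypothetical names)."""

    def __init__(self, tramp_start, tramp_end):
        self.tramp_start = tramp_start  # first usable trampoline slot
        self.tramp_end = tramp_end      # one past the last usable slot
        self.tramp_mark = tramp_start   # next free slot (assumed to grow start -> end)
        self.full = False

    def reserve_trampoline(self):
        """Reserve one slot; mark the segment full when none remain."""
        if self.tramp_mark >= self.tramp_end:
            self.full = True
            return False
        self.tramp_mark += 1
        return True

    def reset_buggy(self):
        """Empty the segment, but leave the pointer at the end position."""
        self.full = False
        self.tramp_mark = self.tramp_end

    def reset_fixed(self):
        """Empty the segment and return the pointer to the start position."""
        self.full = False
        self.tramp_mark = self.tramp_start


def compile_after_redefinition(reset):
    """Return True if a trampoline can still be reserved after the cache is emptied."""
    seg = Segment(tramp_start=0, tramp_end=4)
    seg.reserve_trampoline()         # some code was compiled before the agent attached
    reset(seg)                       # class redefinition under FSD empties the cache
    return seg.reserve_trampoline()  # can the JIT use this segment again?


if __name__ == "__main__":
    print("buggy reset allows recompilation:", compile_after_redefinition(Segment.reset_buggy))
    print("fixed reset allows recompilation:", compile_after_redefinition(Segment.reset_fixed))
```

In this model the buggy reset causes the very first reservation after redefinition to fail and marks the segment full. If every segment has already been allocated, no segment is left to compile into.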
Diagnostic files
The problem can be readily reproduced on a Power system with a suitable agent installed. Enabling JIT verbose logging with the hooks, codecache and compilePerf* options shows a sequence such as:
#HK: vmThread=00000000004BB100 hook jitClassesRedefined
#CODECACHE: CodeCache 00007D4FB400AA30 marked as full in reserveSpaceForTrampoline
#PERF: t= 9608 <WARNING: JIT CACHES FULL> Disable further compilation
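When checking a large verbose log for this signature, a small script can help. The sketch below assumes the log is plain text with lines in the format shown above. The function name is hypothetical and the matching is deliberately loose: it only looks for the three markers in order.

```python
def shows_full_cache_after_redefinition(lines):
    """Return True if a class redefinition hook is followed by a segment being
    marked full for trampoline space, and then by the caches-full warning."""
    saw_redefinition = False
    saw_cache_full = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("#HK:") and "jitClassesRedefined" in line:
            saw_redefinition = True
        elif (saw_redefinition and line.startswith("#CODECACHE:")
              and "marked as full in reserveSpaceForTrampoline" in line):
            saw_cache_full = True
        elif saw_cache_full and line.startswith("#PERF:") and "JIT CACHES FULL" in line:
            return True
    return False


# The three lines from the verbose log excerpt in this report.
SAMPLE = [
    "#HK: vmThread=00000000004BB100 hook jitClassesRedefined",
    "#CODECACHE: CodeCache 00007D4FB400AA30 marked as full in reserveSpaceForTrampoline",
    "#PERF: t= 9608 <WARNING: JIT CACHES FULL> Disable further compilation",
]

if __name__ == "__main__":
    print(shows_full_cache_after_redefinition(SAMPLE))
```

To scan a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.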
Java -version output
pap6480sr8fp30-20240801_01
A standard testcase, where higher numbers are better, shows an extreme case of the benefit of this change:
Before change:
After change: