eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Memory Leak when using IBM J11.0.14 Liberty L22.0.0.3 CICS 6.1 z/OS 2.5 #15018

Closed SimonMcNab closed 2 years ago

SimonMcNab commented 2 years ago

Java -version output

java version "11.0.14.1" 2022-02-08
IBM Semeru Runtime Certified Edition for z/OS 11.0.14.1 (build 11.0.14.1+1)
IBM J9 VM 11.0.14.1 (build z/OS-Release-11.0.14.1-b01, JRE 11 z/OS s390x-64-Bit Compressed References 20220315_174 (JIT enabled, AOT enabled)
OpenJ9 - 251e7679cf3
OMR - 59845b76b8a
IBM - 3c151c1
JCL - 090eaebc57d based on jdk-11.0.14.1+1)

Summary of problem

When performance testing a workload with IBM J11.0.14, Liberty L22.0.0.3, CICS 6.1, and z/OS 2.5, we observe a steady increase in real storage over the life of the workload.

We do not see any OutOfMemoryError.

The heap is around 90% free.

This is not observed with Java version "1.8.0_321"; all other test conditions are identical.

Diagnostic files (NB these are EBCDIC)

javacores etc.zip

The files in the .zip were triggered manually at 4-minute intervals

dmitripivkine commented 2 years ago

I will triage this

dmitripivkine commented 2 years ago

From the javacore Native Memory sections there is growth for the JVM (across four sequential snapshots):

1MEMUSER       JRE: 2,018,238,271 bytes / 505915 allocations
1MEMUSER       JRE: 2,081,218,615 bytes / 569520 allocations <--- +62,980,344
1MEMUSER       JRE: 2,128,697,935 bytes / 616789 allocations <--- +47,479,320
1MEMUSER       JRE: 2,174,964,967 bytes / 662768 allocations <--- +46,267,032

With slight variations across different parts of the JVM, it looks like the point of growth is JNI:

3MEMUSER       |  +--JNI: 489,585,592 bytes / 488994 allocations
3MEMUSER       |  +--JNI: 553,793,864 bytes / 552843 allocations <--- +64,208,272
3MEMUSER       |  +--JNI: 601,267,928 bytes / 600109 allocations <--- +47,474,064
3MEMUSER       |  +--JNI: 647,479,792 bytes / 646073 allocations <--- +46,211,864

@pshipton @tajila Would you please take a look and advise on next steps?

pshipton commented 2 years ago

The app can be run with -Xcheck:memory:quick,noscan,callsite=99999, which outputs to stderr a summary of the memory allocations every 99999 allocations. The 99999 is arbitrary: it can be raised if there is too much output, or lowered if there is not enough. By comparing successive memory allocation summaries we can determine which memory allocations are increasing.
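For reference, a sketch of where this option would go; the exact file for this environment is my assumption. In a standalone Liberty server it would typically be added to the server's jvm.options file, while for a CICS-managed JVM server it would be set in the JVM profile:

-Xcheck:memory:quick,noscan,callsite=99999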

There isn't much that allocates memory in the JNI category. The main files are jvminit.c, jnicsup.c, and jnimem.c; the file name and line number show up in the memory allocation summary. https://github.com/eclipse-openj9/openj9/search?q=J9MEM_CATEGORY_JNI

SimonMcNab commented 2 years ago

D20220509.T130527.dfhjvmerr.zip

Test with -Xcheck:memory:quick,noscan,callsite=99999 as requested

pshipton commented 2 years ago

jnimem.c:43 is climbing. Each line below is a different point in time: the first is the earliest snapshot, the last is the latest, and the ones in between are arbitrary points in time. In each interval more blocks are allocated than freed, so the high water mark keeps rising. https://github.com/eclipse-openj9/openj9/blob/master/runtime/vm/jnimem.c#L43

-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------------
 total alloc   | total freed   | delta alloc   | delta freed   | high water | largest
 blocks| bytes | blocks| bytes | blocks| bytes | blocks| bytes | blocks| bytes | bytes | num   | callsite
-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------------
     43    2562      14     765      43    2562      14     765      31    2013     192      38 /u/jenkins/workspace/Build_JDK11_s390x_zos_Release/openj9/runtime/vm/jnimem.c:43
 395839 30745447  393199 28551425   92729 7010831   92437 6734096    2669 2199782     976     129 /u/jenkins/workspace/Build_JDK11_s390x_zos_Release/openj9/runtime/vm/jnimem.c:43
2081853 157576865 2074275 150733587   93830 7085538   93519 6791192    7611 6850654     976     129 /u/jenkins/workspace/Build_JDK11_s390x_zos_Release/openj9/runtime/vm/jnimem.c:43
42374209 3188979893 42248169 3070975834   93966 7097779   93656 6802016  126070 118010523     976     129 /u/jenkins/workspace/Build_JDK11_s390x_zos_Release/openj9/runtime/vm/jnimem.c:43

joransiu commented 2 years ago

@SimonMcNab : Is TLS enabled on this Liberty server instance?

@alon-sh, any chance this could be related to the Digest memleak?

pshipton commented 2 years ago

Seems we may already know what this is, but likely there is some JNI code calling a function that allocates memory and never freeing it. For example GetStringChars() (should be paired with ReleaseStringChars()), GetStringUTFChars() (ReleaseStringUTFChars()), or Get<X>ArrayElements() where X is Boolean, Byte, Char, etc. (Release<X>ArrayElements()).
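As a concrete illustration of that pattern, here is a minimal JNI method in C. The class and method names are hypothetical, and this is a sketch of the general mistake rather than code from this workload:

#include <jni.h>
#include <stdio.h>

/* Hypothetical native method; the Java class com.example.Greeter is
 * invented for illustration. GetStringUTFChars() allocates native
 * memory that is charged to the JNI memory category until released. */
JNIEXPORT void JNICALL
Java_com_example_Greeter_greet(JNIEnv *env, jobject self, jstring name)
{
    const char *utf = (*env)->GetStringUTFChars(env, name, NULL);
    if (utf == NULL) {
        return; /* OutOfMemoryError has already been thrown */
    }
    printf("Hello, %s\n", utf);

    /* Without this call, every invocation leaks one allocation in the
     * JNI category, which is the kind of steady growth seen at
     * jnimem.c:43 in the -Xcheck:memory summaries above. */
    (*env)->ReleaseStringUTFChars(env, name, utf);
}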

pshipton commented 2 years ago

-Xcheck:jni should trace the memory leak.

alon-sh commented 2 years ago

@joransiu it's possible; another memory leak fix (for AES) was merged a few days ago. Try using the latest build of jceplus and see if it helps.

SimonMcNab commented 2 years ago

J11_Xcheck_jni.zip

Looks like https://www.ibm.com/support/pages/apar/PH45122

server.xml has the ssl-1.0 and transportSecurity-1.0 features enabled.
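For reference, a minimal sketch of how those features would typically be declared in server.xml; the surrounding configuration is omitted and assumed:

<featureManager>
    <feature>ssl-1.0</feature>
    <feature>transportSecurity-1.0</feature>
</featureManager>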

I'll test with the latest IBM internal J11 build

pshipton commented 2 years ago

I'll go ahead and close this then.