eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 721 forks source link

JRE 1.8 z/OS SR8 FP11 OOM with lots of heap space left #19948

Closed roded closed 3 weeks ago

roded commented 1 month ago

Java -version output

1CIJAVAVERSION JRE 1.8.0 z/OS s390x-64 (build 8.0.8.11 - pmz6480sr8fp11-20230901_01(SR8 FP11))
1CIVMVERSION   20230817_56476
1CIJ9VMVERSION c425e1c
1CIJITVERSION  tr.open_20230817_152504_c425e1c
1CIOMRVERSION  e712c65_CMPRSS
1CIIBMVERSION  696e9df

Summary of problem

JVM is running with a Xmx of 6GB. User is reporting an OOM for which we have the javacore dump (but can't currently share).

Heap space is fine:

1STHEAPTYPE    Object Memory
...
1STHEAPTOTAL   Total memory:                  6442450944 (0x0000000180000000)
1STHEAPINUSE   Total memory in use:           2502113864 (0x0000000095233A48)
1STHEAPFREE    Total memory free:             3940337080 (0x00000000EADCC5B8)

As are other segment types (i.e., they have free space available).

However the JIT Data Cache seems exhausted? But I'm not sure if Total memory free is always 0 for the JIT Data Cache.

1STSEGTYPE     JIT Data Cache
...
1STSEGTOTAL    Total memory:                    14680064 (0x0000000000E00000)
1STSEGINUSE    Total memory in use:             14680064 (0x0000000000E00000)
1STSEGFREE     Total memory free:                      0 (0x0000000000000000)
1STSEGLIMIT    Allocation limit:               402653184 (0x0000000018000000)

Any pointers would be much appreciated. Thank you.

Diagnostic files

1STGCHTYPE     GC History
3STHSTTYPE     03:15:30:138174312 GMT j9mm.134 -   Allocation failure end: newspace=0/0 oldspace=5745756904/6442450944 loa=2666420360/3221225472
3STHSTTYPE     03:15:30:134077747 GMT j9mm.470 -   Allocation failure cycle end: newspace=0/0 oldspace=5745756904/6442450944 loa=2666420360/3221225472
3STHSTTYPE     03:15:30:133715956 GMT j9mm.475 -   GlobalGC end: workstackoverflow=0 overflowcount=0 memory=5745756904/6442450944
3STHSTTYPE     03:15:30:133602875 GMT j9mm.90 -   GlobalGC collect complete
3STHSTTYPE     03:15:30:133525960 GMT j9mm.57 -   Sweep end
3STHSTTYPE     03:15:30:129619144 GMT j9mm.56 -   Sweep start
3STHSTTYPE     03:15:30:129616346 GMT j9mm.55 -   Mark end
3STHSTTYPE     03:15:30:112262776 GMT j9mm.54 -   Mark start
3STHSTTYPE     03:15:30:111911221 GMT j9mm.474 -   GlobalGC start: globalcount=1718
3STHSTTYPE     03:15:30:111843307 GMT j9mm.135 -   Exclusive access: exclusiveaccessms=0.020 meanexclusiveaccessms=0.020 threads=0 lastthreadtid=0x0000000037EB2EF8 beatenbyotherthread=0
3STHSTTYPE     03:15:30:111842847 GMT j9mm.469 -   Allocation failure cycle start: newspace=0/0 oldspace=627478544/6442450944 loa=200786888/3221225472 requestedbytes=67108872
3STHSTTYPE     03:15:30:111842100 GMT j9mm.133 -   Allocation failure start: newspace=0/0 oldspace=627478544/6442450944 loa=200786888/3221225472 requestedbytes=67108872
3STHSTTYPE     03:15:27:051232262 GMT j9mm.134 -   Allocation failure end: newspace=0/0 oldspace=5558928072/6442450944 loa=2410468360/3221225472
3STHSTTYPE     03:15:27:051228918 GMT j9mm.470 -   Allocation failure cycle end: newspace=0/0 oldspace=5558928072/6442450944 loa=2410468360/3221225472
3STHSTTYPE     03:15:27:051194078 GMT j9mm.475 -   GlobalGC end: workstackoverflow=0 overflowcount=0 memory=5558928072/6442450944
3STHSTTYPE     03:15:27:051154568 GMT j9mm.90 -   GlobalGC collect complete
3STHSTTYPE     03:15:27:051102593 GMT j9mm.57 -   Sweep end
3STHSTTYPE     03:15:27:047282948 GMT j9mm.56 -   Sweep start
3STHSTTYPE     03:15:27:047279445 GMT j9mm.55 -   Mark end
3STHSTTYPE     03:15:27:031206322 GMT j9mm.54 -   Mark start
3STHSTTYPE     03:15:27:030835769 GMT j9mm.474 -   GlobalGC start: globalcount=1717
3STHSTTYPE     03:15:27:030760809 GMT j9mm.135 -   Exclusive access: exclusiveaccessms=0.033 meanexclusiveaccessms=0.021 threads=1 lastthreadtid=0x0000000037EA4EF8 beatenbyotherthread=0
3STHSTTYPE     03:15:27:030760187 GMT j9mm.469 -   Allocation failure cycle start: newspace=0/0 oldspace=162930232/6442450944 loa=162930232/3221225472 requestedbytes=120
3STHSTTYPE     03:15:27:030759088 GMT j9mm.133 -   Allocation failure start: newspace=0/0 oldspace=162930232/6442450944 loa=162930232/3221225472 requestedbytes=120
3STHSTTYPE     03:15:22:942942715 GMT j9mm.134 -   Allocation failure end: newspace=0/0 oldspace=6037076976/6442450944 loa=2943251960/3221225472
3STHSTTYPE     03:15:22:939790089 GMT j9mm.470 -   Allocation failure cycle end: newspace=0/0 oldspace=6037076976/6442450944 loa=2943251960/3221225472
3STHSTTYPE     03:15:22:939754346 GMT j9mm.475 -   GlobalGC end: workstackoverflow=0 overflowcount=0 memory=6037076976/6442450944
3STHSTTYPE     03:15:22:939682646 GMT j9mm.90 -   GlobalGC collect complete
3STHSTTYPE     03:15:22:939635166 GMT j9mm.57 -   Sweep end
3STHSTTYPE     03:15:22:936230519 GMT j9mm.56 -   Sweep start
3STHSTTYPE     03:15:22:936227505 GMT j9mm.55 -   Mark end
3STHSTTYPE     03:15:22:921332412 GMT j9mm.54 -   Mark start
3STHSTTYPE     03:15:22:921052682 GMT j9mm.474 -   GlobalGC start: globalcount=1716
3STHSTTYPE     03:15:22:920975067 GMT j9mm.135 -   Exclusive access: exclusiveaccessms=1.202 meanexclusiveaccessms=0.603 threads=1 lastthreadtid=0x000000003802A5F8 beatenbyotherthread=0
3STHSTTYPE     03:15:22:920974570 GMT j9mm.469 -   Allocation failure cycle start: newspace=0/0 oldspace=465183008/6442450944 loa=106007192/3221225472 requestedbytes=53477384
3STHSTTYPE     03:15:22:920973738 GMT j9mm.133 -   Allocation failure start: newspace=0/0 oldspace=465183008/6442450944 loa=106007192/3221225472 requestedbytes=53477384
3STHSTTYPE     03:15:19:847088411 GMT j9mm.134 -   Allocation failure end: newspace=0/0 oldspace=5521288480/6442450944 loa=2497505488/3221225472
3STHSTTYPE     03:15:19:842521262 GMT j9mm.470 -   Allocation failure cycle end: newspace=0/0 oldspace=5521288480/6442450944 loa=2497505488/3221225472
3STHSTTYPE     03:15:19:842485273 GMT j9mm.475 -   GlobalGC end: workstackoverflow=0 overflowcount=0 memory=5521288480/6442450944
3STHSTTYPE     03:15:19:842390564 GMT j9mm.90 -   GlobalGC collect complete
3STHSTTYPE     03:15:19:842286965 GMT j9mm.57 -   Sweep end
3STHSTTYPE     03:15:19:838547964 GMT j9mm.56 -   Sweep start
3STHSTTYPE     03:15:19:838545060 GMT j9mm.55 -   Mark end
3STHSTTYPE     03:15:19:822722664 GMT j9mm.54 -   Mark start
3STHSTTYPE     03:15:19:822338833 GMT j9mm.474 -   GlobalGC start: globalcount=1715
3STHSTTYPE     03:15:19:822267084 GMT j9mm.135 -   Exclusive access: exclusiveaccessms=7.110 meanexclusiveaccessms=3.558 threads=1 lastthreadtid=0x000000003800E5F8 beatenbyotherthread=0
3STHSTTYPE     03:15:19:822266576 GMT j9mm.469 -   Allocation failure cycle start: newspace=0/0 oldspace=465984848/6442450944 loa=79756376/3221225472 requestedbytes=67108872
3STHSTTYPE     03:15:19:822265795 GMT j9mm.133 -   Allocation failure start: newspace=0/0 oldspace=465984848/6442450944 loa=79756376/3221225472 requestedbytes=67108872
3STHSTTYPE     03:15:17:344375042 GMT j9mm.134 -   Allocation failure end: newspace=0/0 oldspace=5680230920/6442450944 loa=2673388888/3221225472
3STHSTTYPE     03:15:17:340099904 GMT j9mm.470 -   Allocation failure cycle end: newspace=0/0 oldspace=5680230920/6442450944 loa=2673388888/3221225472
3STHSTTYPE     03:15:17:340063475 GMT j9mm.475 -   GlobalGC end: workstackoverflow=0 overflowcount=0 memory=5680230920/6442450944
3STHSTTYPE     03:15:17:339965573 GMT j9mm.90 -   GlobalGC collect complete
3STHSTTYPE     03:15:17:339916272 GMT j9mm.57 -   Sweep end
3STHSTTYPE     03:15:17:335959151 GMT j9mm.56 -   Sweep start
3STHSTTYPE     03:15:17:335951630 GMT j9mm.55 -   Mark end
3STHSTTYPE     03:15:17:319779581 GMT j9mm.54 -   Mark start

OutOfMemoryError: Java Heap Space

N/A?

dmitripivkine commented 1 month ago

Hi @roded , Would you please confirm which GC policy is used? Based on newspace=0/0 I can guess optthruput or optavgpause. "Java Heap Space" reason is quite restrictive, the problem should be related to object heap usage directly. There is not enough data for investigation obviously. So I am speculating here. There is might be a case "ExcessiveGC" condition detected (potentially incorrectly). Excessive GC is as condition when GC operations take too much time (95%+ of total JVM times consumption) or produced too few free space (less than 5%). In this case OOM Java Heap Space is enforced even there is free space in the heap. GC is able to release a lot of space (from 627478544 to 5745756904 bytes) so it is not a latter one. We had problems in the past when ExcessiveGC can be detected incorrectly due inconsistent wall clock reading (like end of operation has a smaller time stamp than it's beginning - due clock adjustment during the operation). This issue has been fixed but might be we have something similar here. If you have Snap trace file (.trc) you can convert it to the text file java com.ibm.jvm.TraceFormat <input_file> end search for "ExcessiveGC". It would be helpful if you can share Snap trace file. Another potential reason might be low memory below 4GB bar (as far as this is Compressed Refs JVM). In this case you should see mentioning of this in a few lines on the top of javacore.

pshipton commented 1 month ago

Really this should be handled via IBM support. @paulcheeseman @manqingl fyi

roded commented 1 month ago

Thanks for the reply and explanations @dmitripivkine. GC policy is: -Xgcpolicy:optthruput. I don't see any mention of low below the bar storage at the top of the javacore, not entirely sure what to look for though. Let me check about the snap trace file.

paulcheeseman commented 1 month ago

Really this should be handled via IBM support.

Agreed. @roded please open a support case at https://www.ibm.com/mysupport/

roded commented 1 month ago

I've opened a case with IBM support. For future reference, could someone please explain when an IBM support case is relevant vs an issue in this repository? Thanks!

pshipton commented 3 weeks ago

IBM Java 8 is an IBM product that uses the OpenJ9 open source. This repo is used by OpenJ9 developers, not IBM support staff.

pshipton commented 3 weeks ago

Another point is that issues in this open source repo get free best effort responses, there are no guarantees for replies.

roded commented 3 weeks ago

Understood. Thank you.