A command line tool that parses Java garbage collection logging and does analysis to support JVM tuning and troubleshooting for OpenJDK and Sun/Oracle JDK. It differs from other tools in that it goes beyond the simple math of calculating statistics such as maximum pause time and throughput. It analyzes collectors, triggers, JVM version, JVM options, and OS information and reports error/warn/info level analysis and recommendations.
OpenJDK derivatives:
JDK5 - JDK8:
-XX:+PrintGC -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime
JDK8:
1) datestamp:
-XX:+PrintGC -XX:+PrintGCDetails -XX:-PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:gc_%p_%t.log -XX:+UseGCLogFileRotation -XX:GCLogFileSize=50M -XX:NumberOfGCLogFiles=4
2021-10-08T20:22:22.788-0600: [GC (Allocation Failure) [PSYoungGen: 328070K->55019K(503808K)] 649122K->396284K(1078272K), 0.3093583 secs] [Times: user=0.43 sys=0.12, real=0.31 secs]
2) timestamp:
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:gc_%p_%t.log -XX:+UseGCLogFileRotation -XX:GCLogFileSize=50M -XX:NumberOfGCLogFiles=4
668.806: [GC (Allocation Failure) [PSYoungGen: 328070K->55019K(503808K)] 649122K->396284K(1078272K), 0.3093583 secs] [Times: user=0.43 sys=0.12, real=0.31 secs]
3) datestamp: timestamp:
-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:gc_%p_%t.log -XX:+UseGCLogFileRotation -XX:GCLogFileSize=50M -XX:NumberOfGCLogFiles=4
2021-10-08T20:22:22.788-0600: 668.806: [GC (Allocation Failure) [PSYoungGen: 328070K->55019K(503808K)] 649122K->396284K(1078272K), 0.3093583 secs] [Times: user=0.43 sys=0.12, real=0.31 secs]
JDK9+:
1) [time]
-Xlog:gc*,safepoint=info:file=gc_%p_%t.log:time:filecount=4,filesize=50M
[2020-02-14T15:21:55.207-0500] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
2) [uptime]
-Xlog:gc*,safepoint=info:file=gc_%p_%t.log:uptime:filecount=4,filesize=50M
[0.052s] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
3) [uptimemillis]
-Xlog:gc*,safepoint=info:file=gc_%p_%t.log:uptimemillis:filecount=4,filesize=50M
[052ms] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
4) [time][uptime]
-Xlog:gc*,safepoint=info:file=gc_%p_%t.log:time,uptime:filecount=4,filesize=50M
[2020-02-14T15:21:55.207-0500][0.052s] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
5) [time][uptimemillis]
-Xlog:gc*,safepoint=info:file=gc_%p_%t.log:time,uptimemillis:filecount=4,filesize=50M
[2020-02-14T15:21:55.207-0500][052ms] GC(0) Pause Young (Normal) (G1 Evacuation Pause)
Note: The Fedora and RHEL installs are release dependent. To ensure you have the latest code, build garbagecat.
# dnf copr enable bostrt/garbagecat
# dnf install garbagecat
# yum install yum-plugin-copr --enablerepo=rhel-7-server-optional-rpms
# yum copr enable bostrt/garbagecat
# yum install garbagecat
# yum install dnf-plugins-core
# yum copr enable bostrt/garbagecat
# yum install garbagecat
Put the YUM repo file into your /etc/yum/repos.d/:
$ docker run --pull=always -v "$PWD":/home/garbagecat/files:z ghcr.io/mgm3746/garbagecat:main --help
$ docker run --pull=always -v "$PWD":/home/garbagecat/files:z ghcr.io/mgm3746/garbagecat:main --console /home/garbagecat/files/src/test/gc-example.log > report.txt
NOTES:
/home/garbagecat/files
directory.--console
.Download the latest Maven: http://maven.apache.org/download.html.
Copy the download to where you want to install it and unzip it. For example:
$ cp ~/Downloads/apache-maven-3.6.3-bin.tar.gz ~/opt/
$ cd ~/opt/
$ tar -xvzf apache-maven-3.6.3-bin.tar.gz
$ rm apache-maven-3.6.3-bin.tar.gz
Get source:
$ git clone https://github.com/mgm3746/garbagecat.git
Build it:
$ cd garbagecat
$ /opt/apache-maven-3.6.3/bin/mvn clean (rebuilding)
$ /opt/apache-maven-3.6.3/bin/mvn --settings settings.xml package
$ export JAVA_HOME=/usr/lib/jvm/java/ (or wherever a JDK is installed)
$ /opt/apache-maven-3.6.3/bin/mvn --settings settings.xml javadoc:javadoc
If you get the following error:
org.apache.maven.surefire.booter.SurefireExecutionException: TestCase; nested exception is java.lang.NoClassDefFoundError: TestCase
Run the following command:
$ /opt/apache-maven-3.6.3/bin/mvn -U -fn clean install
$ java -jar garbagecat.jar --help
usage: garbagecat [OPTION]... [FILE]
-c,--console print report to stdout instead of file
-h,--help help
-j,--jvmoptions <arg> JVM options used during JVM run
-o,--output <arg> output file name (default report.txt)
-p,--preprocess do preprocessing
-r,--reorder reorder logging by timestamp
-s,--startdatetime <arg> JVM start datetime (yyyy-MM-dd HH:mm:ss.SSS)
to convert uptime to datestamp
-t,--threshold <arg> threshold (0-100) for throughput bottleneck
reporting
-v,--verbose verbose output
Notes:
garbagecat --help
.https://github.com/mgm3746/garbagecat/tree/master/src/test/gc-example.log
$ java -jar garbagecat.jar /path/to/garbagecat/src/test/gc-example.log
gc-example.log
=======================================================================
JVM:
-----------------------------------------------------------------------
Version: Java HotSpot(TM) 64-Bit Server VM (25.102-b14) for linux-amd64 JRE (1.8.0_102-b14), built on Jun 22 2016 18:43:17 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Options: -XX:CMSInitiatingOccupancyFraction=80 -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC -XX:+DoEscapeAnalysis -XX:ErrorFile=/home/jbcures/errors/hs_err_pid%p.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/jboss/app-files/crash/cures/ -XX:InitialHeapSize=8589934592 -XX:MaxHeapSize=8589934592 -XX:MaxNewSize=174485504 -XX:MaxTenuringThreshold=6 -XX:NewSize=174485504 -XX:OldPLABSize=16 -XX:OldSize=348971008 -XX:ParallelGCThreads=2 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Memory: Memory: 4k page, physical 32878232k(18756444k free), swap 4194300k(4194300k free)
=======================================================================
SUMMARY:
-----------------------------------------------------------------------
Datestamp First: 2016-10-10T18:43:49.025-0700
Timestamp First: 1.362 secs
Datestamp Last: 2016-10-11T12:34:47.720-0700
Timestamp Last: 64260.057 secs
# GC Events: 36586
Event Types: PAR_NEW, CMS_INITIAL_MARK, CMS_CONCURRENT, CMS_REMARK, CMS_SERIAL_OLD
# Parallel Events: 36585
# Inverted Parallelism: 2
Inverted Parallelism Max: 2016-10-11T08:06:39.037-0700: 48171.374: [GC (CMS Initial Mark) [1 CMS-initial-mark: 6578959K(8218240K)] 6599305K(8371584K), 0.0118105 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
# Serial Events: 1
NewRatio: 54
Heap Used Max: 7092037K
Heap After GC Max: 6988066K
Heap Allocation Max: 8371584K
Metaspace Used Max: 167164K
Metaspace After GC Max: 167164K
Metaspace Allocation Max: 1204224K
GC Throughput: 96%
GC Pause Max: 7.528 secs
GC Pause Total: 2626.403 secs
=======================================================================
ANALYSIS:
-----------------------------------------------------------------------
error
-----------------------------------------------------------------------
*The CMS_SERIAL_OLD collector is being invoked for one of the following reasons: (1) Fragmentation. The concurrent low pause collector does not compact. When fragmentation becomes an issue a serial collection compacts the heap. If the old generation has available space, the cause is likely fragmentation. Fragmentation can be avoided by increasing the heap size. (2) Metaspace class metadata or compressed class pointers allocation failure. The GC attempts to free/resize metaspace. (3) Resizing perm gen. If perm gen occupancy is near perm gen allocation, the cause is likely perm gen. Perm gen resizing can be avoided by setting the minimum perm gen size equal to the the maximum perm gen size. For example: -XX:PermSize=256M -XX:MaxPermSize=256M. (4) Undetermined reasons. Possibly the JVM requires a certain amount of heap or combination of resources that is not being met, and consequently the concurrent low pause collector is not used despite being specified with the -XX:+UseConcMarkSweepGC option. The CMS_SERIAL_OLD collector is a serial (single-threaded) collector, which means it can take a very long time to collect a large heap. For optimal performance, tune to avoid serial collections.
*CMS promotion failed. A young generation collection is not able to complete because there is not enough space in the old generation for promotion. The old generation has available space, but it is not contiguous. When fragmentation is an issue, the concurrent low pause collector invokes a slow (single-threaded) serial collector to compact the heap. Tune to avoid fragmentation: (1) Increase the heap size. (2) Use -XX:CMSInitiatingOccupancyFraction=N (default 92) to run the CMS cycle more frequently to increase sweeping of dead objects in the old generation to free lists (e.g. -XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly). (3) Do heap dump analysis to determine if there is unintended object retention that can be addressed to decrease heap demands. Or move to a collector that handles fragmentation more efficiently: (1) G1 compacts the young and old generations during evacuation using a multi-threaded collector. (2) Shenandoah compacts concurrently. Temporarily add -XX:PrintFLSStatistics=1 and -XX:+PrintPromotionFailure to get additional insight into fragmentation.
-----------------------------------------------------------------------
warn
-----------------------------------------------------------------------
*CMS remark low parallelism: (1) If using JDK7 or earlier, add -XX:+CMSParallelRemarkEnabled, as remark is single-threaded by default in JDK7 and earlier. (2) Check if multi-threaded remark is disabled with -XX:-CMSParallelRemarkEnabled, and replace with -XX:+CMSParallelRemarkEnabled. (4) Add -XX:+ParallelRefProcEnabled to enable multi-threaded reference processing if the "weak refs processing" time is a significant amount of the total CMS remark time. (5) Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu.
*CMS initial mark low parallelism: (1) If using JDK6 or earlier, initial mark is single-threaded. Consider upgrading to JDK7 or later for multi-threaded initial mark. (2) If using JDK7, add -XX:+CMSParallelInitialMarkEnabled, as initial mark is single-threaded by default in JDK7. (3) Check if multi-threaded remark is disabled with -XX:-CMSParallelInitialMarkEnabled, and replace with -XX:+CMSParallelInitialMarkEnabled. (3) Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu.
*Application stopped time missing. Enable with -XX:+PrintGCApplicationStoppedTime (<= JDK8) or with safepoint logging at info level (e.g. -Xlog:gc*,safepoint=info:file=gc.log:uptime:filecount=4,filesize=50M). Required to determine overall throughput and identify throughput and pause issues not related to garbage collection, as many JVM operations besides garbage collection require all threads to reach a safepoint to execute. Reference: https://access.redhat.com/solutions/18656
*Inverted parallelism. With parallel (multi-threaded) collector events, the "user" + "sys" time should be approximately equal to the "real" (wall) time multiplied by the # of GC threads. For example, if there are 3 GC threads we would expect a parallel collection that takes 1 second of "real" time to take approximately 3 seconds of "user" + "sys" time. The parallelism is 3x. If the parallelism is 1x ("user" + "sys" = "real"), the parallel collection is not offering any efficiency over a serial (single-threaded) collection. When "user" + "sys" < "real", the parallelism is inverted. Inverted parallelism can be a sign of high i/o (e.g. disk or network access) or not enough CPU (e.g. GC threads competing with each other or other processes). Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu. Reference: https://access.redhat.com/solutions/159283.
*GCLocker GC due to the following sequence of events: (1) An object allocation failed due to not a large enough free area in the heap. (2) GC was triggered. (3) The GC could not run because another thread owned the GCLocker running JNI code in a "critical region". (4) The JVM requested a "GCLocker Initiated GC" and waited. (5) The JNI code exited the "critical region" and released the GCLocker. (6) The JVM triggered a "GCLocker Initiated GC" minor collection and rescheduled the thread attempting the object allocation. Possible next steps: (1) Increase the heap size to avoid the allocation failure. (2) Decrease the heap size to force more frequent collections to increase heap space free. (3) Move to the Shenandoah collector, which supports region pinning, so only the part of the heap containing the object passed into the JNI "critical region" is locked. (4) Migrate JNI criticals to safe alternatives (e.g. GetArrayElements, GetArrayRegion) or rewrite in Java. Reference: https://bugs.openjdk.org/browse/JDK-8199919.
*Explicit garbage collection is disabled with -XX:+DisableExplicitGC. The JVM uses explicit garbage collection to manage direct memory (to free space when MaxDirectMemorySize is reached) and the Remote Method Invocation (RMI) system (to clean up unreachable remote objects). Disabling it for those use cases can cause a memory leak. Verify the application does not use direct memory (e.g. java.nio.DirectByteBuffer), does not make remote method calls or export remote objects like EJBs (everything runs in the same JVM), and does not depend on explicit garbage collection in some other way. Known applications that use direct memory: JBoss EAP7 (IO subsystem).
*Consider enabling gc log file rotation (-XX:+UseGCLogFileRotation -XX:GCLogFileSize=N[K|M|G] -XX:NumberOfGCLogFiles=N) to protect disk space.
-----------------------------------------------------------------------
info
-----------------------------------------------------------------------
*The JDK is very old (7.6 years).
*Metaspace(unlimited) = Class Metadata(unlimited) + Compressed Class Space(1024M).
*-XX:+UseParNewGC is redundant (enabled by default) and can be removed. Deprecated in JDK8 and removed in JDK9.
*The number of times an object is copied between survivor spaces is set with -XX:MaxTenuringThreshold=N (0-15). 0 = disabled. 15 (default) = promote when the survivor space fills. Unless testing has shown this improves performance, consider removing this option to allow the default value to be applied.
*The number of parallel garbage collection threads is set with -XX:ParallelGCThreads=N. Unless multiple JVMs and/or processes are collocated and competing for resources, it's generally best to remove this option and let the JVM determine an appropriate setting (JDK8) or manage it dynamically with ergonomics (JDK11+).
*Consider enabling large page support, which can provide performance improvements with large heaps (> 4GB). Reference: https://access.redhat.com/solutions/22929.
=======================================================================
GC throughput less than 90%
-----------------------------------------------------------------------
2016-10-10T18:43:49.025-0700: 1.362: [GC (Allocation Failure) 2016-10-10T18:43:49.025-0700: 1.362: [ParNew: 136320K->8757K(153344K), 0.1186872 secs] 136320K->8757K(8371584K), 0.1190032 secs] [Times: user=0.34 sys=0.01, real=0.12 secs]
2016-10-10T18:43:50.538-0700: 2.875: [GC (Allocation Failure) 2016-10-10T18:43:50.538-0700: 2.875: [ParNew: 145077K->17024K(153344K), 0.1866956 secs] 145077K->24017K(8371584K), 0.1869439 secs] [Times: user=0.46 sys=0.03, real=0.19 secs]
...
=======================================================================
80 UNIDENTIFIED LOG LINE(S):
-----------------------------------------------------------------------
2016-10-10T19:17:37.771-0700: 2030.108: [GC (Allocation Failure) 2016-10-10T19:17:37.771-0700: 2030.108: [ParNew2016-10-10T19:17:37.773-0700: 2030.110: [CMS-concurrent-abortable-preclean: 0.050/0.150 secs] [Times: user=0.11 sys=0.03, real=0.15 secs]
...
Notes:
The bottom of the report shows 80 unidentified lines:
=======================================================================
80 UNIDENTIFIED LOG LINE(S):
-----------------------------------------------------------------------
2016-10-10T19:17:37.771-0700: 2030.108: [GC (Allocation Failure) 2016-10-10T19:17:37.771-0700: 2030.108: [ParNew2016-10-10T19:17:37.773-0700: 2030.110: [CMS-concurrent-abortable-preclean: 0.050/0.150 secs] [Times: user=0.11 sys=0.03, real=0.15 secs]
...
Run with the preprocess flag:
$ java -jar garbagecat.jar -p /path/to/garbagecat/src/test/gc-example.log
There are no unidentified log lines after preprocessing. However, there are many bottlenecks where throughput is < 90% (default). To get a better idea of bottlenecks, run with -t 50 to see stretches where more time is spent running gc/jvm threads than application threads.
$ java -jar garbagecat.jar -p -t 50 ~/path/to/garbagecat/src/test/gc-example.log
There are still a lot of bottlenecks reported; however, none are for a very long stretch. For example, the following lines show 3 gc events where more time was spent doing gc than running application threads, but it covers less than 1 second: 62339.274 --> 62339.684:
=======================================================================
Throughput less than 50%
-----------------------------------------------------------------------
...
2016-10-11T12:02:46.937-0700: 62339.274: [GC (Allocation Failure) 2016-10-11T12:02:46.938-0700: 62339.275: [ParNew: 152015K->17024K(153344K), 0.0754387 secs] 6994550K->6895910K(8371584K), 0.0760508 secs] [Times: user=0.15 sys=0.00, real=0.08 secs]
2016-10-11T12:02:47.089-0700: 62339.426: [GC (CMS Final Remark) [YG occupancy: 107402 K (153344 K)]2016-10-11T12:02:47.089-0700: 62339.426: [Rescan (parallel) , 0.0358321 secs]2016-10-11T12:02:47.125-0700: 62339.462: [weak refs processing, 0.0055111 secs]2016-10-11T12:02:47.131-0700: 62339.468: [class unloading, 0.1042745 secs]2016-10-11T12:02:47.235-0700: 62339.572: [scrub symbol table, 0.0822672 secs]2016-10-11T12:02:47.317-0700: 62339.654: [scrub string table, 0.0045528 secs][1 CMS-remark: 6878886K(8218240K)] 6986289K(8371584K), 0.2329748 secs] [Times: user=0.27 sys=0.00, real=0.24 secs]
2016-10-11T12:02:47.347-0700: 62339.684: [GC (Allocation Failure) 2016-10-11T12:02:47.347-0700: 62339.684: [ParNew: 153335K->17024K(153344K), 0.0889831 secs] 7032068K->6936404K(8371584K), 0.0894930 secs] [Times: user=0.17 sys=0.00, real=0.09 secs]
...
There are no localized throughput bottlenecks, and the summary shows good overall throughput and a relatively low max pause:
=======================================================================
SUMMARY:
-----------------------------------------------------------------------
Datestamp First: 2016-10-10T18:43:49.025-0700
Timestamp First: 1.362 secs
Datestamp Last: 2016-10-11T12:34:47.720-0700
Timestamp Last: 64260.057 secs
# GC Events: 36586
Event Types: PAR_NEW, CMS_INITIAL_MARK, CMS_CONCURRENT, CMS_REMARK, CMS_SERIAL_OLD
# Parallel Events: 36585
# Inverted Parallelism: 2
Inverted Parallelism Max: 2016-10-11T08:06:39.037-0700: 48171.374: [GC (CMS Initial Mark) [1 CMS-initial-mark: 6578959K(8218240K)] 6599305K(8371584K), 0.0118105 secs] [Times: user=0.01 sys=0.00, real=0.02 secs]
# Serial Events: 1
NewRatio: 54
Heap Used Max: 7092037K
Heap After GC Max: 6988066K
Heap Allocation Max: 8371584K
Metaspace Used Max: 167164K
Metaspace After GC Max: 167164K
Metaspace Allocation Max: 1204224K
GC Throughput: 96%
GC Pause Max: 7.528 secs
GC Pause Total: 2626.403 secs
=======================================================================
However, the summary shows the slow, single-threaded CMS_SERIAL_OLD collector was used. Based on the parallel vs. overall event numbers (36585 vs. 36586), it appears there was just one single event. The analysis provides guidance how to address this issue and other best practices:
=======================================================================
ANALYSIS:
-----------------------------------------------------------------------
error
-----------------------------------------------------------------------
*The CMS_SERIAL_OLD collector is being invoked for one of the following reasons: (1) Fragmentation. The concurrent low pause collector does not compact. When fragmentation becomes an issue a serial collection compacts the heap. If the old generation has available space, the cause is likely fragmentation. Fragmentation can be avoided by increasing the heap size. (2) Metaspace class metadata or compressed class pointers allocation failure. The GC attempts to free/resize metaspace. (3) Resizing perm gen. If perm gen occupancy is near perm gen allocation, the cause is likely perm gen. Perm gen resizing can be avoided by setting the minimum perm gen size equal to the the maximum perm gen size. For example: -XX:PermSize=256M -XX:MaxPermSize=256M. (4) Undetermined reasons. Possibly the JVM requires a certain amount of heap or combination of resources that is not being met, and consequently the concurrent low pause collector is not used despite being specified with the -XX:+UseConcMarkSweepGC option. The CMS_SERIAL_OLD collector is a serial (single-threaded) collector, which means it can take a very long time to collect a large heap. For optimal performance, tune to avoid serial collections.
*CMS promotion failed. A young generation collection is not able to complete because there is not enough space in the old generation for promotion. The old generation has available space, but it is not contiguous. When fragmentation is an issue, the concurrent low pause collector invokes a slow (single-threaded) serial collector to compact the heap. Tune to avoid fragmentation: (1) Increase the heap size. (2) Use -XX:CMSInitiatingOccupancyFraction=N (default 92) to run the CMS cycle more frequently to increase sweeping of dead objects in the old generation to free lists (e.g. -XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly). (3) Do heap dump analysis to determine if there is unintended object retention that can be addressed to decrease heap demands. Or move to a collector that handles fragmentation more efficiently: (1) G1 compacts the young and old generations during evacuation using a multi-threaded collector. (2) Shenandoah compacts concurrently. Temporarily add -XX:PrintFLSStatistics=1 and -XX:+PrintPromotionFailure to get additional insight into fragmentation.
-----------------------------------------------------------------------
warn
-----------------------------------------------------------------------
*CMS remark low parallelism: (1) If using JDK7 or earlier, add -XX:+CMSParallelRemarkEnabled, as remark is single-threaded by default in JDK7 and earlier. (2) Check if multi-threaded remark is disabled with -XX:-CMSParallelRemarkEnabled, and replace with -XX:+CMSParallelRemarkEnabled. (4) Add -XX:+ParallelRefProcEnabled to enable multi-threaded reference processing if the "weak refs processing" time is a significant amount of the total CMS remark time. (5) Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu.
*CMS initial mark low parallelism: (1) If using JDK6 or earlier, initial mark is single-threaded. Consider upgrading to JDK7 or later for multi-threaded initial mark. (2) If using JDK7, add -XX:+CMSParallelInitialMarkEnabled, as initial mark is single-threaded by default in JDK7. (3) Check if multi-threaded remark is disabled with -XX:-CMSParallelInitialMarkEnabled, and replace with -XX:+CMSParallelInitialMarkEnabled. (3) Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu.
*Application stopped time missing. Enable with -XX:+PrintGCApplicationStoppedTime (<= JDK8) or with safepoint logging at info level (e.g. -Xlog:gc*,safepoint=info:file=gc.log:uptime:filecount=4,filesize=50M). Required to determine overall throughput and identify throughput and pause issues not related to garbage collection, as many JVM operations besides garbage collection require all threads to reach a safepoint to execute. Reference: https://access.redhat.com/solutions/18656
*Inverted parallelism. With parallel (multi-threaded) collector events, the "user" + "sys" time should be approximately equal to the "real" (wall) time multiplied by the # of GC threads. For example, if there are 3 GC threads we would expect a parallel collection that takes 1 second of "real" time to take approximately 3 seconds of "user" + "sys" time. The parallelism is 3x. If the parallelism is 1x ("user" + "sys" = "real"), the parallel collection is not offering any efficiency over a serial (single-threaded) collection. When "user" + "sys" < "real", the parallelism is inverted. Inverted parallelism can be a sign of high i/o (e.g. disk or network access) or not enough CPU (e.g. GC threads competing with each other or other processes). Check for swapping and if the number of GC threads (-XX:ParallelGCThreads=<n>) is appropriate for the number of cpu/cores and any processes sharing cpu. Reference: https://access.redhat.com/solutions/159283.
*GCLocker GC due to the following sequence of events: (1) An object allocation failed due to not a large enough free area in the heap. (2) GC was triggered. (3) The GC could not run because another thread owned the GCLocker running JNI code in a "critical region". (4) The JVM requested a "GCLocker Initiated GC" and waited. (5) The JNI code exited the "critical region" and released the GCLocker. (6) The JVM triggered a "GCLocker Initiated GC" minor collection and rescheduled the thread attempting the object allocation. Possible next steps: (1) Increase the heap size to avoid the allocation failure. (2) Decrease the heap size to force more frequent collections to increase heap space free. (3) Move to the Shenandoah collector, which supports region pinning, so only the part of the heap containing the object passed into the JNI "critical region" is locked. (4) Migrate JNI criticals to safe alternatives (e.g. GetArrayElements, GetArrayRegion) or rewrite in Java. Reference: https://bugs.openjdk.org/browse/JDK-8199919.
*Explicit garbage collection is disabled with -XX:+DisableExplicitGC. The JVM uses explicit garbage collection to manage direct memory (to free space when MaxDirectMemorySize is reached) and the Remote Method Invocation (RMI) system (to clean up unreachable remote objects). Disabling it for those use cases can cause a memory leak. Verify the application does not use direct memory (e.g. java.nio.DirectByteBuffer), does not make remote method calls or export remote objects like EJBs (everything runs in the same JVM), and does not depend on explicit garbage collection in some other way. Known applications that use direct memory: JBoss EAP7 (IO subsystem).
*Consider enabling gc log file rotation (-XX:+UseGCLogFileRotation -XX:GCLogFileSize=N[K|M|G] -XX:NumberOfGCLogFiles=N) to protect disk space.
-----------------------------------------------------------------------
info
-----------------------------------------------------------------------
*The JDK is very old (7.6 years).
*Metaspace(unlimited) = Class Metadata(unlimited) + Compressed Class Space(1024M).
*-XX:+UseParNewGC is redundant (enabled by default) and can be removed. Deprecated in JDK8 and removed in JDK9.
*The number of times an object is copied between survivor spaces is set with -XX:MaxTenuringThreshold=N (0-15). 0 = disabled. 15 (default) = promote when the survivor space fills. Unless testing has shown this improves performance, consider removing this option to allow the default value to be applied.
*The number of parallel garbage collection threads is set with -XX:ParallelGCThreads=N. Unless multiple JVMs and/or processes are collocated and competing for resources, it's generally best to remove this option and let the JVM determine an appropriate setting (JDK8) or manage it dynamically with ergonomics (JDK11+).
*Consider enabling large page support, which can provide performance improvements with large heaps (> 4GB). Reference: https://access.redhat.com/solutions/22929.
=======================================================================
Copyright (c) 2008-2024 Mike Millson
All rights reserved. This program and the accompanying materials are made available under the terms of the Eclipse Public License v1.0 which accompanies this distribution, and is available at http://www.eclipse.org/legal/epl-v10.html.