eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

GC pause time increases over the time #13349

Open · onlynishant opened this issue 3 years ago

onlynishant commented 3 years ago

Hi,

I am using OpenJ9 on our large-scale, low-latency (target p99 < 50 ms) bare-metal production servers. We receive 20-30k req/sec over the internet (HTTP/HTTPS). The application creates a lot of short-lived objects and also maintains a few GB of internal in-process cache (multiple Java Maps totalling approx 3M entries), which is refreshed periodically and retained for the whole life of the application (a continuously running process). We also use some libraries via JNI calls.
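
For context, here is a minimal, purely illustrative sketch (hypothetical names; the issue does not describe the actual implementation) of the kind of long-lived, periodically refreshed in-process cache described above, assuming a snapshot-and-swap refresh:

  import java.util.Map;
  import java.util.concurrent.atomic.AtomicReference;

  // Hypothetical sketch, not the application's code.
  public class RefreshedCacheSketch {
      // The current snapshot (~3M entries in this workload) stays reachable for
      // the whole process lifetime, so it ends up tenured; every refresh turns
      // the previous snapshot into garbage in one large batch.
      private final AtomicReference<Map<String, Object>> current =
              new AtomicReference<>(Map.of());

      void refresh(Map<String, Object> freshlyLoaded) {
          current.set(Map.copyOf(freshlyLoaded)); // old snapshot becomes collectable
      }

      Object lookup(String key) {
          return current.get().get(key);
      }
  }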

I am observing a few issues:

  1. The Java process RSS grows by 4-7 GB every day.
  2. GC pause time increases over time (the garbage collection pause percentage grows by 2-3% every day).
  3. I observe some very long latency spikes (a few seconds).
  4. I also tried smaller Java heaps (8-16 GB), but that increases latency; with a larger Java heap (40 GB) things improve for a while but then slowly start to degrade again.
  5. I recently tried jemalloc as well, assuming it could be a memory fragmentation issue. The rate of RSS growth decreased (now approx 3-4 GB/day), but it is still growing, and it has had no impact on GC pause time.

I tried using different GC policies and other JVM flags, but the GC pause time is not improving. I have attached verbose GC logs; hopefully they will help.

GC log link: https://drive.google.com/file/d/1ASBEKD2_pSDdVyY8emYiyxyMKSt3o4CF/view?usp=sharing

I hope someone here will be able to provide good insight.

JasonFengJ9 commented 3 years ago

Set comp:gc tentatively. fyi @dmitripivkine

dmitripivkine commented 3 years ago

Yes, we can help with this, indeed.

onlynishant commented 3 years ago

Hi Guys,

It's good to see that this has been added to the Oct milestone. Is there a patch or anything I can apply now to solve or mitigate the issue? That would be very helpful because it's affecting our production servers.

pshipton commented 3 years ago

Adding it to the Oct milestone is a way to keep track of this issue so it's not forgotten; it's not a guarantee it will be fixed by any particular time.

dmitripivkine commented 3 years ago

We are looking into your issue. Meanwhile, you can try the -Xgc:concurrentSlack=macrofrag option. It is supposed to reduce heap macro fragmentation, which looks like (at least one of) the reasons for the regression over time.

cedrichansen commented 3 years ago

@onlynishant Would you be able to provide additional GC logs? (using different GC policies, command line options, heap sizes, etc)

onlynishant commented 3 years ago

@dmitripivkine I am testing -Xgc:concurrentSlack=macrofrag. I will update you on the performance.

@cedrichansen Yes, I can provide additional GC logs.

onlynishant commented 3 years ago

@dmitripivkine I don't see any improvement with -Xgc:concurrentSlack=macrofrag. If anything, I sometimes see higher latency spikes than before since this change.

new gc log - https://drive.google.com/file/d/1Op8tpkp5D-4uDLC4JrEZlA3mO46BkjlW/view?usp=sharing

dmitripivkine commented 3 years ago

[Screenshot: Screen Shot 2021-08-30 at 12 23 55 PM]

There is one observation related to Phantom References in the system: the peak number of them clearly grows over time while the clearing rate stays flat. This is not a steady accumulation, but occasional peaks like in the example below:

  <references type="phantom" candidates="69761" cleared="59735" enqueued="59735" />
  <references type="phantom" candidates="70934" cleared="60579" enqueued="60579" />
  <references type="phantom" candidates="326344" cleared="12" enqueued="12" />
  <references type="phantom" candidates="327037" cleared="757" enqueued="757" />
  <references type="phantom" candidates="64230" cleared="54941" enqueued="54941" />
  <references type="phantom" candidates="68222" cleared="58695" enqueued="58695" />
  <references type="phantom" candidates="68706" cleared="59590" enqueued="59590" />

dmitripivkine commented 3 years ago

Also, would you please disable Concurrent Scavenger to check whether the problem is CS-specific?

onlynishant commented 3 years ago

@dmitripivkine I have tried running without Concurrent Scavenger; the mean GC pause time has increased, but it is now consistent over time. I still see the peak Phantom References count increasing over time.

Latest GC log: https://drive.google.com/file/d/18CS8x73MggMx1-sUyU-zc71vGk8K0Ffw/view?usp=sharing

dmitripivkine commented 3 years ago

Thank you.

I don't think the peak Phantom References count is a GC-related issue (at least not directly): as you can see, the GC appears to clear Phantom References properly and no accumulation is observed. However, around the peaks the references cannot be cleared (they stay alive, I guess) for a while before processing returns to normal. I would expect some application behaviour is involved here; it is not clear why we observe higher and higher peaks over time.
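
For illustration only (hypothetical names, not the application's or any specific library's code), here is a minimal sketch of the common phantom-reference cleanup pattern, with comments on why a batch of still-reachable referents shows up as a spike in candidates with few cleared, as in the log excerpt above:

  import java.lang.ref.PhantomReference;
  import java.lang.ref.Reference;
  import java.lang.ref.ReferenceQueue;
  import java.util.Set;
  import java.util.concurrent.ConcurrentHashMap;

  // Hypothetical sketch of a phantom-reference based cleanup scheme.
  public class PhantomCleanupSketch {
      private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
      // The PhantomReference objects themselves are kept strongly reachable
      // until processed, so the GC sees them as "candidates" on every cycle.
      private static final Set<Reference<Object>> PENDING = ConcurrentHashMap.newKeySet();

      static void track(Object nativeBacked) {
          PENDING.add(new PhantomReference<>(nativeBacked, QUEUE));
      }

      // The collector clears and enqueues a reference only once its referent is
      // no longer reachable; if the application keeps a large batch of referents
      // alive for a while, a GC cycle reports many candidates but few cleared.
      static void drainQueue() {
          Reference<?> ref;
          while ((ref = QUEUE.poll()) != null) {
              PENDING.remove(ref);
              // release the associated native resource here
          }
      }
  }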

dmitripivkine commented 3 years ago

Would you please provide the java -version output you are running?

onlynishant commented 3 years ago
# We use this version in production
$] /usr/java/jdk-11.0.9+11/bin/java -version
openjdk version "11.0.9" 2020-10-20
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.23.0, JRE 11 Linux amd64-64-Bit Compressed References 20201022_810 (JIT enabled, AOT enabled)
OpenJ9   - 0394ef754
OMR      - 582366ae5
JCL      - 3b09cfd7e9 based on jdk-11.0.9+11)

# Default Java
$] java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

We run our Java process under a supervisor. The last GC log was captured while using jemalloc:

LD_PRELOAD=/usr/local/lib/libjemalloc.so,MALLOC_CONF="background_thread:true"
command=/usr/java/jdk-11.0.9+11/bin/java -server - ...

onlynishant commented 3 years ago

@dmitripivkine Here is an update on yesterday's run without Concurrent Scavenger.

I can see that the pause time has started to increase.

FYI, the process is still running without Concurrent Scavenger.

Updated GC log: https://drive.google.com/file/d/1qYJw4pph0ankgRclYTQ_p0rJ0j_7iJNA/view?usp=sharing

I am also adding a memory snapshot from the Java dump that I took last week, after running the process for 40 hours in production, in case it helps you identify the issue faster.

...
...
2CIUSERARG               -Djava.library.path=/usr/java/jdk-11.0.9+11/lib/compressedrefs:/usr/java/jdk-11.0.9+11/lib:/usr/lib64:/usr/lib
2CIUSERARG               -Djava.home=/usr/java/jdk-11.0.9+11
2CIUSERARG               -Duser.dir=/home/platform/platform-java
2CIUSERARG               -XX:MaxDirectMemorySize=2g
2CIUSERARG               -Xdump:heap:events=user,label=/home/platform/platform-java/user-heapdump.phd
2CIUSERARG               -Xshareclasses:name=JavaSC2,cacheDir=/home/platform/platform-java/,nonfatal
2CIUSERARG               -Xms32g
2CIUSERARG               -Xmx32g
2CIUSERARG               -Xmn25g
2CIUSERARG               -Xgc:concurrentScavenge
2CIUSERARG               -Xgc:scvNoAdaptiveTenure
2CIUSERARG               -Xgc:scvTenureAge=14
2CIUSERARG               -verbose:gc
...
...
NULL           ------------------------------------------------------------------------
0SECTION       NATIVEMEMINFO subcomponent dump routine
NULL           =================================
0MEMUSER
1MEMUSER       JRE: 39,065,651,192 bytes / 105966 allocations
1MEMUSER       |
2MEMUSER       +--VM: 37,798,839,000 bytes / 89926 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Classes: 2,254,231,880 bytes / 13483 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Shared Class Cache: 2,147,483,736 bytes / 2 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Other: 106,748,144 bytes / 13481 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Modules: 1,944,424 bytes / 9080 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Memory Manager (GC): 35,047,221,880 bytes / 4301 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Java Heap: 34,359,799,808 bytes / 1 allocation
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Other: 687,422,072 bytes / 4300 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Threads: 97,076,456 bytes / 1736 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Java Stack: 7,658,592 bytes / 304 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Native Stack: 86,114,304 bytes / 305 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Other: 3,303,560 bytes / 1127 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Trace: 2,920,632 bytes / 1860 allocations
2MEMUSER       |  |
3MEMUSER       |  +--JVMTI: 17,776 bytes / 13 allocations
2MEMUSER       |  |
3MEMUSER       |  +--JNI: 214,159,960 bytes / 55967 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Port Library: 177,080,704 bytes / 129 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Unused <32bit allocation regions: 177,060,512 bytes / 1 allocation
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Other: 20,192 bytes / 128 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Other: 4,185,288 bytes / 3357 allocations
1MEMUSER       |
2MEMUSER       +--JIT: 347,301,064 bytes / 12356 allocations
2MEMUSER       |  |
3MEMUSER       |  +--JIT Code Cache: 268,435,456 bytes / 1 allocation
2MEMUSER       |  |
3MEMUSER       |  +--JIT Data Cache: 25,166,592 bytes / 12 allocations
2MEMUSER       |  |
3MEMUSER       |  +--Other: 53,699,016 bytes / 12343 allocations
1MEMUSER       |
2MEMUSER       +--Class Libraries: 919,511,128 bytes / 3684 allocations
2MEMUSER       |  |
3MEMUSER       |  +--VM Class Libraries: 919,511,128 bytes / 3684 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--sun.misc.Unsafe: 919,395,184 bytes / 3652 allocations
4MEMUSER       |  |  |  |
5MEMUSER       |  |  |  +--Direct Byte Buffers: 563,909,424 bytes / 3166 allocations
4MEMUSER       |  |  |  |
5MEMUSER       |  |  |  +--Other: 355,485,760 bytes / 486 allocations
3MEMUSER       |  |  |
4MEMUSER       |  |  +--Other: 115,944 bytes / 32 allocations
NULL           
NULL           ------------------------------------------------------------------------
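
A note on the "Direct Byte Buffers: 563,909,424 bytes" line above: direct buffers are backed by native (off-heap) memory, so they contribute to RSS without showing up in the Java heap. A minimal, purely illustrative sketch (not from the application):

  import java.nio.ByteBuffer;

  public class DirectBufferSketch {
      public static void main(String[] args) {
          // A direct buffer is allocated in native memory, which is what the
          // NATIVEMEMINFO section reports under "sun.misc.Unsafe > Direct Byte
          // Buffers" rather than as part of the Java heap.
          ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024); // 64 MB off-heap

          // The total of such allocations is capped by -XX:MaxDirectMemorySize
          // (2g in the options listed above), and the native memory is released
          // only after the owning ByteBuffer object has been garbage collected.
          System.out.println("direct capacity: " + buf.capacity());
      }
  }
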
onlynishant commented 3 years ago

Hi @dmitripivkine, I have made good progress identifying the suspect. We were using SSLContext via an open-source OpenSSL-bindings-for-Java library for higher performance; once I replaced it with the plain Java version, I no longer see the deterioration in performance.
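
Since the exact library is not named here, the following is only a hypothetical sketch of what "the plain Java version" means: obtaining the stock JSSE SSLContext shipped with the JDK instead of an OpenSSL-backed provider reached through JNI.

  import javax.net.ssl.SSLContext;
  import javax.net.ssl.SSLEngine;

  public class JsseOnlySketch {
      public static void main(String[] args) throws Exception {
          // Default JSSE implementation; no JNI calls or natively backed
          // reference objects are introduced by the TLS layer itself.
          SSLContext ctx = SSLContext.getInstance("TLS");
          ctx.init(null, null, null); // default key/trust managers and SecureRandom
          SSLEngine engine = ctx.createSSLEngine();
          engine.setUseClientMode(false); // server side, as in this workload
          System.out.println("TLS protocol: " + ctx.getProtocol());
      }
  }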

Here is the latest GC log for 24+ hours -

https://drive.google.com/file/d/1kSkmvNSZKwKSW6O-veiLNmrRqaut9fVr/view?usp=sharing

I have observed a significant drop in the overall GC pause percentage, but the maximum GC pause time has increased a lot. Can you please share some insight into how I can reduce it?

I see that most of the spikes overlap with spikes in mark time, although I am not completely sure about that.

[Screenshot: Screenshot 2021-10-01 at 8 23 08 PM]

amicic commented 2 years ago

Those are aborted scavenges, which cause global GCs with compaction to recover. An aborted scavenge occurs when there is no room to move objects either in the Nursery (for example, due to overly optimistic tilting) or in Tenure (too late a concurrent kickoff). Without going into details, they occur because of high variability in the application (from the GC's perspective, e.g. in allocation and survival rates), so the tilting and kickoff heuristics cannot make accurate decisions.
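
The rough condition behind this, as illustrative pseudo-logic only (not OpenJ9 source code):

  public class AbortedScavengeSketch {
      // Surviving objects must be copied either into the survivor semi-space
      // (sized by the tilt heuristic) or into Tenure (whose free space depends
      // on how early concurrent mark kicked off). If neither has room, the
      // scavenge aborts and a global GC with compaction recovers.
      static boolean scavengeCanComplete(long bytesSurviving,
                                         long survivorSpaceFree,
                                         long tenureFree) {
          // Simplification: ignores fragmentation and per-object copy failures.
          return bytesSurviving <= survivorSpaceFree + tenureFree;
      }
  }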

Some tuning you can try:

dmitripivkine commented 2 years ago

Moved to 0.32 because this is not specific to Java 18.