eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Intermittent OOM while running AcmeAir monolithic benchmark with containers #6603

Open kusumachalasani opened 5 years ago

kusumachalasani commented 5 years ago

OOM errors occur intermittently when running the AcmeAir benchmark in containers with a memory limit. Swap is disabled by using: -m '${memorylimit}' --memory-swap='${memorylimit}' --memory-swappiness=0

This behaviour is observed with both JDK8 and JDK11 on different sets of machines.

From the experiments on a VM I used:

OpenJDK8OpenJ9-0.11 shows no OOM with 300mb. OpenJDK8OpenJ9-0.14 shows OOM with 300mb, and OOMs still occur even with the memory limit set to 500mb.

mpirvu commented 5 years ago

@kusumachalasani Could you please post all the details about the experiment:

Thanks

kusumachalasani commented 5 years ago

  1. Docker command used: docker run -d -p 80:80 -v log1:/output --cpus=2 -e MONGO_HOST='${dbhost1}' -e JVM_ARGS='${JVM_OPTIONS}' -m '${memorylimit}' --memory-swap='${memorylimit}' --memory-swappiness=0 '${acmeair_image}'
     with JVM_OPTIONS="-Xshareclasses:none" and memorylimit=300mb
  2. No heap settings are used.
  3. JMeter clients=100, warmup runs=4, measurement runs=2, warmup time=300 secs, measurement time=900 secs.
  4. Liberty version: 19.0.0.6; Java 11 OpenJ9 0.14
  5. No thread pool is configured.

mpirvu commented 5 years ago

I still cannot reproduce this issue. With no -Xms set, the container uses about 200-220MB at steady state, and 300MB is enough to handle compilations. To put more pressure on the container I used -Xms150M -Xmx150M, and in this case the RSS at steady state is about 260-280MB. It comes close to exceeding the 300MB memory limit, but the JIT backs off and the JVM survives. For instance:

! com/acmeair/service/ServiceLocator.getService(Ljava/lang/Class;)Ljava/lang/Object; time=5075290us compilationHeapLimitExceeded memLimit=262144 KB freePhysicalMemory=18 MB mem=[region=49152 system=49152]KB

! com/ibm/ws/webcontainer/filter/WebAppFilterManager.invokeFilters(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Lcom/ibm/wsspi/webcontainer/servlet/IServletContext;Lcom/ibm/wsspi/webcontainer/RequestProcessor;Ljava/util/EnumSet;Lcom/ibm/wsspi/http/HttpInboundConnection;)Z time=3899757us compilationHeapLimitExceeded memLimit=262144 KB freePhysicalMemory=15 MB mem=[region=16384 system=16384]KB

! com/ibm/ws/webcontainer/session/impl/SessionAffinityManagerImpl.analyzeRequest(Ljavax/servlet/ServletRequest;)Lcom/ibm/wsspi/session/SessionAffinityContext; time=1930820us compilationHeapLimitExceeded memLimit=262144 KB freePhysicalMemory=9 MB mem=[region=16384 system=16384]KB
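To see how close to the limit each aborted compilation came, the failure lines above can be reduced to a few fields. This is only a sketch: the regular expression is inferred from the three lines quoted here, not from any documented log format, and `parse_failure` is a hypothetical helper name.

```python
import re

# Pattern inferred from the quoted failure lines; the field layout is an
# assumption based on those samples, not a documented format.
FAILURE_RE = re.compile(
    r"^!\s+(?P<method>\S+)\s+time=(?P<time_us>\d+)us\s+(?P<reason>\w+)"
    r"\s+memLimit=(?P<mem_limit_kb>\d+) KB"
    r"\s+freePhysicalMemory=(?P<free_mb>\d+) MB"
)


def parse_failure(line):
    """Return the fields of one compilation-failure line, or None if it does not match."""
    m = FAILURE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "method": m.group("method"),
        "time_us": int(m.group("time_us")),
        "reason": m.group("reason"),
        "mem_limit_kb": int(m.group("mem_limit_kb")),
        "free_mb": int(m.group("free_mb")),
    }


if __name__ == "__main__":
    sample = (
        "! com/acmeair/service/ServiceLocator.getService(Ljava/lang/Class;)"
        "Ljava/lang/Object; time=5075290us compilationHeapLimitExceeded "
        "memLimit=262144 KB freePhysicalMemory=18 MB "
        "mem=[region=49152 system=49152]KB"
    )
    print(parse_failure(sample))
```

Feeding a whole verbose log through `parse_failure` and sorting by `free_mb` would show which compilations were aborted with the least free physical memory.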

I have observed that when there are many JMeter clients the JVM is more likely to get near the memory limit. One factor is the starvation mechanism implemented in the JIT: when there are many application threads, the compilation threads don't get much time on the CPUs and the JIT launches even more compilation threads to keep a balance. With the JVM pinned to 2 cores there should be only one compilation thread, but due to starvation I see 4 or even 5 compilation threads being activated. I am thinking of amending this mechanism to be less aggressive if physical memory is running low.

mpirvu commented 5 years ago

The existing code already does what I suggested above. When the JIT wants to activate a new compilation thread, it reads the free physical memory and does not activate another thread if the free physical memory is lower than safeReserveValue + scratchSpaceLowerBound, where scratchSpaceLowerBound is the minimum amount of memory we think a compilation thread is going to need (32MB) and safeReserveValue is what we want to reserve for other parts of the JVM (this is computed at start-up based on the memory limit of the container: 300/64 MB). The reason the JIT activates several compilation threads is that, at the time these activations happen, there is plenty of free physical memory (139 MB):

 (cold) Compiling com/ibm/ws/genericbnf/internal/GenericMessageImpl.parseLine(Lcom/ibm/wsspi/bytebuffer/WsByteBuffer;)Z  OrdinaryMethod j9m=0000000002F61090 t=43148 compThread=0 memLimit=262144 KB freePhysicalMemory=139 MB
#INFO:  t= 43398 Starvation status changed to 1 QWeight=3946 CompCPUUtil=23 CompThreadsActive=1
#INFO:  t= 43408 Activate compThread 1 Qweight=3952 active=2
#INFO:  t= 43408 Activate compThread 2 Qweight=3958 active=3
#INFO:  t= 43408 Activate compThread 3 Qweight=3964 active=4
 (cold) Compiling com/ibm/ws/http/channel/internal/HttpBaseMessageImpl.getServiceContext()Lcom/ibm/ws/http/channel/internal/HttpServiceContextImpl;  OrdinaryMethod j9m=0000000002F646A8 t=43408 compThread=2 memLimit=262144 KB freePhysicalMemory=137 MB
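The activation check described in this comment can be sketched as follows. This is not the OpenJ9 source: the function names are hypothetical, and the assumption that safeReserveValue is always the container limit divided by 64 is extrapolated from the single "300/64 MB" figure given above.

```python
# Minimum memory a compilation thread is expected to need, per the comment (32 MB).
SCRATCH_SPACE_LOWER_BOUND_MB = 32


def safe_reserve_mb(container_limit_mb):
    """Assumed rule: reserve 1/64 of the container memory limit (300/64 MB in this thread)."""
    return container_limit_mb / 64


def may_activate_comp_thread(free_physical_mb, container_limit_mb):
    """Allow activation only if free memory is not below reserve + scratch lower bound."""
    threshold_mb = safe_reserve_mb(container_limit_mb) + SCRATCH_SPACE_LOWER_BOUND_MB
    return free_physical_mb >= threshold_mb


if __name__ == "__main__":
    # 139 MB free, as in the log excerpt where threads 1-3 are activated.
    print(may_activate_comp_thread(139, 300))
    # 16 MB free, as in the later excerpt where compilations are aborted.
    print(may_activate_comp_thread(16, 300))
```

With a 300 MB limit the threshold under these assumptions is roughly 36.7 MB, so 139 MB of free memory passes the check easily, which matches the burst of activations at t=43408.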

Later on, free physical memory becomes scarce, compilations start to be aborted, and compilation threads are suspended:

! org/apache/cxf/transport/http/AbstractHTTPDestination.setupMessage(Lorg/apache/cxf/message/Message;Ljavax/servlet/ServletConfig;Ljavax/servlet/ServletContext;Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V time=4193797us compilationHeapLimitExceeded memLimit=262144 KB freePhysicalMemory=16 MB mem=[region=49152 system=49152]KB
#INFO:  t=146508 Suspend compThread 1 Qweight=4178 active=3  LowPhysicalMem
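To follow how many compilation threads are active over a run, the Activate and Suspend lines can be extracted from the log. A minimal sketch, assuming the line layout seen in the excerpts in this thread; `parse_thread_events` and `peak_active` are hypothetical helper names.

```python
import re

# Pattern inferred from the "#INFO" lines quoted in this thread; the trailing
# note (e.g. LowPhysicalMem) is treated as optional.
EVENT_RE = re.compile(
    r"^#INFO:\s+t=\s*(?P<t>\d+)\s+(?P<action>Activate|Suspend) compThread (?P<thread>\d+)"
    r"\s+Qweight=(?P<qweight>\d+)\s+active=(?P<active>\d+)(?:\s+(?P<note>\S+))?"
)


def parse_thread_events(lines):
    """Collect Activate/Suspend events; lines that do not match are skipped."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m is None:
            continue
        events.append({
            "t": int(m.group("t")),
            "action": m.group("action"),
            "thread": int(m.group("thread")),
            "qweight": int(m.group("qweight")),
            "active": int(m.group("active")),
            "note": m.group("note"),
        })
    return events


def peak_active(events):
    """Highest number of simultaneously active compilation threads reported."""
    return max((e["active"] for e in events), default=0)


if __name__ == "__main__":
    log = [
        "#INFO:  t= 43398 Starvation status changed to 1 QWeight=3946 CompCPUUtil=23 CompThreadsActive=1",
        "#INFO:  t= 43408 Activate compThread 1 Qweight=3952 active=2",
        "#INFO:  t= 43408 Activate compThread 2 Qweight=3958 active=3",
        "#INFO:  t= 43408 Activate compThread 3 Qweight=3964 active=4",
        "#INFO:  t=146508 Suspend compThread 1 Qweight=4178 active=3  LowPhysicalMem",
    ]
    events = parse_thread_events(log)
    print(len(events), peak_active(events))
```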

I will try to lower the memory limit further to make the JVM hit the OOM. So far it's working as designed and no crashes are seen.

mpirvu commented 5 years ago

The benchmark still runs with a 275M limit, but hits the OOM case with 250M. This is expected, since at steady state I see the JVM consuming 260-280M (with -Xms150M -Xmx150M).