oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

High memory usage when projects increase #2713

Closed · JustSong closed 5 years ago

JustSong commented 5 years ago

Here is our OpenGrok project structure, we set up each project as a different URL location:

source1 => https://opengrok.studio.com/source1
source2 => https://opengrok.studio.com/source2
source3 => https://opengrok.studio.com/source3
...
source8 => https://opengrok.studio.com/source8

We found that with this version of OpenGrok the system runs into high memory usage. If we keep adding new projects (.war files) to the <tomcat8>/webapps directory, Tomcat fails to deploy them with an error.

Is there any setting I need to pay attention to?

JustSong commented 5 years ago

After some testing, it seems we can add more projects after adding some JVM arguments to the <tomcat8>/bin/catalina.sh file, but I think this is just a workaround...

About 16 projects can be deployed: CATALINA_OPTS="-Xms8g -Xmx12g -Xincgc -XX:MaxPermSize=256m"

About 12 projects can be deployed: CATALINA_OPTS="-Xms4g -Xmx8g -Xincgc -XX:MaxPermSize=256m"

About 5 projects can be deployed: CATALINA_OPTS="-Xms2g -Xmx2g -Xincgc -XX:MaxPermSize=256m"
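
For reference, the usual place for such options on Tomcat is a bin/setenv.sh file rather than editing catalina.sh directly. A minimal sketch using the values from the first line above (note that -Xincgc and -XX:MaxPermSize are deprecated/ignored on Java 8 and later, so only the heap flags really matter here):

# <tomcat8>/bin/setenv.sh -- sketch only; heap sizes need to match your data
CATALINA_OPTS="$CATALINA_OPTS -Xms8g -Xmx12g"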

vladak commented 5 years ago

How big are your index data and especially the suggester data (per project)?
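
A quick way to check, just a sketch assuming the default layout with the data root at /opengrok/data:

# per-project sizes of the index and suggester data
du -sh /opengrok/data/index/* /opengrok/data/suggester/*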

I remember when we upgraded from 1.0 to 1.1 internally, the memory requirements grew enough so that OutOfMemoryError was hit. Dissecting the memory dump with an analyzer I figured out it was the suggester that prompted the memory increase and it was in fact legitimate.

This needs to be documented better on the wiki.

vladak commented 5 years ago

In general, this is the magic of doing capacity planning for a Java application, either by trial and error or by careful measurement and computation.

https://github.com/oracle/opengrok/wiki/Tuning-for-large-code-bases#web-application has a section that explains how to take the suggester data into account.

For reference, here's some data from an internal hiccup which happened when transitioning to 1.1. Some time after 1.1 was deployed, the application server (Tomcat in our case) started crashing with an OOM exception.

In top(1) (this is from a Solaris machine) it looked like so:

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
   992 webservd  112  10    0  418G   18G cpu/18 461.4H 59.66% java 

By doing some tracing it was figured out that the process was evidently busy with GC, chasing the last remaining bits of Java heap.

Thus -XX:+HeapDumpOnOutOfMemoryError was added to the Java arguments of the app server and we waited for the next OOM to get a dump.
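
In setenv.sh terms that amounts to something like the following (a sketch; -XX:HeapDumpPath is an optional addition here and the path is just an example, it needs enough free disk space for a multi-GiB dump):

# produce a heap dump on the first OutOfMemoryError
JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp/opengrok-oom.hprof"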

Analyzing the dump with MAT (ironically, the Java heap of MAT had to be increased as well so it could actually complete the analysis), it was found that the Suggester was eating some 4 GiB+ of data, more than half of the heap:

[MAT screenshot: opengrok suggester retained heap]

Interestingly, only a small portion of the projects contributed heavily to this size (https://en.wikipedia.org/wiki/Pareto_principle in action). Of the 300+ projects we had, only 20 or so summed up to the 4 gigs. As @ahornace wrote, the suggester data for the Linux kernel takes some 20 MiB (5x10^6 terms).

The biggest Suggester footprint was ~300 MiB for a single project, after which the sizes quickly trailed off:

# cd /opengrok/data/suggester
# ls -1 | while read proj; do echo -n $proj " "; gfind $proj -type f -name '*.wfst' -printf '%s\n' | awk '{ sum += $0; } END { print sum; }'; done | sort +1 -n | tail -20
proj1  35485144
proj2  37001192
proj3  37173219
proj4  40763143
proj5  42073189
proj6  42073364
proj7  42187526
proj8  42334809
proj9  45023858
proj10  46612295
proj11  52972279
proj12  62475424
proj13  64643156
proj14  65392272
proj15  67582078
proj16  72876722
proj17  73073343
proj18  112418978
proj19  126162613
proj20  325016849

So, in the end I summed the sizes of all *.wfst files under the Data Root, multiplied the result by some constant, and bumped the Java heap by that value. Here's a (rather telling) excerpt from Tomcat's setenv.sh:

# OpenGrok memory boost to cover all-project searches
# (7 MB * 247 projects + 300 MB for cache should be enough)
# 64-bit Java allows for more so let's use 8GB to be on the safe side.
# We might need to allow more for concurrent all-project searches.
# However, with OpenGrok 1.1 the suggester requires more memory for
# each project (in one case the suggester footprint was 4.5 GB) 
# so bump the 8 GB to 16 GB to be on the safe side.
JAVA_OPTS="$JAVA_OPTS -Xmx16g"
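
The summing of the *.wfst sizes mentioned above can be done with a one-liner along these lines (a sketch, assuming GNU find -- gfind on Solaris as above -- and the /opengrok/data layout):

# total size of all suggester WFST files under the data root, in GiB
find /opengrok/data/suggester -type f -name '*.wfst' -printf '%s\n' | awk '{ sum += $0 } END { printf "%.1f GiB\n", sum / (1024 * 1024 * 1024) }'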

The moral of the story is that the Java heap size needs to be set with regard to the worst-case scenario (many all/multi-project searches happening concurrently) and with ample space for growth of the indexed data (adding more projects / deployed webapps).

JustSong commented 5 years ago

Hi @vladak ,

Thanks a lot for your detailed information. I think this could be added to the OpenGrok documentation so that others can learn about it.

Our projects' data sizes (25 GB * 24 projects use almost 8 GB of memory):

|--- Source Code (25GB)
|--- index (6GB)
|--- xref (4.7 GB)
|--- historycache (3.5 GB)
|--- suggester (1.4 GB)

I have already set the arguments -Xms12g -Xmx12g -Xincgc in the <tomcat8>/bin/catalina.sh file and the problem was resolved.
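
If it helps anyone else hitting this: a quick way to verify that the new heap size is actually sufficient (just a sketch, assuming the JDK tools are on the PATH and a single Tomcat instance running the standard Bootstrap main class) is to watch the JVM's heap occupancy and GC activity:

# sample heap occupancy and GC statistics of the Tomcat JVM every 10 seconds
jstat -gcutil $(pgrep -f org.apache.catalina.startup.Bootstrap) 10s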

Thanks again for your kind help!

vladak commented 5 years ago

You're welcome. It's reasonably well documented in the wiki now, I think.