oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/
Other
4.29k stars 739 forks source link

indexer might degrade to serial processing for the trailing project #4549

Closed vladak closed 4 months ago

vladak commented 4 months ago

When indexing bunch of projects from scratch with 1.13.4, I was watching CPU utilization in the 2nd phase of the indexing. There was bunch of bigger projects like Linux, FreeBSD, AOSP. During most of the time, the CPUs were loaded, however at the end of the indexing when only AOSP remained, there was just a single ForkJoin thread processing the files, all the other threads from the ForkJoin pool were in the WAITING state in ForkJoinPool.runWorker() so the indexer was basically degraded to serial processing, making the machine underutilized and the whole reindex taking too much time. This become even more apparent due to the fact that annotation cache was enabled so the sole thread spent the most of time generating the annotation cache entry for given file before moving onto the next file.

I thought the purpose of the ForkJoin pool is that other threads from the pool can steal the work from the dequeues of other threads however this does not seem to be happening at that point for some reason.

The indexer was run as follows:

sudo -u webservd LANG="en_US.UTF-8" LC_ALL="en_US.UTF-8" \
    nice -n 20 /opengrok/dist/bin/venv/bin/opengrok-indexer -C \
    --java=/jdk/openjdk11/bin/java -J=-Xmx48g -J=-server \
    -J=-Djava.util.logging.config.file=/var/tmp/upgrade-logging.conf \
    --jar /opengrok/dist/lib/opengrok.jar \
    -e LC_ALL=en_US.UTF-8 \
    -- \
    --renamedHistory 'on' -r dirbased -G -m '256' --leadingWildCards 'on' \
    --threads 48 --historyThreads 48 --historyFileThreads 32 \
    -c /usr/local/bin/ctags -o /opengrok/etc/ctags.config \
    -H \
    -P -S \
    -R /opengrok/etc/readonly_configuration.xml \
    -s /ws-local \
    -d /opengrok/data \
    -W /opengrok/etc/configuration.xml \
    --progress \
    >/opengrok/log/reindex.log 2>&1

where the read-only configuration had annotation cache enabled globally.

jstack output: jstack-end-of-indexing.txt

The indexer log snippet at the end of the indexing:

2024-02-21 13:54:31.188+0000 INFO t20339 Progress.logIt: Progress: 975100 (99.88%) indexing '/AOSP'
2024-02-21 14:12:00.132+0000 INFO t20339 Progress.logIt: Progress: 975200 (99.89%) indexing '/AOSP'
2024-02-21 14:28:09.894+0000 INFO t20339 Progress.logIt: Progress: 975300 (99.90%) indexing '/AOSP'
2024-02-21 14:52:56.339+0000 INFO t20339 Progress.logIt: Progress: 975400 (99.91%) indexing '/AOSP'
2024-02-21 15:17:34.781+0000 INFO t20339 Progress.logIt: Progress: 975500 (99.92%) indexing '/AOSP'
2024-02-21 15:30:03.492+0000 INFO t20339 Progress.logIt: Progress: 975600 (99.93%) indexing '/AOSP'
2024-02-21 15:40:19.976+0000 INFO t20339 Progress.logIt: Progress: 975700 (99.94%) indexing '/AOSP'
2024-02-21 15:51:36.286+0000 INFO t20339 Progress.logIt: Progress: 975800 (99.95%) indexing '/AOSP'
2024-02-21 16:03:32.027+0000 INFO t20339 Progress.logIt: Progress: 975900 (99.96%) indexing '/AOSP'
2024-02-21 16:15:29.803+0000 INFO t20339 Progress.logIt: Progress: 976000 (99.97%) indexing '/AOSP'
...
2024-02-21 16:48:04.992+0000 INFO t1 Statistics.logIt: Done indexing data of all repositories (took 1 day 21:59:50)
2024-02-21 16:48:05.842+0000 INFO t1 Indexer.writeConfigToFile: Writing configuration to '/opengrok/etc/configuration.xml'
2024-02-21 16:48:06.317+0000 INFO t1 Indexer.writeConfigToFile: Done writing configuration to '/opengrok/etc/configuration.xml'
2024-02-21 16:48:06.317+0000 INFO t1 Statistics.logIt: Indexer finished (took 2 days 06:27:12)

It basically took ~15 minutes to process 100 files. The prstat -L output at that point shows the sole processing thread:

   PID USERNAME     SIZE      RSS STATE   PRI NICE      TIME    CPU LWPID PROCESS/LWPNAME
  1977 webservd   64703M   63526M cpu22     0   19  32:56:05 3.119%  7040 java/OpenGrok-ForkJoinPool-6878
  1977 webservd   64703M   63526M sleep     2   19   0:11:16 0.009%     7 java/G1 Young RemSet Sampling
   791 root      156492K   45644K sleep    59    0   3:20:39 0.006%     9 sstored
   260 root           0K       0K sleep    99  -20   0:11:56 0.003%    20 zpool-data/zio_write_issue
   791 root      156492K   45644K sleep    59    0   1:46:43 0.003%    12 sstored
     6 root           0K       0K sleep    99  -20   0:00:01 0.000%   110 zpool-rpool/zio_free_issue
     6 root           0K       0K sleep    99  -20   0:00:01 0.000%   109 zpool-rpool/zio_free_issue
     6 root           0K       0K sleep    99  -20   0:00:01 0.000%   108 zpool-rpool/zio_free_issue
     6 root           0K       0K sleep    99  -20   0:00:01 0.000%   107 zpool-rpool/zio_free_issue
...

This is a machine with 32 CPUs and 128GB RAM.

vladak commented 4 months ago

The IndexDatabase#indexParallel() implementation uses custom ForkJoin pool which some places do not recommend. There are some indications that the intended parallelism level does not have to be achieved in ForkJoin pool in the face of blocked I/O, which would be certainly the case of annotation cache generation as Git blame is I/O intensive operation.

Maybe it is worth clarifying why ForkJoin is actually used and experimenting with plain thread pool.