oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/
Other
4.29k stars 739 forks source link

improve suggester throughput #4539

Closed vladak closed 4 months ago

vladak commented 5 months ago

This change addresses 2 problems with suggester:

The locking part of the change is done with the assumption that the lock object based synchronization in the Suggester class was primarily used to protect the insertions/queries to the projectData map. I addressed this using computeIfAbsent which is atomic for the ConcurrentHashMap used.

The rebuild execution part of the change was done by introducing a thread pool that is reused for the init/rebuild operations.

In general this should improve the throughput of the suggester.

While there, I added tunable for the search pool parallelism level and addressed a few nits in the code.

Also, changed the latency meters to record per suggester data times rather than aggregates as the latter makes limited sense, I think.

vladak commented 5 months ago

https://github.com/oracle/opengrok/wiki/Suggester will have to be updated with the new tunable.

vladak commented 5 months ago

Tested by running per-project indexing workflow (via opengrok-sync), checked there are some rebuild threads running, waited for some of them to complete and then requested rebuild of each project separately:

curl -s http://localhost:8080/source/api/v1/projects | jq '.[]' | sed 's/"//g' | while read proj; do curl -X PUT http://localhost:8080/source/api/v1/suggest/rebuild/$proj; done

and again rechecked the running threads using prstat -L -m and jstack. Here's the prstat output (the thread names are truncated so incomplete thread id is displayed):

   PID USERNAME   USR   SYS   TRP   TFL   DFL   LCK   SLP   LAT   VCX   ICX   SCL   SIG LWPID PROCESS/LWPNAME
 21563 webservd 94.90 0.318 0.014 0.000 1.680 3.026 0.000 0.063   104    33   470    22 12830 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 93.00 0.333 0.015 0.000 3.548 3.043 0.000 0.064    95    33   491    21 12847 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 90.55 0.656 0.015 0.000 5.665 3.036 0.000 0.075   119    31   682    23 12829 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 88.81 0.429 0.016 0.000 7.613 3.046 0.000 0.086   109    32   148    24 12823 java/OpenGrok-suggester-rebuild-1207
 21563 webservd 86.84 1.258 0.027 0.000 8.732 3.041 0.000 0.098   151    32  1815    22 12838 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 37.83 1.321 0.007 0.000 7.142 52.27 1.334 0.091   150    18  1137    18 12831 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 36.11 2.198 0.008 0.000 7.996 52.86 0.636 0.196   384    14  1556    16 12817 java/OpenGrok-suggester-rebuild-1207
 21563 webservd 36.96 1.334 0.007 0.000 6.035 53.13 2.170 0.364   685    17  1294     9 12841 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 34.63 3.024 0.011 0.000 9.124 52.34 0.736 0.139   229    16  2394    15 12822 java/OpenGrok-suggester-rebuild-1207
 21563 webservd 34.09 2.371 0.007 0.000 7.062 53.55 2.530 0.383   703    15  1811     9 12828 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 32.05 3.065 0.011 0.000 6.470 53.75 4.279 0.384   687    17  2120    10 12842 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 32.46 2.555 0.011 0.000 9.073 55.20 0.528 0.169   257    16  1633    13 12832 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 30.42 3.487 0.007 0.000 8.334 54.03 3.434 0.292   524    14  2143     7 12825 java/OpenGrok-suggester-rebuild-1207
 21563 webservd 30.98 2.364 0.007 0.000 8.261 55.35 2.704 0.327   579    17  1593     6 12852 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 30.78 2.364 0.008 0.000 8.191 55.34 2.999 0.325   576    17  1537     6 12835 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 27.06 4.833 0.007 0.000 8.399 54.94 4.401 0.364   622    16  2649     6 12833 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 26.04 4.663 0.008 0.000 8.196 55.90 4.860 0.332   612    14  2566     9 12844 java/OpenGrok-suggester-rebuild-1209
 21563 webservd 21.91 1.479 0.006 0.000 6.158 70.33 0.059 0.053    96     9  1315     4 12827 java/OpenGrok-suggester-rebuild-1208
 21563 webservd 7.490 1.846 0.003 0.000 0.981 88.49 1.028 0.158   293     4  1333     8 12845 java/OpenGrok-suggester-rebuild-1209
...

As for the jstack output, it is important that there multiple rebuild threads in RUNNABLE state with stacks like this:

"OpenGrok-suggester-rebuild-12077" #12077 prio=5 os_prio=64 cpu=963690.72ms elapsed=1704.64s tid=0x0000000007bf8000 nid=0x3217 runnable  [0x00007fff535fd000]
   java.lang.Thread.State: RUNNABLE
        at java.nio.ByteBuffer.get(java.base@11.0.7-internal/ByteBuffer.java:737)
        at java.nio.DirectByteBuffer.get(java.base@11.0.7-internal/DirectByteBuffer.java:319)
        at org.apache.lucene.store.ByteBufferGuard.getBytes(ByteBufferGuard.java:94)
        at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:138)
        at org.apache.lucene.util.compress.LZ4.decompress(LZ4.java:106)
        at org.apache.lucene.codecs.lucene90.blocktree.CompressionAlgorithm$3.read(CompressionAlgorithm.java:45)
        at org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnumFrame.loadBlock(SegmentTermsEnumFrame.java:182)
        at org.apache.lucene.codecs.lucene90.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:539)
        at org.apache.lucene.index.LeafReader.docFreq(LeafReader.java:81)
        at org.apache.lucene.index.BaseCompositeReader.docFreq(BaseCompositeReader.java:197)
        at org.opengrok.suggest.SuggesterUtils.computeNormalizedDocumentFrequency(SuggesterUtils.java:108)
        at org.opengrok.suggest.SuggesterUtils.computeScore(SuggesterUtils.java:97)
        at org.opengrok.suggest.SuggesterProjectData$WFSTInputIterator.weight(SuggesterProjectData.java:638)
        at org.apache.lucene.search.suggest.SortedInputIterator.sort(SortedInputIterator.java:190)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:80)
        at org.apache.lucene.search.suggest.SortedInputIterator.<init>(SortedInputIterator.java:64)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup$WFSTInputIterator.<init>(WFSTCompletionLookup.java:272)
        at org.apache.lucene.search.suggest.fst.WFSTCompletionLookup.build(WFSTCompletionLookup.java:109)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:281)
        at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:268)
        at org.opengrok.suggest.SuggesterProjectData.rebuild(SuggesterProjectData.java:244)
        at org.opengrok.suggest.Suggester.lambda$getRebuildRunnable$5(Suggester.java:428)
        at org.opengrok.suggest.Suggester$$Lambda$4928/0x00000008015da440.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.7-internal/Executors.java:515)
        at java.util.concurrent.FutureTask.run(java.base@11.0.7-internal/FutureTask.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7-internal/ThreadPoolExecutor.java:1128)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7-internal/ThreadPoolExecutor.java:628)
        at java.lang.Thread.run(java.base@11.0.7-internal/Thread.java:834)

The projects for which the rebuild is already running will actually be serialized on the write lock in SuggesterProjectData#isInitialized() (newly added method):

"ForkJoinPool.commonPool-worker-53" #12424 daemon prio=5 os_prio=64 cpu=0.53ms elapsed=47.20s tid=0x0000000003770000 nid=0x33c0 waiting on condition  [0x00007fff3cafd000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.7-internal/Native Method)
        - parking to wait for  <0x0000000083f012a0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(java.base@11.0.7-internal/LockSupport.java:194)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.7-internal/AbstractQueuedSynchronizer.java:885)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.7-internal/AbstractQueuedSynchronizer.java:917)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.7-internal/AbstractQueuedSynchronizer.java:1240)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(java.base@11.0.7-internal/ReentrantReadWriteLock.java:959)
        at org.opengrok.suggest.SuggesterProjectData.isInitialized(SuggesterProjectData.java:257)
        at org.opengrok.suggest.Suggester.rebuild(Suggester.java:369)
        at org.opengrok.web.api.v1.suggester.provider.service.impl.SuggesterServiceImpl.rebuild(SuggesterServiceImpl.java:214)
        at org.opengrok.web.api.v1.controller.SuggesterController.lambda$rebuild$1(SuggesterController.java:185)
        at org.opengrok.web.api.v1.controller.SuggesterController$$Lambda$4951/0x00000008015d0840.run(Unknown Source)
        at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@11.0.7-internal/CompletableFuture.java:1736)
        at java.util.concurrent.CompletableFuture$AsyncRun.exec(java.base@11.0.7-internal/CompletableFuture.java:1728)
        at java.util.concurrent.ForkJoinTask.doExec(java.base@11.0.7-internal/ForkJoinTask.java:290)
        at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base@11.0.7-internal/ForkJoinPool.java:1020)
        at java.util.concurrent.ForkJoinPool.scan(java.base@11.0.7-internal/ForkJoinPool.java:1656)
        at java.util.concurrent.ForkJoinPool.runWorker(java.base@11.0.7-internal/ForkJoinPool.java:1594)
        at java.util.concurrent.ForkJoinWorkerThread.run(java.base@11.0.7-internal/ForkJoinWorkerThread.java:177)