oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

indexer of AOSP fails with OutOfMemoryError and other exceptions #2647

Closed brightchuh closed 5 years ago

brightchuh commented 5 years ago

Versions

Steps

, while other projects are OK!

Log

vladak commented 5 years ago

I see quite a few exceptions in the (huge) log, e.g.:

java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError
        at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
        at org.opengrok.indexer.index.IndexDatabase.indexParallel(IndexDatabase.java:1227)
        at org.opengrok.indexer.index.IndexDatabase.update(IndexDatabase.java:509)
        at org.opengrok.indexer.index.IndexDatabase$1.run(IndexDatabase.java:227)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.OutOfMemoryError
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
        at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1005)
        ... 8 more
Caused by: java.lang.OutOfMemoryError
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
        at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
        at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
        at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
        at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
        at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:496)
        at org.opengrok.indexer.index.IndexDatabase.lambda$indexParallel$2(IndexDatabase.java:1180)
        at java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1424)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
        at java.lang.StringBuffer.append(StringBuffer.java:270)
        at java.io.StringWriter.write(StringWriter.java:101)
...

and

Failed with unexpected RuntimeException
java.util.ConcurrentModificationException
        at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1558)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:747)
        at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:721)
        at java.util.stream.AbstractTask.compute(AbstractTask.java:316)
        at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
        at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
        at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:714)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.opengrok.indexer.index.PendingFileCompleter.completeRenamings(PendingFileCompleter.java:195)
        at org.opengrok.indexer.index.PendingFileCompleter.complete(PendingFileCompleter.java:171)
        at org.opengrok.indexer.index.IndexDatabase.finishWriting(IndexDatabase.java:1653)
        at org.opengrok.indexer.index.IndexDatabase.update(IndexDatabase.java:532)
        at org.opengrok.indexer.index.IndexDatabase$1.run(IndexDatabase.java:227)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Jan 30, 2019 11:16:01 PM org.opengrok.indexer.index.IndexDatabase$1 run
SEVERE: Problem updating lucene index database: 
java.util.ConcurrentModificationException
        at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1558)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:747)
        at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:721)
        at java.util.stream.AbstractTask.compute(AbstractTask.java:316)
        at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
        at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
        at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:714)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.opengrok.indexer.index.PendingFileCompleter.completeRenamings(PendingFileCompleter.java:195)
        at org.opengrok.indexer.index.PendingFileCompleter.complete(PendingFileCompleter.java:171)
        at org.opengrok.indexer.index.IndexDatabase.finishWriting(IndexDatabase.java:1653)
        at org.opengrok.indexer.index.IndexDatabase.update(IndexDatabase.java:532)
        at org.opengrok.indexer.index.IndexDatabase$1.run(IndexDatabase.java:227)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

First, you should solve the heap sizing problem by running the indexer with a larger heap; see https://github.com/oracle/opengrok/wiki/Tuning-for-large-code-bases#indexer

vladak commented 5 years ago

Also, how do you currently run the indexer?

I see that you are trying to index AOSP, which is definitely an example of a project that needs indexer tuning.

vladak commented 5 years ago

I renamed the synopsis because the absent xrefs are just a symptom, not the cause.

vladak commented 5 years ago

Also see https://github.com/oracle/opengrok/issues/2074#issuecomment-383083816

tulinkry commented 5 years ago

The ConcurrentModificationException looks like a problem on our side.

vladak commented 5 years ago

Hard to say. It might be related to improper handling of the unchecked exception.
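
For readers unfamiliar with the frame `HashMap$KeySpliterator.forEachRemaining` in the trace above, here is a minimal sketch of how that frame can throw in general: the map is structurally modified while a stream over its key set is being traversed. This is only an illustration, not a diagnosis of `PendingFileCompleter`; the class name `CmeDemo` and the keys are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; it does not reproduce OpenGrok's code path.
class CmeDemo {
    /** Returns true if adding a key while streaming the key set raised a CME. */
    static boolean triggersCme() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            // The first put of "extra" is a structural modification made
            // during traversal; later puts only replace the value.
            map.keySet().stream().forEach(k -> map.put("extra", 0));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME raised: " + triggersCme());
    }
}
```

The same check fires when another thread modifies a plain `HashMap` during a parallel stream, which is why such maps are usually replaced by a concurrent collection or copied before traversal.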

brightchuh commented 5 years ago

Also, how do you currently run the indexer?

I see that you are trying to index AOSP, which is definitely an example of a project that needs indexer tuning.

Yes, AOSP is one of the projects under the source root.

Index command: java -Xmx2048m -jar $OPENGROK/lib/opengrok.jar -s $CODE -d $OPENGROK/data -R $OPENGROK/etc/ro_configuration.xml -W $OPENGROK/etc/configuration.xml -U http://localhost:8080/opengrok -H -P -S -G -v -t 4 2>&1 | tee $OPENGROK/log/index.log

vladak commented 5 years ago

As mentioned before, a 2 GiB heap is really just too small.
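
When changing a flag such as the `-Xmx2048m` in the command above, it can help to confirm the limit the JVM actually received. A minimal sketch, assuming a hypothetical helper class `HeapCheck` run with the same JVM options as the indexer; the reported value can be slightly lower than the `-Xmx` setting, depending on the garbage collector.

```java
// Hypothetical helper, not part of OpenGrok.
class HeapCheck {
    /** Returns the maximum heap the running JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once with `-Xmx2048m` and once with the larger value shows whether the new setting took effect.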

vladak commented 5 years ago

I'd suggest removing the data for the affected project and reindexing.

vladak commented 5 years ago

I am curious whether the ConcurrentModificationException goes away once the indexer completes without OutOfMemoryError.

brightchuh commented 5 years ago

Sorry, I cannot give you the answer right now; I'm on a long leave. Another question: my computer has 4 GB of RAM. Could I increase the swap space to meet OpenGrok's memory needs?

vladak commented 5 years ago

Possibly; however, this will make the indexer, and likely the webapp, really slow unless you have very fast I/O.

vladak commented 5 years ago

Any luck with re-running the indexer with more heap?

brightchuh commented 5 years ago

Still on vacation.

brightchuh commented 5 years ago

Any luck with re-running the indexer with more heap?

@vladak Good news! After I increased the Java heap size limit to 12G, the indexer runs normally. Increasing the swap size really helps, but the I/O is very slow. It took about 5 days to complete the initial indexing.

vladak commented 5 years ago

Do you still see the other exceptions?

brightchuh commented 5 years ago

No OutOfMemoryError, but lots of IOExceptions.

log.gz

vladak commented 5 years ago

These are due to an inability to run git for some reason, e.g.:

SEVERE: Failed to read from process: git
java.io.IOException: Cannot run program "git" (in directory "/work/code/nanomsg"): error=2, 没有那个文件或目录 [No such file or directory]
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
        at org.opengrok.indexer.util.Executor.exec(Executor.java:174)
        at org.opengrok.indexer.util.Executor.exec(Executor.java:134)
        at org.opengrok.indexer.util.Executor.exec(Executor.java:123)
        at org.opengrok.indexer.history.GitRepository.buildTagList(GitRepository.java:599)
        at org.opengrok.indexer.history.RepositoryFactory.getRepository(RepositoryFactory.java:161)
        at org.opengrok.indexer.history.RepositoryFactory.getRepository(RepositoryFactory.java:84)
        at org.opengrok.indexer.history.HistoryGuru.addRepositories(HistoryGuru.java:393)
        at org.opengrok.indexer.history.HistoryGuru.addRepositories(HistoryGuru.java:411)
        at org.opengrok.indexer.history.HistoryGuru.addRepositories(HistoryGuru.java:466)
        at org.opengrok.indexer.configuration.RuntimeEnvironment.setRepositories(RuntimeEnvironment.java:759)
        at org.opengrok.indexer.index.Indexer.prepareIndexer(Indexer.java:998)
        at org.opengrok.indexer.index.Indexer.main(Indexer.java:319)
Caused by: java.io.IOException: error=2, 没有那个文件或目录 [No such file or directory]
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
        at java.lang.ProcessImpl.start(ProcessImpl.java:134)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        ... 12 more

I cannot decipher the error strings. Might be some permissions/environment problem.

vladak commented 5 years ago

errno 2 is usually ENOENT, so the git program is either not installed or not present in the PATH.
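
One way to check for this condition before starting a long indexing run is to look for the program in the `PATH` entries the JVM sees. The following is a rough sketch under that assumption; the class `PathLookup` and the method `findOnPath` are hypothetical names, not OpenGrok APIs.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of OpenGrok.
class PathLookup {
    /** Returns the first executable file named `program` found in PATH, if any. */
    static Optional<Path> findOnPath(String program) {
        String pathVar = System.getenv("PATH");
        if (pathVar == null) {
            return Optional.empty();
        }
        for (String dir : pathVar.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            try {
                Path candidate = Paths.get(dir, program);
                if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // ignore malformed PATH entries
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> git = findOnPath("git");
        System.out.println(git.map(p -> "git found at " + p)
                .orElse("git not found in PATH"));
    }
}
```

The check must run as the same user and in the same environment as the indexer, since `PATH` can differ between an interactive shell and a service or cron job.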

brightchuh commented 5 years ago

Yes, you're right! I forgot to install git on my computer. Sigh... So this issue can be closed. Thanks a lot for your support!

vladak commented 5 years ago

You're welcome.