Me4791 opened this issue 4 years ago
When exactly did this start happening? Did anything change prior to that? Anything else interesting in the Tomcat logs? What JVM do you use?
An InternalError in a native Java class does not look like something we can address.
For reference, here is the piece of code that triggers it: https://github.com/oracle/opengrok/blob/748d69c7ce0c716a7535e75a3134190445030208/opengrok-indexer/src/main/java/org/opengrok/indexer/search/DirectoryExtraReader.java#L79
What do you mean by the continuous indexing?
Hello respected contributors, this issue has always been there with every version of OpenGrok, but it did not occur often. We are seeing it very often now after upgrading to version 1.3.3. Indexing continuously means: we index a source, and once that finishes we start indexing again immediately so that the latest changes in the source are available on OpenGrok as soon as possible. The source size is approximately 1450 GB at this time and it takes 3 days and a few hours to index. Hence we need to keep our indexing process running.
There is nothing in the Tomcat logs. I am attaching a screenshot with the JVM and server build info.
Thanks!
Not directly related, however you should consider switching to per-project indexing if you have many projects. 3 days is too much I think.
I agree with @vladak. Three days for an incremental indexing seems very high. Maybe the indexing being continuous has potential to interfere with searches (though I would think Lucene would protect from that).
@vladak has a suggestion above to speed up indexing.
What do the logs show happening during the three days of indexing?
Another web app issue related to "memory" (though not this InternalError) was solved by @vladak upgrading to Java 11. Not sure if that's worthwhile to try.
The logs show nothing unusual, just going through the indexing and writing out the files which aren't found. Previously we were doing per-project indexing, but that architecture doesn't fulfill our requirements. I will look into upgrading to Java 11. Thanks a lot!
@Me4791, if you're willing, would you run the following to extract some log rows of interest (with paths anonymized) and paste in a gist somewhere:
egrep 'HistoryGuru.createCache:|FileHistoryCache.store:|FileHistoryCache.finishStore:|Statistics.report: Done historycache|IndexDatabase.update: Starting traversal|Statistics.report: Done traversal|Statistics.report: Done indexing' opengrok0.0.log | perl -nle 'BEGIN { undef $/ }   # slurp the whole input at once
    my @p = m`(?<=[\x20\t])(/\S*)`gx;   # collect every absolute path in the matched rows
    for my $p (@p) {
        next if $s{$p};                 # anonymize each distinct path only once
        ++$i;
        s`\Q$p\E`/src/path$i`gx;        # replace it everywhere with /src/pathN
        $s{$p} = 1;
    }
    print;'
That will output as shown in the following example (ellipses are mine):
2020-06-11 13:02:23.354-0500 INFO t289 HistoryGuru.createCache: Creating historycache for /src/path1 (GitRepository) without renamed file handling
2020-06-11 13:02:23.354-0500 INFO t291 HistoryGuru.createCache: Creating historycache for /src/path2 (GitRepository) without renamed file handling
2020-06-11 13:02:23.354-0500 INFO t293 HistoryGuru.createCache: Creating historycache for /src/path3 (GitRepository) without renamed file handling
...
2020-06-11 13:02:23.376-0500 INFO t294 Statistics.report: Done historycache for /src/path8 (took 21 ms)
...
2020-06-11 13:02:47.334-0500 INFO t378 IndexDatabase.update: Starting traversal of directory /src/path40
...
2020-06-11 13:02:47.870-0500 INFO t378 Statistics.report: Done traversal of directory /src/path40 (took 536 ms)
...
2020-06-11 13:02:48.381-0500 INFO t378 Statistics.report: Done indexing of directory /src/path40 (took 510 ms)
...
2020-06-11 13:05:49.884-0500 INFO t1 Statistics.report: Done indexing data of all repositories (took 0:03:03)
Sure! Currently the indexing is in progress and I will post the output as soon as it is done. Thanks!
Question:
I start running the 'repo sync' command to fetch the source as soon as the indexing finishes, in the same src dir (in the same workspace). Does that cause any errors?
In other words, the indexing is done based on the src folder that was synced.
Now, if I sync the source again into the same src dir, it will pull the deltas, so the files will change; that causes the memory map to change as well. In this scenario, if a user searches for a file that has already changed in the new source while the indexes still point to the old source, the file won't be found at the same memory address.
I am not sure if I am able to explain it well.
Alternatively, do the indexes depend on the source code under the src folder while performing a search?
Thanks!
No, OpenGrok wouldn’t keep your source files open for long. Each is read as quickly as possible for indexing or for displaying matching lines during search — and then closed.
The memory-mapped files are almost certainly Lucene index files under data/.
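To make the memory-mapping point concrete, here is a minimal Java sketch (hypothetical index path, not OpenGrok code) showing that Lucene's FSDirectory.open() normally picks MMapDirectory on 64-bit JVMs, so the index files under data/ really are memory-mapped; an InternalError about unsafe memory access is what you would expect if a mapped segment file changes or vanishes underneath an open reader:

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MmapCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical path to one project's Lucene index under the OpenGrok data root.
        // FSDirectory.open() usually returns an MMapDirectory on 64-bit platforms,
        // i.e. the index files are memory-mapped into the JVM.
        Directory dir = FSDirectory.open(Paths.get("/var/opengrok/data/index/myproject"));
        System.out.println("Directory implementation: " + dir.getClass().getSimpleName());

        // If a mapped segment file is truncated or replaced on disk while a reader
        // still references it, touching those pages can surface as
        // java.lang.InternalError: a fault occurred in a recent unsafe memory access
        // operation in compiled Java code.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            System.out.println("Open index contains " + reader.numDocs() + " documents");
        }
    }
}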
@idodeclare, if I am understanding your point correctly, then the search has nothing to do with the source code under the src folder once it is indexed. Correct? Thanks!
To complete a search, matching source files are quickly re-read to show matching lines. For example an index search result might indicate a particular search term matched at offset 437 in a source file. The source file is read, and the line containing offset 437 is presented to the user.
If your repo sync has updated a source file since indexing, then OpenGrok notices that the timestamp doesn't match the metadata in the index search result, and OpenGrok will attempt an ad-hoc re-analysis of the source file to find matching lines. Such re-analysis uses the older OpenGrok highlighting code and not the Lucene unified highlighter.
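As a rough illustration of that re-read step, here is a hypothetical helper (a sketch only, assuming a plain UTF-8 file and simple offsets; this is not the OpenGrok implementation) that turns an offset from a search hit back into the line containing it:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OffsetToLine {
    // Hypothetical helper: return the line of text that contains the given offset,
    // e.g. a search hit reported at offset 437 in a source file.
    static String lineAtOffset(Path sourceFile, int offset) throws IOException {
        String content = new String(Files.readAllBytes(sourceFile), StandardCharsets.UTF_8);
        int end = content.indexOf('\n', offset);              // end of the containing line
        if (end == -1) {
            end = content.length();                           // hit is on the last line
        }
        int start = content.lastIndexOf('\n', offset) + 1;    // start of the containing line
        return content.substring(Math.min(start, end), end);  // guard against offset landing on '\n'
    }

    public static void main(String[] args) throws IOException {
        // Usage: java OffsetToLine /path/to/source/file 437
        System.out.println(lineAtOffset(Paths.get(args[0]), Integer.parseInt(args[1])));
    }
}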
Hi @Me4791, I hit a similar issue: "There was an error! a fault occurred in a recent unsafe memory access operation in compiled Java code java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code". Did you solve it by upgrading to Java 11?
Hello @kaychu2013, we still have that issue. This error may occur when someone tries to access files which have no reference in the memory map: the source changes while fetching deltas during a git sync, so the files change but the indexes and the memory map are still old. We are now indexing during the night when there is no traffic, followed by recycling Tomcat once the index finishes. Upgrading to Java 11 may help but we didn't try it. We are trying to move OpenGrok to a Kubernetes cluster. Thanks!
OpenGrok version: 1.3.3, Git version: 2.17, Tomcat version: 9
Description: We are getting the message "a fault occurred in a recent unsafe memory access operation in compiled java code" when trying to access a file under a project/repo. This occurs very often. Could you please help in resolving this one? The error goes away after a few hours to days, or sometimes we need to redeploy the web application.
Indexing process: We are indexing the source continuously in the same directory (ABC), using ABC as the workspace dir. ABC subdirectories: Data, Logs, src, etc.