Me4791 opened this issue 4 years ago
When exactly did this start happening? Did anything change prior to that? Anything else interesting in the Tomcat logs? What JVM do you use?
An InternalError in a native Java class does not look like something we can address.
For reference, here is the piece of code that triggers it: https://github.com/oracle/opengrok/blob/748d69c7ce0c716a7535e75a3134190445030208/opengrok-indexer/src/main/java/org/opengrok/indexer/search/DirectoryExtraReader.java#L79
What do you mean by the continuous indexing?
Hello respected contributors, this issue has always been there with every version of OpenGrok, but it did not occur often. We are seeing it very often now after upgrading to version 1.3.3. Indexing continuously means: we index a source, and once that finishes we start indexing again immediately so that the latest changes in the source are available on OpenGrok as soon as possible. The source size is approximately 1450 GB at this time and it takes 3 days and a few hours to index. Hence we need to keep our indexing process running.
There is nothing in the Tomcat logs. I am attaching a screenshot with the JVM and server build info.
Thanks!
Not directly related, however you should consider switching to per-project indexing if you have many projects. 3 days is too much I think.
I agree with @vladak. Three days for an incremental indexing seems very high. Maybe the indexing being continuous has potential to interfere with searches (though I would think Lucene would protect from that).
@vladak has a suggestion above to speed up indexing.
What do the logs show happening during the three days of indexing?
Another web app issue related to "memory" (though not this InternalError) was solved by @vladak upgrading to Java 11. Not sure if that's worthwhile to try.
The logs show nothing unusual, just going through the indexing and writing out the files which aren't found. Previously we were doing per-project indexing, but that architecture doesn't fulfill our requirements. I will look into upgrading to Java 11. Thanks a lot!
@Me4791, if you're willing, would you run the following to extract some log rows of interest (with paths anonymized) and paste in a gist somewhere:
egrep 'HistoryGuru.createCache:|FileHistoryCache.store:|FileHistoryCache.finishStore:|Statistics.report: Done historycache|IndexDatabase.update: Starting traversal|Statistics.report: Done traversal|Statistics.report: Done indexing' opengrok0.0.log | perl -nle 'BEGIN { undef $/ }   # slurp the whole input at once
    my @p = m`(?<=[\x20\t])(/\S*)`gx;   # collect every absolute path in the matched rows
    for my $p (@p) {
        next if $s{$p};                 # anonymize each distinct path only once
        ++$i;
        s`\Q$p\E`/src/path$i`gx;        # replace it everywhere with /src/pathN
        $s{$p} = 1;
    }
    print;'
That will output as shown in the following example (ellipses are mine):
2020-06-11 13:02:23.354-0500 INFO t289 HistoryGuru.createCache: Creating historycache for /src/path1 (GitRepository) without renamed file handling
2020-06-11 13:02:23.354-0500 INFO t291 HistoryGuru.createCache: Creating historycache for /src/path2 (GitRepository) without renamed file handling
2020-06-11 13:02:23.354-0500 INFO t293 HistoryGuru.createCache: Creating historycache for /src/path3 (GitRepository) without renamed file handling
...
2020-06-11 13:02:23.376-0500 INFO t294 Statistics.report: Done historycache for /src/path8 (took 21 ms)
...
2020-06-11 13:02:47.334-0500 INFO t378 IndexDatabase.update: Starting traversal of directory /src/path40
...
2020-06-11 13:02:47.870-0500 INFO t378 Statistics.report: Done traversal of directory /src/path40 (took 536 ms)
...
2020-06-11 13:02:48.381-0500 INFO t378 Statistics.report: Done indexing of directory /src/path40 (took 510 ms)
...
2020-06-11 13:05:49.884-0500 INFO t1 Statistics.report: Done indexing data of all repositories (took 0:03:03)
Sure! Currently the indexing is in progress and I will post the output as soon as it is done. Thanks!
Question:
I start running the 'repo sync' command to fetch the source as soon as the indexing finishes, in the same src dir (in the same workspace). Does that cause any errors?
In other words, the indexing is done based on the src folder that was synced.
Now, if I sync the source again into the same src dir, it will pull the deltas, so the files will change; that causes the memory map to change as well. In this scenario, if a user searches for a file that has already changed in the new source while the indexes still point to the old source, the file won't be found at the same memory address.
I am not sure if I am able to explain it well.
Alternatively, do the indexes depend on the source code under the src folder while performing a search?
Thanks!
No, OpenGrok wouldn’t keep your source files open for long. Each is read as quickly as possible for indexing or for displaying matching lines during search — and then closed.
The memory-mapped files are almost certainly Lucene index files under data/.
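To make the memory-mapping point concrete, here is a minimal Java sketch (hypothetical index path, not OpenGrok code) showing that Lucene's FSDirectory.open() normally picks MMapDirectory on 64-bit JVMs, so the index files under data/ really are memory-mapped; an InternalError about unsafe memory access is what you would expect if a mapped segment file changes or vanishes underneath an open reader:

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MmapCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical path to one project's Lucene index under the OpenGrok data root.
        // FSDirectory.open() usually returns an MMapDirectory on 64-bit platforms,
        // i.e. the index files are memory-mapped into the JVM.
        Directory dir = FSDirectory.open(Paths.get("/var/opengrok/data/index/myproject"));
        System.out.println("Directory implementation: " + dir.getClass().getSimpleName());

        // If a mapped segment file is truncated or replaced on disk while a reader
        // still references it, touching those pages can surface as
        // java.lang.InternalError: a fault occurred in a recent unsafe memory access
        // operation in compiled Java code.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            System.out.println("Open index contains " + reader.numDocs() + " documents");
        }
    }
}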
@idodeclare, if I am understanding your point correctly, then the search has nothing to do with the source code under the src folder once it is indexed. Correct? Thanks!
To complete a search, matching source files are quickly re-read to show matching lines. For example an index search result might indicate a particular search term matched at offset 437 in a source file. The source file is read, and the line containing offset 437 is presented to the user.
If your repo sync has updated a source file since indexing, then OpenGrok notices that the timestamp doesn't match the metadata in the index search result, and OpenGrok will attempt an ad-hoc re-analysis of the source file to find matching lines. Such re-analysis uses the older OpenGrok highlighting code and not the Lucene unified highlighter.
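As a rough illustration of that re-read step, here is a hypothetical helper (a sketch only, assuming a plain UTF-8 file and simple offsets; this is not the OpenGrok implementation) that turns an offset from a search hit back into the line containing it:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OffsetToLine {
    // Hypothetical helper: return the line of text that contains the given offset,
    // e.g. a search hit reported at offset 437 in a source file.
    static String lineAtOffset(Path sourceFile, int offset) throws IOException {
        String content = new String(Files.readAllBytes(sourceFile), StandardCharsets.UTF_8);
        int end = content.indexOf('\n', offset);              // end of the containing line
        if (end == -1) {
            end = content.length();                           // hit is on the last line
        }
        int start = content.lastIndexOf('\n', offset) + 1;    // start of the containing line
        return content.substring(Math.min(start, end), end);  // guard against offset landing on '\n'
    }

    public static void main(String[] args) throws IOException {
        // Usage: java OffsetToLine /path/to/source/file 437
        System.out.println(lineAtOffset(Paths.get(args[0]), Integer.parseInt(args[1])));
    }
}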
Hi @Me4791, I hit a similar issue: "There was an error! a fault occurred in a recent unsafe memory access operation in compiled Java code java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code". Did you solve it by upgrading to Java 11?
Hello @kaychu2013, we still have that issue. This error may occur when someone tries to access files which have no reference in the memory map: the source changes while fetching deltas during a git sync, so the files change but the indexes and the memory map are still old. We are now indexing during the night when there is no traffic, followed by recycling Tomcat once the index finishes. Upgrading to Java 11 may help but we didn't try it. We are trying to move OpenGrok to a Kubernetes cluster. Thanks!
OpenGrok version: 1.3.3, Git version: 2.17, Tomcat version: 9
Description: We are getting the message "a fault occurred in a recent unsafe memory access operation in compiled java code" when trying to access a file under a project/repo. This occurs very often. Could you please help in resolving this one? The error goes away after a few hours to days, or sometimes we need to redeploy the web application.
Indexing process: We are indexing the source continuously in the same directory (ABC), using ABC as the workspace dir. ABC subdirectories: Data, Logs, src, etc.