oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

Failed to get xref file #3444

Closed chkp-baselz closed 3 years ago

chkp-baselz commented 3 years ago

Hey :)

After indexing with this command:

java -Djava.util.logging.config.file=/opengrok/etc/logging.properties -jar /opengrok/dist/lib/opengrok.jar --ctags /usr/local/bin/ctags -s /opengrok/src -d /opengrok/data -H -P -S -G -W /opengrok/etc/configuration.xml -U http://localhost:8080/source/

some files can't be opened in the web app, although they exist in the src folder and are readable. (image)

I can't find anything unusual in the log, but when I checked catalina.out I found this for almost all projects:

SEVERE [ForkJoinPool-331-worker-19] org.opengrok.suggest.Suggester.lambda$getInitRunnable$1 Could not initialize suggester data for aftershock
java.io.FileNotFoundException: /opengrok/data/suggester/aftershock/defs.wfst (Permission denied)
    at java.base/java.io.FileOutputStream.open0(Native Method)
    at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:187)
    at org.opengrok.suggest.SuggesterProjectData.store(SuggesterProjectData.java:273)
    at org.opengrok.suggest.SuggesterProjectData.build(SuggesterProjectData.java:250)
    at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:154)
    at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:229)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1407)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

Appreciate your help,
Basel

chkp-baselz commented 3 years ago

OpenGrok version 1-5-11

vladak commented 3 years ago

Does the cpsdk.cpp.gz file exist under the appropriate directory under /opengrok/data/xref/?

vladak commented 3 years ago

In general the error message should be changed to something more helpful.

chkp-baselz commented 3 years ago

@vladak yes, it exists:

ls /opengrok/data/xref/ignis_main/tor_ssm_sdk/mellanox/ | grep cpsdk.cpp.gz

cpsdk.cpp.gz
cpsdk.cpp.gz.org_opengrok

vladak commented 3 years ago

The file with the org_opengrok suffix hints at a problem, I think. Normally this file is only used while moving the temporary file created during indexing. What are the sizes of these files?

chkp-baselz commented 3 years ago

image

I ran the indexer tonight, so should the file with the org_opengrok suffix have been indexed?

vladak commented 3 years ago

The file with the org_opengrok suffix is a leftover that should not really be there (assuming that the indexer finished). Normally the file is used as a temporary file during indexing: the new xref contents are written to it and then it is moved to the original position. Check the indexer log for any signs of related trouble.
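To see how widespread the leftovers are, something like the following sketch could be used. It assumes the `.org_opengrok` suffix shown in the listing above and uses `/opengrok/data/xref` only as an example default; `list_leftovers` is a hypothetical helper name, not part of OpenGrok.

```shell
#!/usr/bin/env bash
# List leftover temporary xref files under an xref directory.
# Assumption: leftovers carry the ".org_opengrok" suffix seen in this thread.
list_leftovers() {
    local xref_root="${1:-/opengrok/data/xref}"
    # -name matches the suffix literally, so a name such as
    # "xorg_opengrok" (no dot) is not counted by mistake.
    find "$xref_root" -type f -name '*.org_opengrok'
}

# Example: count the leftovers
#   list_leftovers /opengrok/data/xref | wc -l
```

Using `find -name` rather than `ls -R | grep` keeps the full path of each file, which helps when mapping a leftover back to its source file.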

vladak commented 3 years ago

Also, try tracing system calls of the Tomcat (I assume) process to see what file it is actually trying to read and what is the outcome.

chkp-baselz commented 3 years ago

@vladak I can't find anything else but this: (image). This file also fails with "Failed to get xref file": (image)

vladak commented 3 years ago

reminds me of #3397

vladak commented 3 years ago

Assuming this is Linux, could you try tracing the Tomcat process with something like

sudo strace -e openat,stat -f -p <TOMCAT_PID>

then try to load the xref page for the file in trouble and see what strace says about the .gz file.

vladak commented 3 years ago

Looking into the source, list.jsp uses checkFile() and the error message is used when it returns null:
https://github.com/oracle/opengrok/blob/ea52985050d66cb32f7198addc401bcc9f2ba4e6/opengrok-web/src/main/webapp/list.jsp#L450-L461

This can happen not only if the file is missing but also when the last modification time of the xref file is older than the last modification time of the source file:
https://github.com/oracle/opengrok/blob/ea52985050d66cb32f7198addc401bcc9f2ba4e6/opengrok-web/src/main/java/org/opengrok/web/PageConfig.java#L1212-L1227

I.e. it will refuse to print the xref if it thinks it is stale, so check the times of the files with e.g. the stat command. The existence of the org_opengrok suffixed file explains this, I think: the indexer failed to move the temporary file with the new xref over the original xref and hence the xref is not fresh anymore.

chkp-baselz commented 3 years ago

@vladak here's the trace output from when I tried to access the file: opengrok trace.txt

chkp-baselz commented 3 years ago

@vladak would deleting all the files that have the org_opengrok suffix and indexing again help? Or is it better to wipe the data folder and re-index? What do you suggest?

vladak commented 3 years ago

> opengrok trace.txt @vladak here's the trace output once tried to access the file

Strangely, I don't see any open/stat calls for any file under the data root (/opengrok/data). Maybe try reloading the page with Ctrl+Shift+R.

vladak commented 3 years ago

To recover, just touch the affected files under the source root and reindex; hopefully it will not hit the problem with moving the temporary file this time. No data wipe is necessary.

chkp-baselz commented 3 years ago

@vladak Ctrl+Shift+R didn't help. What do you mean by touching it?

ls -R /opengrok/data/xref/ | grep .org_opengrok | wc -l
793793

I have 793793 such files.

chkp-baselz commented 3 years ago

@vladak I run the indexer on a daily basis, so if all I need to do is run it again, I don't think that will work.

vladak commented 3 years ago

> @vladak Ctrl+Shift+R didn't help

I meant that for the purpose of gathering the system call trace.

> What do you mean by touching it?
> ls -R /opengrok/data/xref/ | grep .org_opengrok | wc -l
> 793793
> I have 793793 files

I meant this:

touch /opengrok/src/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp

What will happen is that the next time the indexer runs, it will see that the time stamp of the cpsdk.cpp file has changed (compared to the time stamp stored in the index). This should regenerate the xref file, so the next time the web app tries to display the xref, the timestamp comparison will check out.

vladak commented 3 years ago

Before you do that just run:

stat /opengrok/src/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp /opengrok/data/xref/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp.gz

and observe the last modified time stamps.

chkp-baselz commented 3 years ago

  File: /opengrok/src/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp
  Size: 561647          Blocks: 1104       IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 199308388   Links: 1
Access: (0757/-rwxr-xrwx)  Uid: ( 1000/ builder)   Gid: ( 1002/      fw)
Context: unconfined_u:object_r:default_t:s0
Access: 2021-03-02 05:06:45.032539505 +0200
Modify: 2021-02-26 04:01:32.452393000 +0200
Change: 2021-03-02 04:32:55.135630007 +0200
 Birth: -
  File: /opengrok/data/xref/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp.gz
  Size: 362329          Blocks: 712        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 2478871112  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/ builder)   Gid: (    0/    root)
Context: unconfined_u:object_r:default_t:s0
Access: 2021-02-10 19:26:48.037790757 +0200
Modify: 2021-02-10 19:26:48.325790746 +0200
Change: 2021-02-10 23:39:08.616249287 +0200
 Birth: -

vladak commented 3 years ago

Yep, this is the problem. W.r.t. the last modified time, /opengrok/src/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp is newer than /opengrok/data/xref/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp.gz, and therefore checkFile() considers the xref stale and returns null, which leads to the error message in the web app.
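The same comparison can be approximated from the shell. This is only an illustration of the rule described here (an xref older than its source is treated as stale), not the web app's actual code; `xref_is_stale` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Return success (0) when the xref would be considered stale or is missing.
# Illustration of the modification-time rule only, not OpenGrok code.
xref_is_stale() {
    local src="$1" xref="$2"
    # A missing xref is reported as stale as well.
    if [ ! -f "$xref" ]; then
        return 0
    fi
    # -ot is true when the xref's modification time is older than the source's.
    [ "$xref" -ot "$src" ]
}

# Example:
#   xref_is_stale /opengrok/src/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp \
#       /opengrok/data/xref/ignis_main/tor_ssm_sdk/mellanox/cpsdk.cpp.gz && echo stale
```

With the stat output above (source modified 2021-02-26, xref modified 2021-02-10), this check would report the xref as stale.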

chkp-baselz commented 3 years ago

@vladak so now should I touch all damaged files?

Any idea how this happened? Maybe from stopping the indexer once?

vladak commented 3 years ago

What should really happen in the web app is that once the staleness is detected, the web app should try to regenerate the xref on the fly (it is totally capable of that already). Normally I'd consider this bug a duplicate of #3397; however, I will keep it open because of this.

vladak commented 3 years ago

> @vladak so now should I touch all damaged files?

yep. in source root, that is. and reindex.

> Any idea how this happened? maybe stopping index once?

Could be; however, I suspect there may be something else at play here, given that this was also reported in #3397. The stale xref is just a symptom of a potential problem in PendingFileCompleter (or nearby code) or its assumptions.

chkp-baselz commented 3 years ago

Hey @vladak

Touching the file and re-indexing didn't resolve the issue; it is still the same.

output before and after touching the file image

output of stat after re-indexing image image

Thank you

vladak commented 3 years ago

Try removing the cpsdk.cpp.gz* files and reindex.

chkp-baselz commented 3 years ago

@vladak to index this specific project, I should run this command, right?

java -Djava.util.logging.config.file=/opengrok/etc/logging.properties -jar /opengrok/dist/lib/opengrok.jar --ctags /usr/local/bin/ctags -s /opengrok/src -d /opengrok/data -H -P -S -G -W /opengrok/etc/configuration.xml -U http://localhost:8080/source/ ignis_main

vladak commented 3 years ago

For per-project indexing, avoid -W and -S, and use -R /opengrok/etc/configuration.xml.

chkp-baselz commented 3 years ago

@vladak so it's

java -Djava.util.logging.config.file=/opengrok/etc/logging.properties -jar /opengrok/dist/lib/opengrok.jar --ctags /usr/local/bin/ctags -s /opengrok/src -d /opengrok/data -H -P -G -R /opengrok/etc/configuration.xml -U http://localhost:8080/source/ ignis_main

right?

vladak commented 3 years ago

That should work. It will be logging to the same place as the original indexer, though. Normally one uses a separate logging.properties file that stores the per-project index log in a separate file.
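One way to get such a separate file is to derive it from the main one. The sketch below assumes the main logging.properties sets the standard `java.util.logging.FileHandler.pattern` property; the helper name `make_project_logging`, the log directory layout, and the output file name are hypothetical.

```shell
#!/usr/bin/env bash
# Derive a per-project logging.properties from the main one by pointing the
# FileHandler at a project-specific directory.
# Assumption: the main file contains a java.util.logging.FileHandler.pattern
# line; paths must not contain "|" or "&" (they would confuse sed).
make_project_logging() {
    local main_props="$1" project="$2" log_root="$3" out="$4"
    mkdir -p "$log_root/$project"
    sed "s|^java\.util\.logging\.FileHandler\.pattern[[:space:]]*=.*|java.util.logging.FileHandler.pattern = $log_root/$project/opengrok%g.%u.log|" \
        "$main_props" > "$out"
}

# Example:
#   make_project_logging /opengrok/etc/logging.properties ignis_main \
#       /opengrok/log /opengrok/etc/logging-ignis_main.properties
```

The per-project indexer run would then pass the generated file via -Djava.util.logging.config.file instead of the main one.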

chkp-baselz commented 3 years ago

@vladak that worked and the file is shown now, thanks :)

So what do you suggest doing for the rest of the files? Should I touch all the affected source files, remove both .gz files for each of them, and reindex?

vladak commented 3 years ago

Yep, these files should receive the same treatment.
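With hundreds of thousands of leftovers, the treatment can be scripted. The following is a sketch under the assumption, taken from the paths in this thread, that the xref tree mirrors the source tree with a `.gz` suffix; `repair_stale_xrefs` is a hypothetical helper, and it only prints what it would do unless "apply" is passed.

```shell
#!/usr/bin/env bash
# For every leftover "*.gz.org_opengrok" file under the xref directory:
# touch the matching source file and remove the xref plus its leftover.
# Prints the planned actions unless the third argument is "apply".
repair_stale_xrefs() {
    local src_root="${1%/}" xref_root="${2%/}" mode="${3:-dry-run}"
    local leftover xref rel src
    find "$xref_root" -type f -name '*.gz.org_opengrok' -print0 |
    while IFS= read -r -d '' leftover; do
        xref="${leftover%.org_opengrok}"   # e.g. .../cpsdk.cpp.gz
        rel="${xref#"$xref_root"/}"        # path relative to the xref root
        src="$src_root/${rel%.gz}"         # counterpart under the source root
        [ -f "$src" ] || continue          # skip when the source file is gone
        if [ "$mode" = apply ]; then
            touch "$src"
            rm -f "$xref" "$leftover"
        else
            printf 'would touch %s and remove %s*\n' "$src" "$xref"
        fi
    done
}

# Example (dry run first, then apply, then reindex):
#   repair_stale_xrefs /opengrok/src /opengrok/data/xref
#   repair_stale_xrefs /opengrok/src /opengrok/data/xref apply
```

Running it without "apply" first gives a chance to review the list before anything is removed.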

chkp-baselz commented 3 years ago

@vladak it worked thank you

chkp-baselz commented 3 years ago

@vladak on a different topic: I have added the OpenGrok logo to pageheader.jspf to easily return home, but now the text is merged on every source page (the list details and the menu overlap). Could you please tell me where to add an empty div to resolve this? (image)

This line has been added to pageheader.jspf:

<a href="<%= request.getContextPath() %>/"><span id="MastheadLogo"></span></a>

vladak commented 3 years ago

Please open a new issue for that. In general, I'd discourage anyone from changing any of the JSP files.