oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

Webapp unable to read file within a nested git repository referred by a symlink #4036

Open mmmarq opened 1 year ago

mmmarq commented 1 year ago

Describe the bug

I have an indexed project (repo) that is composed of several git projects. One of these git projects contains a "nested" git project that is a symbolic link to the original git project in the project root folder. When I try to open a file in this "nested" project, the web server fails to open it and returns an "Error reading file" message.

I noticed that the commit SHA1 at the end of the URL is correct for the original project location, but it looks like the webapp tries to retrieve that commit from the current project, which causes the error. If I change the SHA1 in the URL to "unknown", the webapp can show the file content because (I think) it reads the file from the file system instead of trying to retrieve it from the repository.
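
The suspected mismatch can be illustrated with plain git, outside OpenGrok. The sketch below is only an illustration under assumptions: it builds a throwaway copy of the layout in a temporary directory (the `ROOT` variable and the `g` and `has_file_at_rev` helpers are made up for this example) and checks whether a commit that belongs to project1 can be resolved from inside project2. It does not show what the webapp actually executes.

```shell
#!/bin/sh
set -eu

# Throwaway location (assumption); the report uses /opengrok_sources/collection.
ROOT="$(mktemp -d)"

# Illustrative helper: run git with a dummy identity so commits work anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# project1 owns docs/readme.txt.
mkdir -p "$ROOT/project1/docs"
g init -q "$ROOT/project1"
echo "hello" > "$ROOT/project1/docs/readme.txt"
g -C "$ROOT/project1" add -A
g -C "$ROOT/project1" commit -q -m "Initial project1 commit"

# project2 only contains a relative symlink to project1/docs.
g init -q "$ROOT/project2"
echo "c" > "$ROOT/project2/c.txt"
ln -s ../project1/docs "$ROOT/project2/docs"
g -C "$ROOT/project2" add -A
g -C "$ROOT/project2" commit -q -m "Initial project2 commit"

# The revision that belongs to project1 (full hash).
REV="$(g -C "$ROOT/project1" rev-parse HEAD)"

# Succeeds only if repository $1 can resolve docs/readme.txt at $REV.
has_file_at_rev() { g -C "$1" cat-file -e "$REV:docs/readme.txt" 2>/dev/null; }

if has_file_at_rev "$ROOT/project1"; then IN_PROJECT1=found; else IN_PROJECT1=missing; fi
if has_file_at_rev "$ROOT/project2"; then IN_PROJECT2=found; else IN_PROJECT2=missing; fi

echo "project1: $IN_PROJECT1"
echo "project2: $IN_PROJECT2"

# Reading through the symlink on the file system still works.
cat "$ROOT/project2/docs/readme.txt"
```

If the lookup in project2 reports "missing" while the plain `cat` succeeds, that would be consistent with the behaviour described above: the revision is only known to project1, while the file itself is reachable through the symlink.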

Is there any additional configuration I should use while indexing, or in the config file, to prevent this error? I have tried "--nestingMaximum" without success.

OpenGrok: 1.7.35
JDK: openjdk12.0.2-x64
Tomcat: apache-tomcat-10.0.12
OS: Ubuntu 18.04.4 LTS

To Reproduce

Create and index a git repository structure like this:

/opengrok_sources/collection/project1 (git repo)
/opengrok_sources/collection/project1/a.txt
/opengrok_sources/collection/project1/b.txt
/opengrok_sources/collection/project1/docs/readme.txt

/opengrok_sources/collection/project2 (git repo)
/opengrok_sources/collection/project2/c.txt
/opengrok_sources/collection/project2/d.txt
/opengrok_sources/collection/project2/docs -> ../project1/docs
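
For anyone who wants to recreate this layout, a minimal sketch follows. It assumes a scratch directory instead of /opengrok_sources, and the `SRC_ROOT` and `COLLECTION` variables, the `commit_all` helper, the file contents and the commit identity are placeholders invented for this example.

```shell
#!/bin/sh
set -eu

# Scratch source root (assumption); the report uses /opengrok_sources.
SRC_ROOT="$(mktemp -d)"
COLLECTION="$SRC_ROOT/collection"

# Illustrative helper: stage and commit everything in repository $1.
commit_all() {
    git -C "$1" add -A
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "$2"
}

mkdir -p "$COLLECTION/project1/docs" "$COLLECTION/project2"
git init -q "$COLLECTION/project1"
git init -q "$COLLECTION/project2"

# project1: regular files plus the real docs directory.
for f in a.txt b.txt docs/readme.txt; do
    echo "content of $f" > "$COLLECTION/project1/$f"
done
commit_all "$COLLECTION/project1" "Initial project1 commit"

# project2: regular files plus a relative symlink into project1.
for f in c.txt d.txt; do
    echo "content of $f" > "$COLLECTION/project2/$f"
done
ln -s ../project1/docs "$COLLECTION/project2/docs"
commit_all "$COLLECTION/project2" "Initial project2 commit"

# Show the resulting tree, skipping git metadata.
find "$COLLECTION" -name .git -prune -o -print | sort
```

The symlink is created as a relative link on purpose, so that it matches the `docs -> ../project1/docs` entry in the listing.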

Index command:

/opengrok/opengrok-tools/bin/opengrok-indexer --java /jre/bin/java \
--java_opts=-server --java_opts=-Xmx24g --java_opts=-Dorg.opengrok.indexer.history.git=/apps/android/bin/git \
--jar /opengrok/lib/opengrok.jar -- --ctags /tools/ctags/ctags \
--dataRoot /opengrok_data -i d:'.repo' -i d:'*.git' -i d:'.git' -H --projects --depth 50 \
--nestingMaximum 50 --economical --search --source /opengrok_sources \
-R /opengrok_etc/read-only.xml --writeConfig /opengrok_etc/configuration.xml

Expected behavior

Open the "readme.txt" file under project2 in the browser (fails with "Error reading file"):

https://opengrok/mmm/xref/collection/project2/docs/readme.txt?r=9ec4a9b6

Open the "readme.txt" file under project2 in the browser (replacing the commit SHA1 with "unknown" works):

https://opengrok/mmm/xref/collection/project2/docs/readme.txt?r=unknown

Opening the file at its original location works fine:

https://opengrok/mmm/xref/collection/project1/docs/readme.txt?r=9ec4a9b6

Additional context

I can see that, even with the "--economical" flag, the indexer has created a symlink in the project2 xref folder:

$ ls -la /opengrok_data/xref/collection/project2/
total 8
drwxr-xr-x 2 sse postfix 4096 Sep 12 11:50 .
drwxr-xr-x 3 sse postfix 4096 Sep 12 11:50 ..
lrwxrwxrwx 1 sse postfix   16 Sep 12 11:50 docs -> ../project1/docs
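
To find other symlinks of this kind in a source tree, something like the following sketch could be used. It is not part of OpenGrok: `cross_repo_links` is a hypothetical helper written for this example, and it assumes a `readlink` that supports `-f` (as in GNU coreutils on Ubuntu). It lists every symlink whose resolved target lives in a different git working tree than the link itself.

```shell
#!/bin/sh

# Hypothetical helper: print "LINK -> TARGET_REPO" for each symlink under $1
# that points into a different git working tree than the one containing it.
cross_repo_links() {
    find "$1" -name .git -prune -o -type l -print |
    while IFS= read -r link; do
        # Resolve the link to an absolute path (assumes readlink -f exists).
        target="$(readlink -f "$link")" || continue
        # git -C needs a directory, so fall back to the parent for file targets.
        [ -d "$target" ] || target="$(dirname "$target")"
        # Top-level directory of the repository holding the link itself.
        link_repo="$(git -C "$(dirname "$link")" rev-parse --show-toplevel 2>/dev/null)" || continue
        # Top-level directory of the repository holding the target.
        target_repo="$(git -C "$target" rev-parse --show-toplevel 2>/dev/null)" || continue
        if [ "$link_repo" != "$target_repo" ]; then
            printf '%s -> %s\n' "$link" "$target_repo"
        fi
    done
}

# Scan the directory given as the first argument, if any.
if [ "$#" -gt 0 ]; then
    cross_repo_links "$1"
fi
```

Saved under a name of your choice and run against the source root (for example /opengrok_sources), it should list project2/docs together with the repository it really belongs to, assuming the layout described above.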

Configuration file content (xml)

<?xml version="1.0" encoding="UTF-8"?>
<java version="12.0.2" class="java.beans.XMLDecoder">
 <object class="org.opengrok.indexer.configuration.Configuration" id="Configuration0">
  <void property="cmds">
   <object class="java.util.Collections" method="unmodifiableMap">
    <object class="java.util.HashMap">
     <void method="put">
      <string>org.opengrok.indexer.history.ClearCaseRepository</string>
      <string>cleartool</string>
     </void>
     <void method="put">
      <string>org.opengrok.indexer.history.PerforceRepository</string>
      <string>p4</string>
     </void>
    </object>
   </object>
  </void>
  <void property="ctags">
   <string>/tools/ctags/ctags</string>
  </void>
  <void property="dataRoot">
   <string>/opengrok_data</string>
  </void>
  <void property="generateHtml">
   <boolean>false</boolean>
  </void>
  <void id="IgnoredNames0" property="ignoredNames">
   <void id="IgnoredDirs0" property="ignoredDirs">
    <void property="items">
     <void method="add">
      <string>.repo</string>
     </void>
     <void method="add">
      <string>*.git</string>
     </void>
     <void method="add">
      <string>.git</string>
     </void>
     <void method="add">
      <string>.hg</string>
     </void>
     <void method="add">
      <string>.bk</string>
     </void>
     <void method="add">
      <string>.bzr</string>
     </void>
     <void method="add">
      <string>.svn</string>
     </void>
     <void method="add">
      <string>SCCS</string>
     </void>
     <void method="add">
      <string>.razor</string>
     </void>
     <void method="add">
      <string>RCS</string>
     </void>
     <void method="add">
      <string>CVS</string>
     </void>
     <void method="add">
      <string>CVSROOT</string>
     </void>
    </void>
   </void>
   <void id="IgnoredFiles0" property="ignoredFiles">
    <void property="items">
     <void method="add">
      <string>.git</string>
     </void>
     <void method="add">
      <string>.hgtags</string>
     </void>
     <void method="add">
      <string>.hgignore</string>
     </void>
     <void method="add">
      <string>.cvsignore</string>
     </void>
     <void method="add">
      <string>.p4config</string>
     </void>
    </void>
   </void>
  </void>
  <void property="nestingMaximum">
   <int>50</int>
  </void>
  <void property="projects">
   <void method="put">
    <string>collection</string>
    <object class="org.opengrok.indexer.configuration.Project">
     <void property="historyBasedReindex">
      <boolean>true</boolean>
     </void>
     <void property="historyEnabled">
      <boolean>true</boolean>
     </void>
     <void property="indexed">
      <boolean>true</boolean>
     </void>
     <void property="name">
      <string>collection</string>
     </void>
     <void property="path">
      <string>/collection</string>
     </void>
    </object>
   </void>
  </void>
  <void property="projectsEnabled">
   <boolean>true</boolean>
  </void>
  <void property="repositories">
   <void method="add">
    <object class="org.opengrok.indexer.history.RepositoryInfo">
     <void property="branch">
      <string>master</string>
     </void>
     <void property="currentVersion">
      <string>2022-09-12 11:39 -0500 9ec4a9b Source Search Initial project1 commit</string>
     </void>
     <void property="directoryNameRelative">
      <string>/collection/project1</string>
     </void>
     <void property="historyBasedReindex">
      <boolean>true</boolean>
     </void>
     <void property="historyEnabled">
      <boolean>true</boolean>
     </void>
     <void property="type">
      <string>git</string>
     </void>
    </object>
   </void>
   <void method="add">
    <object class="org.opengrok.indexer.history.RepositoryInfo">
     <void property="branch">
      <string>master</string>
     </void>
     <void property="currentVersion">
      <string>2022-09-12 11:41 -0500 e6a53e5 Source Search Initial project2 commit</string>
     </void>
     <void property="directoryNameRelative">
      <string>/collection/project2</string>
     </void>
     <void property="historyBasedReindex">
      <boolean>true</boolean>
     </void>
     <void property="historyEnabled">
      <boolean>true</boolean>
     </void>
     <void property="type">
      <string>git</string>
     </void>
    </object>
   </void>
  </void>
  <void property="scanningDepth">
   <int>50</int>
  </void>
  <void property="sourceRoot">
   <string>/opengrok_sources</string>
  </void>
  <void id="SuggesterConfig0" property="suggesterConfig">
   <void property="allowedFields">
    <void method="clear"/>
    <void method="add">
     <string>defs</string>
    </void>
    <void method="add">
     <string>path</string>
    </void>
    <void method="add">
     <string>hist</string>
    </void>
    <void method="add">
     <string>refs</string>
    </void>
    <void method="add">
     <string>type</string>
    </void>
    <void method="add">
     <string>full</string>
    </void>
   </void>
   <void property="enabled">
    <boolean>false</boolean>
   </void>
  </void>
  <void property="xrefTimeout">
   <long>240</long>
  </void>
 </object>
</java>

Thanks in advance

vladak commented 1 year ago

Is there anything of interest in the webapp logs (around the time the request for the file in question fails)?

mmmarq commented 1 year ago

Hello @vladak

Not a single word in the Tomcat log. Should I increase the Tomcat log level?

Thanks

vladak commented 1 year ago

Yep, that might be worthwhile.

mmmarq commented 1 year ago

I have changed the Tomcat log level to FINE. See the log snippet in the attached file catalina.txt.