wellington-junio / gitblit

Automatically exported from code.google.com/p/gitblit
Apache License 2.0

Missing blob when indexing a repository with a submodule #119

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Set up Gitblit and push the contents of a specific git repository 
(unfortunately with proprietary contents) into a repository managed by Gitblit
2. Switch on indexing in one of the branches of this repository

What is the expected output? What do you see instead?

Expected: The branch should be searchable and no error messages should appear 
in the log.
Actual: We get the following message in the log:

ERROR Exception while reindexing foo.git
org.eclipse.jgit.errors.MissingObjectException: Missing blob 4bd2ec7f20cda43a03aa6d190a9313727383339d
    at org.eclipse.jgit.storage.file.WindowCursor.open(WindowCursor.java:126)
    at org.eclipse.jgit.lib.ObjectDatabase.open(ObjectDatabase.java:176)
    at org.eclipse.jgit.lib.Repository.open(Repository.java:273)
    at com.gitblit.LuceneExecutor.reindex(LuceneExecutor.java:557)
    at com.gitblit.LuceneExecutor.index(LuceneExecutor.java:193)
    at com.gitblit.LuceneExecutor.run(LuceneExecutor.java:173)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
ERROR Could not build foo.git Lucene index!

What version of the product are you using? On what operating system?
Gitblit: 1.0.0
OS: CentOS 6.2

Please provide any additional information below.
The repository size is 226MB. I have started java with "-Xmx10240M".

I have tried to do the same with other repositories without any problems. 
Unfortunately I cannot share the repository.

Original issue reported on code.google.com by peter@peca.dk on 5 Aug 2012 at 8:37

GoogleCodeExporter commented 9 years ago
Let's track down this blob.  If you run the following command on the source 
repository, do you get a result?

git ls-files --stage | grep 4bd2ec7f20cda43a03aa6d190a9313727383339d

Can you repeat the same test on the Gitblit copy of the repository?  Do you get 
the same result?

Original comment by James.Mo...@gmail.com on 6 Aug 2012 at 1:16

GoogleCodeExporter commented 9 years ago
I thought I might have forgotten where the initial push was done from. To 
remove this uncertainty I set up a test Gitblit instance on my development 
machine and repeated the steps.

This time I pushed from a non-bare repository. I got the following output when 
running the command you suggested on the development machine:

160000 c39e9d6693c303db7d9969b1196bfb35fad79452 0   foo-web-client/bar

The "foo-web-client/bar" directory is a git submodule. I guess the existence of 
a git submodule is what differentiates our real repository from the tests I 
have done. The "git ls-files --stage" command still does not give any output on 
the repository handled by Gitblit.
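
For anyone checking their own repository, here is a minimal sketch of how a line of "git ls-files --stage" output can be tested for the submodule mode. It assumes the usual "mode, object id, stage, path" layout of that output; the class and method names (StageLine, isSubmodule, path) are made up for illustration and are not part of Gitblit or JGit.

```java
// Hypothetical helper, not Gitblit code: classifies one line of
// `git ls-files --stage` output.
final class StageLine {
    // Git stores a submodule reference (gitlink) with octal mode 160000.
    static final int GITLINK_MODE = 0160000;

    // Splits "<mode> <object id> <stage> <path>" into its four fields.
    private static String[] fields(String line) {
        String[] parts = line.trim().split("\\s+", 4);
        if (parts.length < 4) {
            throw new IllegalArgumentException("unexpected line: " + line);
        }
        return parts;
    }

    // True when the entry's mode marks it as a submodule.
    static boolean isSubmodule(String line) {
        int mode = Integer.parseInt(fields(line)[0], 8); // mode is printed in octal
        return mode == GITLINK_MODE;
    }

    // Returns the path field of the entry.
    static String path(String line) {
        return fields(line)[3];
    }

    public static void main(String[] args) {
        // The entry reported above for the submodule directory.
        String line = "160000 c39e9d6693c303db7d9969b1196bfb35fad79452 0\tfoo-web-client/bar";
        System.out.println(path(line) + " is submodule: " + isSubmodule(line));
    }
}
```

The object id on such a line names a commit in the submodule's own repository, which would explain why it cannot be opened as a blob in the parent repository.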

Is the indexing supposed to work with submodules?

Original comment by peter@peca.dk on 6 Aug 2012 at 1:48

GoogleCodeExporter commented 9 years ago
Ah.  That is likely the problem.  I had not considered submodules and I have 0 
experience with them.

Original comment by James.Mo...@gmail.com on 6 Aug 2012 at 2:13

GoogleCodeExporter commented 9 years ago

Original comment by James.Mo...@gmail.com on 6 Aug 2012 at 9:44

GoogleCodeExporter commented 9 years ago
I think the best way to handle git submodules is to ignore them. They are 
separate repositories and can be added as repositories in gitblit the same way 
that the repository referencing them has been added.

If you agree with this, the issue can be fixed by checking the mode of each 
object being added to the TreeMap of objects to be indexed, and ignoring it if 
the mode is FileMode.GITLINK.
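
As a rough sketch of the idea only (this is not the actual patch, and it deliberately avoids JGit so it runs on its own), the filter could look like the following. TreeEntry, GitlinkFilter, and the README path are hypothetical stand-ins; the real change would compare against FileMode.GITLINK inside LuceneExecutor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Gitblit code: skip submodule entries
// before handing paths to the indexer.
final class GitlinkFilter {
    // Octal mode git uses for a submodule reference (gitlink).
    static final int GITLINK_MODE = 0160000;
    // Mask selecting the object-type bits of a git file mode.
    static final int TYPE_MASK = 0170000;

    // Minimal stand-in for a tree entry: raw mode bits plus path.
    static final class TreeEntry {
        final int mode;
        final String path;

        TreeEntry(int mode, String path) {
            this.mode = mode;
            this.path = path;
        }
    }

    static boolean isGitlink(int mode) {
        return (mode & TYPE_MASK) == GITLINK_MODE;
    }

    // Returns only the entries whose content lives in this repository.
    static List<TreeEntry> indexable(List<TreeEntry> entries) {
        List<TreeEntry> kept = new ArrayList<TreeEntry>();
        for (TreeEntry entry : entries) {
            if (isGitlink(entry.mode)) {
                continue; // submodule: its objects are in another repository
            }
            kept.add(entry);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TreeEntry> entries = Arrays.asList(
                new TreeEntry(0100644, "README"),              // made-up regular file
                new TreeEntry(0160000, "foo-web-client/bar")); // the submodule
        for (TreeEntry entry : indexable(entries)) {
            System.out.println("would index: " + entry.path);
        }
    }
}
```

Skipping the entry rather than trying to resolve it keeps the indexer from ever asking the object database for an id that only exists in the submodule's repository.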

I have made this change in this repository: 
http://code.google.com/r/peter-issue119/. (I don't know if this is the 
preferred way of sending patches to the project).

It seems to be working in my case.

Original comment by peter@peca.dk on 7 Aug 2012 at 11:15

GoogleCodeExporter commented 9 years ago
That sounds reasonable.  I merged your commit and also took care of the other 
indexing case.  Thanks for your contribution!  Sometimes I shake my head when I 
receive actual patch files instead of a link to someone's clone which I can 
pull from - after all, this is git?!  I did not realize GoogleCode offered 
personal clones, I guess they are feeling the pressure of GitHub.

Original comment by James.Mo...@gmail.com on 7 Aug 2012 at 12:18

GoogleCodeExporter commented 9 years ago

Original comment by James.Mo...@gmail.com on 20 Aug 2012 at 2:06

GoogleCodeExporter commented 9 years ago
Fix/change released in 1.1.0.

Original comment by James.Mo...@gmail.com on 25 Aug 2012 at 12:20