hstonel / gitblit

Automatically exported from code.google.com/p/gitblit
Apache License 2.0

Lucene query overflows Jetty header parser #152

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Add a lot of repositories, in my case 209
2. Enable Lucene indexing of these repositories
3. Go to search page
4. Select all repositories and enter a word
5. Press "Search"
6. After result appears, press "Search" again

What is the expected output? What do you see instead?
I get an empty page. Wireshark tells me only a small header is returned and no 
content.

 Content-Length: 0
 Connection: close
 Server: Jetty(7.6.7.v20120910)

Gitblit writes the following on stdout:
 WARN  HttpParser Full for SCEP@60b6980c{l(/10.9.0.2:51310)<->r(/10.1.2.242:8081),d=true,open=true,ishut=false,oshut=false,rb=false,wb=false,w=true,i=1r}-{AsyncHttpConnection@3dc6df34,g=HttpGenerator{s=0,h=-1,b=-1,c=-1},p=HttpParser{s=-1,l=723,c=-3},r=3} 

What version of the product are you using? On what operating system?

Version:v1.1.0-102-gc658df9
OS: OS X 10.8.2
Java: OpenJDK 1.6.0_24
Packaging: GitblitGO
Java -version:
 java version "1.6.0_24"
 OpenJDK Runtime Environment (IcedTea6 1.11.4) (rhel-1.49.1.11.4.el6_3-x86_64)
 OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Please provide any additional information below.

The first search succeeds, presumably because the query data is sent with a POST 
command. The second query is sent via a GET command. The URL is 5110 bytes long.
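A rough sketch of how a GET URL can grow past a header buffer once a couple of hundred repositories are selected. The path `/lucene`, the parameter names `q` and `r`, the repository names, and the 4096-byte limit are all assumptions made for illustration; they are not taken from Gitblit's search form or from the Jetty configuration in use here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

public class SearchUrlLength {

    // Builds a GET URL with one query parameter per selected repository.
    // The path and the parameter names "q" and "r" are hypothetical.
    static String buildQuery(List<String> repositories, String query) {
        StringBuilder sb = new StringBuilder("/lucene?q=").append(encode(query));
        for (String repo : repositories) {
            sb.append("&r=").append(encode(repo));
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Checks whether the request line alone fits in a buffer of the given size.
    // Real requests also carry other headers (Host, Cookie, ...) in the same buffer.
    static boolean fitsInHeaderBuffer(String url, int bufferBytes) {
        int requestLine = "GET ".length() + url.length() + " HTTP/1.1\r\n".length();
        return requestLine <= bufferBytes;
    }

    public static void main(String[] args) {
        List<String> repos = new ArrayList<String>();
        for (int i = 0; i < 209; i++) {
            repos.add("group/repository-" + i + ".git"); // made-up names
        }
        String url = buildQuery(repos, "word");
        System.out.println("URL length: " + url.length());
        System.out.println("Fits in 4096 bytes: " + fitsInHeaderBuffer(url, 4096));
    }
}
```

Sending the same parameters in a POST body avoids the limit, since the body is not parsed through the header buffer.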

Original issue reported on code.google.com by robin.ro...@gmail.com on 18 Oct 2012 at 10:35

GoogleCodeExporter commented 9 years ago
Yeah, alright maybe I shouldn't use the GET.  This particular form has caused 
me some trouble before with Wicket, Tomcat, nested/grouped repositories, and 
sessions.  I'll take another look at this.

Original comment by James.Mo...@gmail.com on 26 Oct 2012 at 12:07

GoogleCodeExporter commented 9 years ago
Hi Robin,
When you have time, can you pull my latest code and see if this commit 
addresses your issue?

https://github.com/gitblit/gitblit/commit/1b84d3110165c0df933a54a03503beaa8290347d

-J

Original comment by James.Mo...@gmail.com on 8 Nov 2012 at 9:10

GoogleCodeExporter commented 9 years ago
Works very well. Thank you.

Original comment by robin.ro...@gmail.com on 16 Nov 2012 at 9:09

GoogleCodeExporter commented 9 years ago
I replied too quickly. Gitblit now only lists repositories whose names are 
entirely lowercase, so the number of repos is hugely reduced. I.e. I cannot 
confirm the fix.

Original comment by robin.ro...@gmail.com on 16 Nov 2012 at 10:56

GoogleCodeExporter commented 9 years ago
Then I cherry-picked the fix onto v1.1.0 where I can confirm the fix. Thanks.

Original comment by robin.ro...@gmail.com on 16 Nov 2012 at 11:35

GoogleCodeExporter commented 9 years ago
Lowercase repository names.

Do you have repositories that have the same name but differ in case?  That will 
be a problem.

The repository list cache (concurrent hashmap) now stores names (keys) as 
lowercase.  I could also maintain a preserved-case list for things like search 
box selection - but regardless, the same number of repos should be listed.  Is 
it really reduced or just not what you expected to see?
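A minimal sketch of the "lowercase keys plus preserved-case names" idea, assuming a map from the lowercased name to the name as it exists on disk. The class and method names are hypothetical and this is not Gitblit's actual cache code.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RepositoryNameCache {

    // Key: lowercased name. Value: the name exactly as found on disk.
    private final Map<String, String> namesByLowercase =
            new ConcurrentHashMap<String, String>();

    // Returns false if a name differing only in case is already cached.
    public boolean add(String name) {
        return namesByLowercase.putIfAbsent(key(name), name) == null;
    }

    // Resolves a request in any case to the on-disk name, or null if unknown.
    public String resolve(String requested) {
        return namesByLowercase.get(key(requested));
    }

    public int size() {
        return namesByLowercase.size();
    }

    private static String key(String name) {
        // Locale.ROOT keeps lowercasing independent of the server's locale.
        return name.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        RepositoryNameCache cache = new RepositoryNameCache();
        cache.add("hackerhelg/MOVIE.git");
        System.out.println(cache.resolve("hackerhelg/movie.git"));
    }
}
```

With this shape the lowercased form is used only for lookup, and the preserved-case value is what gets passed to the file system. Two names that differ only in case still collide, which `add` reports by returning false.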

Original comment by James.Mo...@gmail.com on 16 Nov 2012 at 1:05

GoogleCodeExporter commented 9 years ago
Yes, there are such repositories, though I'm not sure they make sense. I think 
someone just didn't like the name and created a new repo. There are also 
repositories with exactly the same name, but in different directories. That 
works in 1.1.0, for the cases I know of.

The issue I was thinking of is that gitblit in master lowercases the directory 
names it is looking for so it only finds the names that are in lower case.

This is from the log (the directory is named hackerhelg/MOVIE.git):
ERROR Repository "hackerhelg/movie.git" is missing! Removing from cache.

Since you base the repository list on names from a potentially case sensitive 
source you need a different solution, i.e. case-insensitive sorting (and 
completion?) so the user gets e.g. the lower case entry when typing, but can 
see that there is another item to select using keyboard. Typing uppercase could 
be interpreted as a request for case matching. This is perhaps a bit OT. 
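A small sketch of that suggestion, assuming the names are kept in their original case. Sorting ignores case, with the lowercase variant listed first among names that differ only in case, and a typed prefix is matched case-sensitively only when it contains an uppercase letter. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class RepositoryCompletion {

    // Case-insensitive order; names that differ only in case are ordered so
    // that the variant with the lowercase letter at the first difference wins.
    static final Comparator<String> ORDER =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.<String>reverseOrder());

    static List<String> complete(List<String> names, String typed) {
        // Any uppercase letter in the input is treated as a request for exact case.
        boolean caseSensitive = !typed.equals(typed.toLowerCase(Locale.ROOT));
        List<String> matches = new ArrayList<String>();
        for (String name : names) {
            boolean hit = caseSensitive
                    ? name.startsWith(typed)
                    : name.toLowerCase(Locale.ROOT).startsWith(typed);
            if (hit) {
                matches.add(name);
            }
        }
        matches.sort(ORDER);
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "sAmPle/MyRepo.git", "sample/myrepo.git", "other/x.git");
        System.out.println(complete(names, "sample"));
        System.out.println(complete(names, "sAm"));
    }
}
```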

Original comment by robin.ro...@gmail.com on 16 Nov 2012 at 2:11

GoogleCodeExporter commented 9 years ago
I've opened a new issue for reviewing the repository cache strategy for 1.2.0. 
(issue 172)

I'm on the fence as to whether to continue supporting repositories with 
case-insensitive-identical paths:

e.g.
sample/myrepo.git
sAmPle/MyRepo.git

I can't think of a good reason why you would want to have that aside from "it 
could be done".

I am queuing this issue (search form) as prepared for release.

Original comment by James.Mo...@gmail.com on 27 Nov 2012 at 10:41

GoogleCodeExporter commented 9 years ago
Note that the *real* problem is not that there is one repo which comes in two 
case flavors, but that none of the repos with uppercase letters show up. If it 
was just that one repo I could rename it and forget about it for a while at 
least. 

Original comment by robin.ro...@gmail.com on 28 Nov 2012 at 7:26

GoogleCodeExporter commented 9 years ago
Ah.  That is a good clarification.

Original comment by James.Mo...@gmail.com on 28 Nov 2012 at 12:40

GoogleCodeExporter commented 9 years ago
Hi Robin. I found the case problem with the cache and it is fixed:

https://demo-gitblit.rhcloud.com/commitdiff/gitblit.git/2d85d43d61518d26be33a0e7759a3d4f4a627452

Original comment by James.Mo...@gmail.com on 21 Dec 2012 at 10:18

GoogleCodeExporter commented 9 years ago
v1.2.0 has been deployed.

Original comment by James.Mo...@gmail.com on 1 Jan 2013 at 1:06