Lucky-Dhakad / semanticvectors

Automatically exported from code.google.com/p/semanticvectors

Using document pathnames with upper case letters as search terms fails #6

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Build a model using the Bible corpus. 
2. Search for (e.g.) java pitt.search.semanticvectors.Search -q docvectors.bin bible_chapters/Matthew/Chapter_3

What is the expected output? What do you see instead?
You get a response "Didn't find vector for
bible_chapters/matthew/chapter_3", instead of search results.

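The problem is that Search.java lowercases query terms, as promised in the
solution to issue 4, but DocVectors.java does not lowercase the pathnames it
records. A query for a pathname with upper case letters is therefore
lowercased and never matches the stored pathname. We can't fix this by simply
lowercasing pathnames in DocVectors.java, because that would break file
lookups on many Unix-style systems, where file names are case-sensitive.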
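
The mismatch can be reproduced in isolation. The following is only a minimal sketch, not the actual Search.java or DocVectors.java code: the class `CaseMismatchSketch`, the methods `buildStore` and `lookupLowercased`, and the use of a plain `HashMap` as the vector store are all hypothetical stand-ins, and the vector values are made up.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CaseMismatchSketch {
    // Hypothetical stand-in for a document vector store: keys are pathnames
    // stored with their original case, as the document builder leaves them.
    static Map<String, float[]> buildStore() {
        Map<String, float[]> store = new HashMap<>();
        store.put("bible_chapters/Matthew/Chapter_3", new float[] {0.1f, 0.2f});
        return store;
    }

    // Mimics a search front end that lowercases every query term before lookup.
    static float[] lookupLowercased(Map<String, float[]> store, String query) {
        return store.get(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, float[]> store = buildStore();
        String query = "bible_chapters/Matthew/Chapter_3";
        // The lowercased query no longer equals the case-preserved key.
        if (lookupLowercased(store, query) == null) {
            System.out.println("Didn't find vector for " + query.toLowerCase(Locale.ROOT));
        }
    }
}
```

An exact-case lookup (`store.get(query)`) succeeds on the same store, which is why the fix has to be on the search side rather than in how pathnames are stored.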

Original issue reported on code.google.com by dwidd...@gmail.com on 26 Jun 2008 at 1:48

GoogleCodeExporter commented 9 years ago
Added a "-lowercase" flag to command line options for Search.java so that
search code can choose not to lower case. (Lower casing all search arguments
remains the default.)

DocVector builders do not lower case and should not in general.

Original comment by dwidd...@gmail.com on 14 Aug 2008 at 3:18

GoogleCodeExporter commented 9 years ago
In case anyone finds this, the flag changed from -lowercase to -matchcase 
(false by default). 
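
For anyone unsure what the flag does to a query, here is a rough sketch of the behaviour described in this thread. It assumes that -matchcase set to true means "leave query terms as typed"; the class `MatchCaseSketch` and the helper `normalizeTerms` are hypothetical and are not taken from the SemanticVectors source.

```java
import java.util.Locale;

public class MatchCaseSketch {
    // Hypothetical helper: returns the query terms as they would be used for
    // lookup. With matchCase == false (the default described above), every
    // term is lowercased; with matchCase == true, terms are left untouched.
    static String[] normalizeTerms(String[] terms, boolean matchCase) {
        String[] out = new String[terms.length];
        for (int i = 0; i < terms.length; i++) {
            out[i] = matchCase ? terms[i] : terms[i].toLowerCase(Locale.ROOT);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] query = {"bible_chapters/Matthew/Chapter_3"};
        // Default behaviour: the pathname is lowercased and will not match a
        // case-preserved document key.
        System.out.println(normalizeTerms(query, false)[0]);
        // With case matching on, the pathname is passed through unchanged.
        System.out.println(normalizeTerms(query, true)[0]);
    }
}
```

Under that assumption, searching for a document pathname that contains upper case letters needs case matching turned on, while ordinary term queries can keep the lowercasing default.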

Original comment by widd...@google.com on 22 Nov 2010 at 3:39