Peratham / semanticvectors

Automatically exported from code.google.com/p/semanticvectors

Error messages when building PositionalIndex with version 1.16 #9

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Build a PositionalIndex, e.g. with the Bible corpus (the problem is not
limited to that corpus); see the next point.

What is the expected output? What do you see instead?

That's what I get:
$ java pitt.search.lucene.IndexFilePositions bible_chapters/
java.io.FileNotFoundException:  (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at java.io.FileReader.<init>(FileReader.java:55)
        at org.apache.lucene.analysis.WordlistLoader.getWordSet(WordlistLoader.java:49)
        at org.apache.lucene.analysis.standard.StandardAnalyzer.<init>(StandardAnalyzer.java:110)
        at pitt.search.lucene.IndexFilePositions.main(IndexFilePositions.java:43)
Indexing to directory 'index'...
- followed by the "normal" output

$ java pitt.search.semanticvectors.BuildPositionalIndex -w 15 index/
seedLength = 20
Vector length = 200
Minimum frequency = 10
Window length = 15
Creating term vectors ...
There are 3525 terms (and 1190 docs)
0 ... 1000 ... 
Created 3525 term vectors ...

Normalizing term vectors
Writing term vectors to termtermvectors.bin
About to write vectors to file termtermvectors.bin
0 ... 1000 ... 2000 ... 3000 ... Finished writing vectors.
Write vectors incrementally to file incremental_docvectors.bin
0 ... Exception in thread "main" java.lang.NullPointerException
        at pitt.search.semanticvectors.IncrementalDocVectors.<init>(IncrementalDocVectors.java:142)
        at pitt.search.semanticvectors.BuildPositionalIndex.main(BuildPositionalIndex.java:243)

What version of the product are you using? On what operating system?

Version 1.16, both with Cygwin on Windows Vista and on Mac OS X (Tiger).

Please provide any additional information below.

When tested with version 1.8, index building worked without any error messages.
PS. I also found a naming typo in version 1.8, at least in the compiled .jar,
which makes using BuildPositionalIndex a bit challenging: the class is named
BuildIPositionalndex instead of the presumably intended BuildPositionalIndex :)

Original issue reported on code.google.com by eugenie.giesbrecht on 4 Dec 2008 at 1:54

GoogleCodeExporter commented 9 years ago
You're right, I reproduced this problem straight away. Thanks for the heads-up.

I think the problem is due to attempting to add a null term vector to a
document vector.

I think I've fixed the problem by moving some of the term addition logic inside
an existing try/catch clause. The fix is committed in revision 207.

We'll try to get this tested and included in the next release (which will be
1.18). In the meantime, if you check out v 1.17 from source, this should be fixed.

Original comment by dwidd...@gmail.com on 4 Dec 2008 at 5:52
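
For readers following along, here is a minimal sketch of the failure mode described in the comment above: a document vector is built by summing term vectors, and a lookup for a term with no stored vector returns null. This is not the code from revision 207. The class name DocVectorSketch, the method buildDocVector, and the map-based term store are all hypothetical, and the sketch uses an explicit null check rather than the try/catch mentioned in the comment. The guess that missing vectors come from terms below the minimum frequency is also an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual IncrementalDocVectors code.
public class DocVectorSketch {

    /**
     * Sums the vectors of all terms in a document. Terms without a stored
     * vector (assumed here to be terms dropped by a frequency filter) are
     * skipped instead of being dereferenced, which would throw a
     * NullPointerException.
     */
    public static float[] buildDocVector(List<String> terms,
                                         Map<String, float[]> termVectors,
                                         int dimension) {
        float[] docVector = new float[dimension];
        for (String term : terms) {
            float[] termVector = termVectors.get(term); // null if the term has no vector
            if (termVector == null) {
                continue; // skip unknown terms rather than failing
            }
            for (int i = 0; i < dimension; i++) {
                docVector[i] += termVector[i];
            }
        }
        return docVector;
    }

    public static void main(String[] args) {
        Map<String, float[]> termVectors = new HashMap<>();
        termVectors.put("light", new float[] {1f, 0f, -1f});
        termVectors.put("earth", new float[] {0f, 1f, 1f});

        // "rareword" has no entry, so it contributes nothing to the sum.
        float[] doc = buildDocVector(
                Arrays.asList("light", "rareword", "earth"), termVectors, 3);
        System.out.println(Arrays.toString(doc));
    }
}
```

Skipping a missing term silently keeps the build running, at the cost of hiding how many terms were dropped. Counting or logging the skipped terms would be a reasonable variation.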

GoogleCodeExporter commented 9 years ago
Marking as fixed; this hasn't been a problem for a while.

Original comment by dwidd...@gmail.com on 8 Jul 2009 at 4:08