sabitaacharya / semanticvectors

Automatically exported from code.google.com/p/semanticvectors

Building term and document vectors #80

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I recently built a custom Lucene index from a PCAP (packet capture of network 
data) dump file. To do that, I followed the steps in 
org.apache.lucene.demo.IndexFiles. I was able to create the index files, i.e.:

_0.cfe  _0.si   _1.cfs  _2.cfe  _2.si   _3.cfs  _4.cfe  _4.si   _5.cfs  _6.cfe  _6.si   segments.gen
_0.cfs  _1.cfe  _1.si   _2.cfs  _3.cfe  _3.si   _4.cfs  _5.cfe  _5.si   _6.cfs  segments_1

Now I am trying to create term and document vectors. I ran the following command:
java pitt.search.semanticvectors.BuildIndex -luceneindexpath $INDEX_MADE_ABOVE

However, I get the following error:
java -Xms2048m -Xmx4096m -cp ./semanticvectors-5.4.jar pitt.search.semanticvectors.BuildIndex -luceneindexpath  $INDEX
Seedlength: 10, Dimension: 200, Vector type: REAL, Minimum frequency: 0, Maximum frequency: 2147483647, Number non-alphabet characters: 2147483647, Contents fields are: [contents]
Creating term vectors as superpositions of elemental document vectors ...
Creating semantic term vectors ...
Exception in thread "main" java.lang.NullPointerException
        at pitt.search.semanticvectors.TermVectorsFromLucene.trainTermVectors(TermVectorsFromLucene.java:134)
        at pitt.search.semanticvectors.TermVectorsFromLucene.createTermVectorsFromLuceneImpl(TermVectorsFromLucene.java:123)
        at pitt.search.semanticvectors.TermVectorsFromLucene.createTermVectorsFromLucene(TermVectorsFromLucene.java:97)
        at pitt.search.semanticvectors.BuildIndex.main(BuildIndex.java:109)

To build the index I used Lucene 4.6.1. I am working on CentOS 6.

Your answers will be appreciated.

Thank you.
Santos

Original issue reported on code.google.com by rangeli....@gmail.com on 14 Aug 2014 at 5:10

GoogleCodeExporter commented 8 years ago
Please disregard the defect that I lodged. For some reason I thought the field 
name could be anything. If we follow the field names and scheme in 
org.apache.lucene.demo.IndexFiles, things go OK.

Thank you.

Original comment by rangeli....@gmail.com on 14 Aug 2014 at 6:37

GoogleCodeExporter commented 8 years ago
Oh - glad you fixed it! I had just realized that field names may be an issue, 
and checked in some code that, instead of giving an undocumented NPE, will give 
an exception message from the LuceneUtils class saying:

Exception in thread "main" java.lang.NullPointerException: No terms for field: 'foo'.
Known fields are: 'path, modified, contents'.

Original comment by dwidd...@gmail.com on 14 Aug 2014 at 7:43
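
For readers who hit the same problem, the kind of check described in the last comment can be sketched roughly as below. This is only an illustration and not the code that was checked in: the class FieldCheckSketch, the method termsForField, and the map standing in for an index are all hypothetical, and no Lucene or SemanticVectors API is used. Only the field names (path, modified, contents) and the wording of the message come from the thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index: field name -> terms stored in that field.
// This is NOT the SemanticVectors LuceneUtils class and uses no Lucene API.
class FieldCheckSketch {

  // Returns the terms for the requested field. If the field is absent, fails
  // with a message naming the field and listing the fields that do exist,
  // instead of letting a bare NullPointerException surface later.
  static List<String> termsForField(Map<String, List<String>> index, String field) {
    List<String> terms = index.get(field);
    if (terms == null) {
      throw new NullPointerException(
          "No terms for field: '" + field + "'.\n"
              + "Known fields are: '" + String.join(", ", index.keySet()) + "'.");
    }
    return terms;
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so the field list prints as inserted.
    // The term values are made up for the example.
    Map<String, List<String>> index = new LinkedHashMap<>();
    index.put("path", Arrays.asList("/tmp/dump.pcap"));
    index.put("modified", Arrays.asList("20140814"));
    index.put("contents", Arrays.asList("tcp", "udp", "dns"));

    // A field that exists is returned normally.
    System.out.println(termsForField(index, "contents"));

    // A field that does not exist produces the descriptive message.
    try {
      termsForField(index, "foo");
    } catch (NullPointerException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the design is that the failure names both the missing field and the available ones, so a mismatch between the field used at indexing time and the field BuildIndex reads (contents, per the log above) is visible right away.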