Cutezjz / galagosearch

Automatically exported from code.google.com/p/galagosearch
BSD 3-Clause "New" or "Revised" License

Stopwords usage is not documented #18

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
While a stopword list can be supplied during retrieval, this isn't in the
documentation.

Use this as a template for documentation:

To use the stopword remover, add an XML fragment like this to the parameter
file used when running queries:

<traversals>
  <traversal>
    <class>org.galagosearch.core.retrieval.traversal.RemoveStopwordsTraversal</class>
    <order>before</order>
    <parameters>
      <word>the</word>
      <word>but</word>
    </parameters>
  </traversal>
</traversals>

For instance, the fragment sits alongside the queries inside the top-level
parameters element:

<parameters>
  <traversals>
  ...
  </traversals>
  <query>
   ...
  </query>
  <query>
  ...
  </query>
  ...
</parameters>

Original issue reported on code.google.com by trevor.s...@gmail.com on 10 May 2009 at 9:26

GoogleCodeExporter commented 8 years ago

Stopwords can now be supplied at query time; the remove-stopwords traversal is
included in the standard traversal set.

Stopwords can be listed inline:
<stopwords>
 <word>a</word>
  ...
</stopwords>

Alternatively, give the path to a stopword file:
<stopwords>/path/to/stopword/file</stopwords>

This XML fragment should be placed in the query parameter file.
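
For a long stopword list, the inline form can be generated rather than typed by hand. Below is a minimal sketch in Python, assuming only the <stopwords>/<word> layout shown above. The function name `stopwords_fragment` is made up for illustration and is not part of galagosearch.

```python
import xml.etree.ElementTree as ET


def stopwords_fragment(words):
    """Build a <stopwords> element with one <word> child per stopword.

    Hypothetical helper: it only mirrors the fragment layout from this
    thread and does not call any galagosearch code.
    """
    root = ET.Element("stopwords")
    for word in words:
        word = word.strip()
        if not word:  # skip blank entries, e.g. empty lines in a word list
            continue
        ET.SubElement(root, "word").text = word
    # encoding="unicode" returns a str and lets ElementTree escape
    # characters such as & and < in the word text.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(stopwords_fragment(["a", "the", "but"]))
```

The printed element can then be pasted into the query parameter file next to the queries.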

Original comment by sjh...@gmail.com on 21 Jun 2011 at 3:24