Closed asfimport closed 14 years ago
Robert Muir (@rmuir) (migrated from JIRA)
Patch with a modification to WordListLoader, a test, and Snowball stoplists for Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Russian, Spanish, and Swedish.
Robert Muir (@rmuir) (migrated from JIRA)
I will commit this in a few days if no one objects. Again, I added getSnowballWordSet to WordListLoader, but if this is inappropriate we could instead have a SnowballWordListLoader in our snowball package or something; it doesn't matter to me.
Simon Willnauer (@s1monw) (migrated from JIRA)
Robert, the patch looks good except for one thing.
public static HashSet<String> getSnowballWordSet(Reader reader)
It returns a HashSet<String> but should really return a Set<String>. We plan to change all return types to the interface instead of the implementation.
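As a minimal sketch of the point about return types (the class and method names below are hypothetical and not part of the patch), declaring the interface lets the implementation change later without touching callers:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class; it only illustrates the return-type suggestion.
public class ReturnTypeSketch {

    // Declared as Set<String>, so callers depend only on the interface.
    public static Set<String> loadWords(String... words) {
        Set<String> result = new HashSet<String>();
        Collections.addAll(result, words);
        // A later version could return a different Set implementation here
        // (for example an unmodifiable view) without breaking any caller.
        return result;
    }

    public static void main(String[] args) {
        Set<String> stop = loadWords("a", "the");
        System.out.println(stop.contains("the"));
    }
}
```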
Robert Muir (@rmuir) (migrated from JIRA)
Thanks Simon, I agree.
Robert Muir (@rmuir) (migrated from JIRA)
Committed revision 899955.
Uwe Schindler (@uschindler) (migrated from JIRA)
Hi Robert,
When I changed the backwards tests, I added a new parameter to the svn exec task. With this patch it now behaves the same as the backwards checkouts:
Uwe Schindler (@uschindler) (migrated from JIRA)
Sorry, there were some whitespace issues. Fixed here.
Uwe Schindler (@uschindler) (migrated from JIRA)
Committed Revision: 900160
The Snowball project creates stopword lists as well as stemmers, for example: http://svn.tartarus.org/snowball/trunk/website/algorithms/english/stop.txt?view=markup
This patch includes the following:
I did not add any changes to SnowballAnalyzer to automatically use these lists yet. I would like us to discuss this in a future issue that proposes integrating Snowball with contrib/analyzers.
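For readers unfamiliar with the Snowball stoplist files, here is a rough sketch of how such a list could be parsed. It assumes the format used by the linked stop.txt: a `|` starts a comment that runs to the end of the line, and a line may hold several whitespace-separated words. The class name `SnowballStopListSketch` is hypothetical, and this is an illustration rather than the code in the attached patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class name; illustrative only, not the patch itself.
public class SnowballStopListSketch {

    // Assumed format: '|' begins a comment, words are separated by whitespace.
    public static Set<String> getSnowballWordSet(Reader reader) throws IOException {
        Set<String> result = new HashSet<String>();
        BufferedReader br = new BufferedReader(reader);
        try {
            String line;
            while ((line = br.readLine()) != null) {
                // Drop everything from the comment marker onwards.
                int comment = line.indexOf('|');
                if (comment >= 0) {
                    line = line.substring(0, comment);
                }
                // Collect every non-empty token left on the line.
                for (String word : line.trim().split("\\s+")) {
                    if (word.length() > 0) {
                        result.add(word);
                    }
                }
            }
        } finally {
            br.close();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Small made-up sample in the assumed format.
        String sample = " | An English stop word list.\n"
                + "i              | subject, always in upper case of course\n"
                + "\n"
                + "me\n"
                + "my myself\n";
        Set<String> words = getSnowballWordSet(new StringReader(sample));
        System.out.println(words);
    }
}
```

The method is declared to return `Set<String>` rather than `HashSet<String>`, in line with the review comment in this thread.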
Migrated from LUCENE-2206 by Robert Muir (@rmuir), resolved Jan 16 2010
Attachments: LUCENE-2206.patch, LUCENE-2206-checkout-fixes.patch
Linked issues: 3131