apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Regenerate Snowball code so it's not so heavy [LUCENE-4279] #5347

Closed asfimport closed 12 years ago

asfimport commented 12 years ago

Spinoff from #4914 (and several threads on the list)

Currently each SnowballStemmer is pretty heavy since each instance also contains a bunch of Among objects (part of the stemming rules).

This normally shouldn't be a problem, except that it seems challenging for Tomcat users to tune their thread pools (basically they are creating lots of TokenStreams, and therefore lots of SnowballStemmers).

Newer Snowball just makes these static, and it's easy enough to regenerate the code so the stemmers aren't so heavy. It doesn't fix the real problem, but it also doesn't hurt.
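
To illustrate the idea, here is a minimal sketch of a stemmer whose rule table is static and shared, rather than rebuilt for every instance. The names `SketchAmong` and `SketchStemmer` and the four suffix rules are hypothetical stand-ins chosen for illustration; this is not the actual Snowball-generated code.

```java
// Hypothetical stand-in for a Snowball Among entry: an immutable suffix rule.
final class SketchAmong {
    final String suffix;
    final String replacement;

    SketchAmong(String suffix, String replacement) {
        this.suffix = suffix;
        this.replacement = replacement;
    }
}

// Hypothetical stemmer, assuming rules are tried in order and the first match wins.
final class SketchStemmer {
    // Static: built once when the class is loaded and shared by every instance,
    // so creating many stemmers no longer allocates many rule tables.
    private static final SketchAmong[] RULES = {
        new SketchAmong("sses", "ss"),
        new SketchAmong("ies", "i"),
        new SketchAmong("ss", "ss"),
        new SketchAmong("s", ""),
    };

    // Mutable working buffer stays per instance, so a single stemmer
    // must still not be shared between threads.
    private final StringBuilder current = new StringBuilder();

    String stem(String word) {
        current.setLength(0);
        current.append(word);
        for (SketchAmong rule : RULES) {
            if (word.endsWith(rule.suffix)) {
                // Drop the matched suffix and append its replacement.
                current.setLength(word.length() - rule.suffix.length());
                current.append(rule.replacement);
                break;
            }
        }
        return current.toString();
    }
}
```

With this layout each new `SketchStemmer` only allocates its own small buffer. The rule table is read-only after class initialization, so sharing it across threads is safe, while the per-instance buffer is the reason each thread still needs its own stemmer.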


Migrated from LUCENE-4279 by Robert Muir (@rmuir), resolved Aug 01 2012

Attachments: LUCENE-4279.patch

asfimport commented 12 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Patch attached. There is no need to regenerate the stemmers from the website that aren't in the package, as they already work this way (Irish/Basque/Catalan/Armenian).

I also added a thread safety test (just checkRandomData against all the languages).