Closed Dissimilis closed 10 months ago
Thanks for the suggestion! Yeah, at the moment only Porter stemming is supported - the IStemmer
interface is internal because it hasn't yet been designed with extensibility in mind.
You raise an interesting point though; there are other stemming algorithms, not least so that words from languages other than English can be stemmed effectively.
It's definitely something to think about...
Custom stemming will be available in v6
Judging by the code `this.stemmer = new PorterStemmer();`, it looks like implementing and passing my own stemmer is impossible. It should be trivial to make API changes that allow assigning a custom stemmer in `TokenizationOptions`. But maybe `IStemmer` would need more thought on its design.

P.S. `this.stemmer = new PorterStemmer();` is a nice illustration of "new is glue" :)
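
To make the suggestion concrete, here is a minimal sketch of what passing a stemmer through the options could look like. Everything in it is an assumption for illustration: the `Stem(string)` signature, the `Stemmer` property, the `Tokenizer` class and the toy `TrailingSStemmer` are hypothetical stand-ins, not the library's actual types or its v6 API, and the `PorterStemmer` below is only a placeholder that does no real stemming.

```csharp
using System;

// Hypothetical stand-in: the library's real IStemmer is internal
// and its members may look different.
public interface IStemmer
{
    string Stem(string word);
}

// Placeholder for the built-in default. The actual Porter algorithm
// is not reproduced here; this version returns the word unchanged.
public sealed class PorterStemmer : IStemmer
{
    public string Stem(string word) => word;
}

// Toy custom stemmer, for illustration only: strips one trailing "s".
public sealed class TrailingSStemmer : IStemmer
{
    public string Stem(string word) =>
        word.Length > 3 && word.EndsWith("s", StringComparison.Ordinal)
            ? word.Substring(0, word.Length - 1)
            : word;
}

// Hypothetical options type: only the proposed Stemmer property is sketched.
public sealed class TokenizationOptions
{
    // Defaults to the built-in stemmer, but callers can replace it.
    public IStemmer Stemmer { get; set; } = new PorterStemmer();
}

// Hypothetical consumer: it receives its stemmer from the options
// instead of constructing one itself.
public sealed class Tokenizer
{
    private readonly IStemmer stemmer;

    public Tokenizer(TokenizationOptions options)
    {
        this.stemmer = options.Stemmer
            ?? throw new ArgumentNullException(nameof(options));
    }

    // Lowercases the word, then applies whichever stemmer was supplied.
    public string Process(string word) =>
        this.stemmer.Stem(word.ToLowerInvariant());
}

public static class Program
{
    public static void Main()
    {
        var options = new TokenizationOptions { Stemmer = new TrailingSStemmer() };
        var tokenizer = new Tokenizer(options);
        Console.WriteLine(tokenizer.Process("Documents")); // prints "document"
    }
}
```

The point of the sketch is only that the consumer no longer decides which implementation it gets: the default still exists, but it lives in the options where a caller can override it, for example with a stemmer for a language other than English.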