Closed. helt closed this issue 5 years ago.
We once had a StopwordAnnotator, but it seemed to be basically just a dictionary annotator, so we removed it. The StopwordRemover does not annotate stopwords; it purges Token annotations and related annotations from the document.
You could consider using the DictionaryAnnotator to annotate stopwords for your task.
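For the n-gram use case, the check itself is small once the stopwords are available as a set. Below is a minimal plain-Java sketch of that logic, not DKPro Core API: the class name StopwordNgramCheck, the tiny inline stopword list, and the choice to treat an empty n-gram as "not all stopwords" are assumptions made for illustration. In a real pipeline the tokens would come from the Token annotations covered by the n-gram.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of DKPro Core.
final class StopwordNgramCheck {

    // Illustrative stopword list; in practice this would be loaded from a file.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "of", "a", "an", "in"));

    // Case-insensitive lookup of a single token.
    static boolean isStopword(String token) {
        return STOPWORDS.contains(token.toLowerCase(Locale.ROOT));
    }

    // True only if the n-gram has at least one token and every token is a stopword.
    static boolean allStopwords(List<String> ngramTokens) {
        return !ngramTokens.isEmpty()
                && ngramTokens.stream().allMatch(StopwordNgramCheck::isStopword);
    }
}
```

With this list, an n-gram such as ["The", "of"] would count as all stopwords, while ["of", "Berlin"] would not.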
Does it feature that fancy language-dependent file loading mechanism?
e.g. [de]classpath:/stopwords/en_articles.txt
No, but the code of both components is pretty straightforward. It should be easy to port that functionality and we're always happy to accept contributions :)
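As a starting point for such a port, splitting a location of the form shown in the example above into its language prefix and the actual location could look roughly like this. It is a sketch under assumptions: the class LanguageLocation is hypothetical, the format is inferred only from the `[de]classpath:/...` example in this thread, and a location without a bracketed prefix is treated here as language-independent (language is null). How the StopwordRemover itself handles these cases should be checked against its source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class, not part of DKPro Core.
final class LanguageLocation {

    // Optional "[lang]" prefix followed by a non-empty location.
    private static final Pattern PREFIXED =
            Pattern.compile("^\\[([^\\]]*)\\](.+)$");

    final String language; // null when the spec carries no "[lang]" prefix
    final String location;

    private LanguageLocation(String language, String location) {
        this.language = language;
        this.location = location;
    }

    // Splits "[de]classpath:/x.txt" into language "de" and location "classpath:/x.txt".
    static LanguageLocation parse(String spec) {
        Matcher m = PREFIXED.matcher(spec);
        if (m.matches()) {
            return new LanguageLocation(m.group(1), m.group(2));
        }
        return new LanguageLocation(null, spec);
    }
}
```

A dictionary-style annotator could then load each location only when its language matches the document language, or always when no language is given.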
From the documentation of dkpro-core-stopwordremover-asl, it is not clear to me whether the stopword remover can just annotate tokens with a stopword annotation or whether it always removes those annotations. I have a use case where I need to be able to check whether all tokens under an n-gram are stopwords. As I read the documentation, I cannot achieve this with the stopword remover.