Open chrishokamp opened 9 years ago
By the way, we have an n-gram extractor: https://github.com/qe-team/marmot/blob/master/marmot/util/ngram_window_extractor.py Should I convert it to a feature extractor? It can also be used to extract the token itself if you set window_size to 0.
I think we should just wrap it in a feature extractor rather than convert it, since we already use it as a utility for other word-level feature extractors.
(Reply to varvara-l's comment of Wed, Feb 18, 2015, 7:05 PM.)
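To make the "wrap it" suggestion concrete, here is a rough sketch of what such a wrapper could look like. Everything in it is an assumption for illustration: `extract_window` is a hypothetical stand-in for the utility in `ngram_window_extractor.py` (the real function names and signatures may differ), and the `get_features` / `get_feature_names` interface and the dict-shaped `context` are guesses, not marmot's confirmed API.

```python
# Hypothetical stand-in for the utility in marmot/util/ngram_window_extractor.py;
# the real function names and signatures may differ.
def extract_window(tokens, index, window_size, pad="_PAD_"):
    """Return tokens from index - window_size to index + window_size, padded at the edges."""
    window = []
    for i in range(index - window_size, index + window_size + 1):
        window.append(tokens[i] if 0 <= i < len(tokens) else pad)
    return window


class NgramWindowFeatureExtractor:
    """Illustrative wrapper: calls the window utility from a feature-extractor-style class."""

    def __init__(self, window_size=0):
        self.window_size = window_size

    def get_features(self, context):
        # `context` is assumed to be a dict holding the sentence tokens
        # and the index of the token being labelled.
        return extract_window(context["tokens"], context["index"], self.window_size)

    def get_feature_names(self):
        # One name per window position, e.g. token-1, token+0, token+1.
        return ["token{:+d}".format(offset)
                for offset in range(-self.window_size, self.window_size + 1)]
```

With `window_size=0` the wrapper returns only the token itself, which is the case mentioned above. The utility stays a plain function, so other word-level extractors can keep calling it directly.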
We don't have a feature extractor for the word itself, or for stemmed representations, suffixes, prefixes, etc.
These should be easy to implement.
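As a starting point, the surface features listed above could be computed along these lines. This is only a sketch: the function name `affix_features`, the feature keys, and the `max_len` and `stemmer` parameters are all made up for illustration and are not part of marmot. Stemming is left as an optional callable so that any stemmer can be plugged in.

```python
def affix_features(token, max_len=3, stemmer=None):
    """Return the token, its prefixes and suffixes up to max_len, and optionally its stem."""
    features = {"token": token}
    for n in range(1, max_len + 1):
        # For tokens shorter than n characters, slicing returns the whole token.
        features["prefix_{}".format(n)] = token[:n]
        features["suffix_{}".format(n)] = token[-n:]
    if stemmer is not None:
        # `stemmer` is any callable mapping a word to its stem (hypothetical hook).
        features["stem"] = stemmer(token)
    return features
```

For example, `affix_features("running", max_len=2)` gives the token plus `prefix_1`, `prefix_2`, `suffix_1` and `suffix_2`. Whether these belong in one extractor or in separate ones per feature type is a design choice left open here.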