JeffreyYStewart closed this 2 years ago
The problem was that the code tried to modify the language's default stop words after the model instance had already been created, so the changes never took effect. I have changed the code to modify the terms on the model instance instead, which should fix the reported bug. You can also now supply a single stop word as a string, or set remove_stopwords=True to remove all stop words.
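For anyone reading later, here is a minimal sketch of how the three accepted forms of remove_stopwords (a single string, a collection of strings, or True) could be normalized before filtering. This is not the library's actual code: `resolve_stopwords`, `filter_tokens`, and `DEFAULT_STOP_WORDS` are hypothetical names, the default list is a tiny stand-in for the model's own stop words, and the lowercase matching is an assumption.

```python
from typing import Iterable, List, Set, Union

# Hypothetical stand-in for the stop words held by the model instance.
DEFAULT_STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to"})


def resolve_stopwords(
    remove_stopwords: Union[bool, str, Iterable[str], None],
    defaults: Iterable[str] = DEFAULT_STOP_WORDS,
) -> Set[str]:
    """Return the lowercase set of words to drop for a given argument value."""
    if remove_stopwords is None or remove_stopwords is False:
        return set()
    if remove_stopwords is True:
        # True means "use every default stop word".
        return {w.lower() for w in defaults}
    if isinstance(remove_stopwords, str):
        # Check str before the generic iterable case, otherwise a single
        # word would be split into its individual characters.
        return {remove_stopwords.lower()}
    return {w.lower() for w in remove_stopwords}


def filter_tokens(
    tokens: Iterable[str],
    remove_stopwords: Union[bool, str, Iterable[str], None] = None,
) -> List[str]:
    """Drop tokens whose lowercase form is in the resolved stop word set."""
    stop = resolve_stopwords(remove_stopwords)
    return [t for t in tokens if t.lower() not in stop]
```

With this sketch, `filter_tokens(["The", "cat", "sat"], "cat")` drops only "cat", while passing `True` drops "The" because it matches the stand-in default list.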
In tokenizer.makedoc, the stop words passed via remove_stopwords are not actually removed.