Status: Open. miso-belica opened this issue 11 years ago.
How can I use spaCy's tokeniser models? Is there detailed documentation for the package?
spaCy is worse: it pulls in the huge TensorFlow.
How to add support for a custom language or tokenizer library: https://miso-belica.github.io/sumy/how-to-add-new-language.html
I use NLTK to tokenize text into sentences and words, but that's a big package. Maybe something smaller would be better, something like https://bitbucket.org/trebor74hr/text-sentence/overview
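
For anyone wondering what "something smaller" could look like, here is a rough sketch of a dependency-free tokenizer built only on the standard library `re` module. The class name `RegexTokenizer` and both regular expressions are my own assumptions, not part of sumy, NLTK, or text-sentence. The two method names, `to_sentences` and `to_words`, follow the interface that the how-to-add-new-language page appears to describe, so please check that page before relying on it.

```python
import re


class RegexTokenizer:
    """Hypothetical lightweight tokenizer; not part of sumy or NLTK."""

    # Split on whitespace that follows '.', '!' or '?' and precedes an
    # uppercase ASCII letter, a digit, or an opening quote.
    _SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9\"'])")

    # A word is a run of letters, optionally joined by apostrophes or hyphens.
    _WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")

    def to_sentences(self, text):
        """Return the sentences of `text` as a tuple of strings."""
        parts = self._SENTENCE_BOUNDARY.split(text.strip())
        return tuple(part.strip() for part in parts if part.strip())

    def to_words(self, sentence):
        """Return the words of `sentence` as a tuple of strings."""
        return tuple(self._WORD.findall(sentence))


if __name__ == "__main__":
    tokenizer = RegexTokenizer()
    for sentence in tokenizer.to_sentences("I use NLTK. It is big! Is it?"):
        print(sentence, tokenizer.to_words(sentence))
```

This is far cruder than NLTK's Punkt models: abbreviations such as "Dr. Smith" get split into two sentences, and only ASCII capitals mark a sentence start. If the interface assumption holds, an instance could be passed wherever sumy expects a tokenizer object, for example as the second argument of `PlaintextParser.from_string`.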