fnl / segtok

Segtok v2 is available as syntok: https://github.com/fnl/syntok. Segtok is a rule-based sentence segmenter (splitter) and word tokenizer that uses orthographic features.
http://fnl.es/segtok-a-segmentation-and-tokenization-library.html
MIT License
170 stars, 22 forks