rth / vtext

Simple NLP in Rust with Python bindings
Apache License 2.0

Add Regexp tokenizer #18

Closed rth closed 5 years ago

rth commented 5 years ago

This adds a regexp tokenizer. Regexp-based tokenization is, for instance, the default in scikit-learn's text vectorizers.
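
For context, here is a minimal sketch of what regexp tokenization does, written with Python's standard `re` module rather than the vtext API (the function name `regexp_tokenize` is hypothetical and only for illustration). The pattern is the default `token_pattern` of scikit-learn's `CountVectorizer`, which keeps runs of two or more word characters:

```python
import re

# Default `token_pattern` of scikit-learn's CountVectorizer:
# tokens are runs of 2 or more word characters between word boundaries.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def regexp_tokenize(text, pattern=TOKEN_PATTERN):
    """Hypothetical helper (not part of vtext): return every
    non-overlapping match of `pattern` in `text`, in order."""
    return pattern.findall(text)


tokens = regexp_tokenize("Today, a tokenizer was added to vtext!")
print(tokens)
# ['Today', 'tokenizer', 'was', 'added', 'to', 'vtext']
```

Note that punctuation and single-character tokens such as "a" are dropped with this pattern; a different pattern would give different token boundaries.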