coccoc / coccoc-tokenizer

High-performance tokenizer for the Vietnamese language
GNU Lesser General Public License v3.0

create python bindings #4

Closed: txdat closed this 5 years ago

txdat commented 5 years ago

Use a Cython module to wrap the C++ code.

Add segment_general (based on segment_original) to tokenizer/tokenizer.hpp for: