coccoc / coccoc-tokenizer

high performance tokenizer for Vietnamese language
GNU Lesser General Public License v3.0

What options are available for `tokenize_option` in the Python binding? #26

Open ttpro1995 opened 1 year ago

ttpro1995 commented 1 year ago
print(T.word_tokenize("xin chào, tôi là người Việt Nam", tokenize_option=0))

Which values can I pass as `tokenize_option`?

anhducle98 commented 1 year ago

There are 3 options, as shown in: https://github.com/coccoc/coccoc-tokenizer/blob/master/tokenizer/tokenizer.hpp#L20
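
For readers who want named constants on the Python side, here is a minimal sketch. It assumes the three options in the linked header are the integers 0, 1 and 2, named `TOKENIZE_NORMAL`, `TOKENIZE_HOST` and `TOKENIZE_URL`; please verify against `tokenizer.hpp` before relying on it. `TokenizeOption` and `tokenize` are hypothetical helpers, not part of the binding; only `word_tokenize(..., tokenize_option=...)` comes from the question above.

```python
from enum import IntEnum


class TokenizeOption(IntEnum):
    """Hypothetical Python mirror of the options in tokenizer.hpp (values assumed)."""

    NORMAL = 0  # assumed to correspond to TOKENIZE_NORMAL
    HOST = 1    # assumed to correspond to TOKENIZE_HOST
    URL = 2     # assumed to correspond to TOKENIZE_URL


def tokenize(tokenizer, text, option=TokenizeOption.NORMAL):
    """Validate the option, then forward it as a plain int to word_tokenize."""
    # TokenizeOption(...) raises ValueError for anything outside 0, 1, 2.
    option = TokenizeOption(option)
    return tokenizer.word_tokenize(text, tokenize_option=int(option))
```

With `T` constructed as in the question, this would be called as `tokenize(T, "xin chào, tôi là người Việt Nam", TokenizeOption.NORMAL)`.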