KienPM closed 4 years ago
The Python bindings code doesn't define the needed constants at the moment, so a quick solution for -u would be:
from CocCocTokenizer import PyTokenizer
t = PyTokenizer()
res = t.word_tokenize(text, 2)
And for -n the quick solution would be:
from CocCocTokenizer import PyTokenizer
t = PyTokenizer(False)
res = t.word_tokenize(text)
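If you'd rather not scatter the magic values around your code, the two workarounds above could be wrapped roughly like this. This is only a sketch: the constant and function names are made up for illustration and are not part of the bindings; only the values 2 and False come from the snippets above. The tokenizer class is passed in as an argument so the helpers themselves don't depend on the import.

```python
# Hypothetical names, NOT defined by the CocCocTokenizer bindings.
# Only the values (2 and False) are taken from the workaround above.
URL_MODE = 2              # second argument to word_tokenize, stands in for -u
NO_STICKY_FLAG = False    # constructor argument, stands in for -n


def tokenize_like_u(tokenizer_cls, text):
    """Call word_tokenize(text, 2) on a default-constructed tokenizer (the -u workaround)."""
    tokenizer = tokenizer_cls()
    return tokenizer.word_tokenize(text, URL_MODE)


def tokenize_like_n(tokenizer_cls, text):
    """Construct the tokenizer with False and tokenize normally (the -n workaround)."""
    tokenizer = tokenizer_cls(NO_STICKY_FLAG)
    return tokenizer.word_tokenize(text)


# Intended usage, assuming the bindings are installed:
#   from CocCocTokenizer import PyTokenizer
#   res_u = tokenize_like_u(PyTokenizer, text)
#   res_n = tokenize_like_n(PyTokenizer, text)
```

Each call here builds a new tokenizer, which keeps the sketch short; in real code you would probably create the PyTokenizer once and reuse it.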
For a more proper solution, feel free to fix the Python extension code (the .pyx file) and send us a pull request; we will happily review and include it.
Thanks for your answer!
This is a great project! Could you please provide documentation on how to use options such as -t, -u... when using the Python binding? Thank you so much!