Closed Ayushk4 closed 5 years ago
Can you check this against the original toktok, or against the NLTK toktok? Either way I am in favor, but if we are deviating we should document that this is an enhanced version of the toktok tokenizer.
NLTK's toktok gives the following output for toktok.tokenize("This is a sentence. "):
['This', 'is', 'a', 'sentence.']
Ok cool. Let's add to the docstring that this is an enhanced version of the original toktok tokenizer.
Before -
Now -
Also made minor changes to the handle_final_periods function so that it does not re-traverse the trailing spaces at the end of the string.
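
For readers following along, here is a rough Python sketch of the idea: find the end of the real content with a single backward scan, then split off a final period even when whitespace follows it. This is not the code from this PR. The name `split_final_period` is hypothetical, and the guard that leaves a run of periods (such as an ellipsis) untouched is an assumption about the intended behavior.

```python
def split_final_period(text: str) -> str:
    """Separate a sentence-final period from the preceding word.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # Scan backwards once to skip trailing whitespace, so that the
    # trailing spaces are never traversed a second time.
    end = len(text)
    while end > 0 and text[end - 1].isspace():
        end -= 1

    # Nothing to do if the last non-space character is not a period.
    if end == 0 or text[end - 1] != ".":
        return text

    # Assumption: leave runs of periods (e.g. "...") untouched.
    if end >= 2 and text[end - 2] == ".":
        return text

    # Insert a space before the final period; keep the trailing whitespace.
    return text[: end - 1] + " ." + text[end:]


tokens = split_final_period("This is a sentence. ").split()
print(tokens)  # ['This', 'is', 'a', 'sentence', '.']
```

With this handling, the trailing space in the input no longer hides the final period, which is the difference from the NLTK output quoted above.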