klockeph closed 6 years ago
Can you validate it on python3.x as well?
A quick test showed that Python 3 currently has no problems. Adding the u prefix does not break anything there either, at least in the (rather small) tests I just ran.
@fabianvf please merge PR #29 and then merge this on top of it
@fabianvf yo?
The regex string is not a Unicode string, so the \u... escape sequence is not interpreted as a code point and behaves unexpectedly. Just try split_sentences("restaurant"): it returns ["resta", "rant"], which is obviously wrong, because the string gets split at the letter "u".
Adding a u prefix to the regex string forces Python to interpret it as Unicode, which fixes the issue.
Tested with Python 2.7.
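For anyone who wants to reproduce the effect without a Python 2 interpreter, here is a minimal sketch. The patterns and the `split_on` helper are hypothetical and for illustration only; they are not the library's actual regex. It assumes that Python 2's `re` treats the unknown escape `\u` in a byte-string pattern as a literal "u", which matches the behaviour described above.

```python
import re

# Hypothetical pattern, not the library's real one.
# Roughly what Python 2's `re` sees for the byte string '[\u3002]':
# the unknown escape `\u` becomes a literal 'u', followed by the
# literal digits '3', '0', '0', '2'.
BROKEN = re.compile('[u3002]')

# With the u prefix, the string literal itself decodes \u3002 into a
# single character (IDEOGRAPHIC FULL STOP), so the class holds only that.
FIXED = re.compile(u'[\u3002]')


def split_on(pattern, text):
    """Split text on the pattern and drop empty pieces (hypothetical helper)."""
    return [piece for piece in pattern.split(text) if piece]


print(split_on(BROKEN, u'restaurant'))  # splits at the letter 'u'
print(split_on(FIXED, u'restaurant'))   # left in one piece
```

The broken class also matches the digits 0, 2 and 3, so any sentence containing those would be split as well.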