Closed andgineer closed 2 months ago
Maybe it's better to go back to naive tokenizing by spaces; at least it's bulletproof.
Also, we do not pass the book language to the words extractor. That does not matter for the naive approach, but it should be done for NLTK.
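A rough sketch of how the two options could sit side by side, not the project's actual code: `naive_words` and `extract_words` are hypothetical names, and the `language` argument is assumed to be a full NLTK language name such as "english" (a book's ISO code would need mapping first). If NLTK or its tokenizer data is missing, it falls back to the naive split.

```python
import string


def naive_words(text):
    """Split on whitespace and strip punctuation from both ends of each chunk."""
    words = []
    for chunk in text.split():
        word = chunk.strip(string.punctuation)
        if word:
            words.append(word)
    return words


def extract_words(text, language="english"):
    """Tokenize with NLTK for the given language, else fall back to naive_words.

    `language` is assumed to be an NLTK language name, not an ISO code.
    """
    try:
        from nltk.tokenize import word_tokenize
        tokens = word_tokenize(text, language=language)
    except (ImportError, LookupError):
        # NLTK not installed, or tokenizer data for this language not downloaded.
        return naive_words(text)
    # Keep only tokens that contain at least one letter (drops bare punctuation).
    return [t for t in tokens if any(ch.isalpha() for ch in t)]
```

Something like `extract_words(book_text, language="german")` would then use the book's language when NLTK can handle it, and the naive path stays available as the safe default.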