andgineer / lexiflux

AI-powered foreign text reader for language learners (Django)

NLTK tokenizer skips whole paragraphs #54

Closed andgineer closed 2 months ago

andgineer commented 2 months ago

Maybe it's better to return to naive tokenizing by spaces; at least it's bulletproof.
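For illustration, a minimal sketch of what naive whitespace tokenizing could look like. The function name `naive_word_spans` is hypothetical and not taken from the lexiflux code base. It returns character offsets, so every non-whitespace run in every paragraph is covered:

```python
import re
from typing import List, Tuple


def naive_word_spans(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every whitespace-delimited run.

    Hypothetical helper: punctuation stays attached to the word,
    but no part of the text can be skipped.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


text = "First paragraph.\n\nSecond one, with a comma."
spans = naive_word_spans(text)
words = [text[start:end] for start, end in spans]
```

The trade-off is that punctuation sticks to words (`"paragraph."`, `"one,"`), so it would need to be stripped separately before any dictionary lookup.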

andgineer commented 2 months ago

Also, we do not pass the book's language to the word extractor. That does not matter for the naive approach, but it should be done for NLTK.
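A rough sketch of how the language could be passed through, assuming the book language is stored as a short code such as `"en"`. The `NLTK_LANGUAGES` mapping and the `extract_words` name are hypothetical. `word_tokenize` does accept a `language` argument naming the Punkt model, and the sketch falls back to splitting on whitespace when NLTK or its data is unavailable:

```python
from typing import List

# Hypothetical mapping from book language codes to NLTK Punkt model names.
NLTK_LANGUAGES = {"en": "english", "de": "german", "fr": "french", "es": "spanish"}


def extract_words(text: str, lang_code: str = "en") -> List[str]:
    """Tokenize with NLTK for a known language, else split on whitespace."""
    language = NLTK_LANGUAGES.get(lang_code)
    if language is not None:
        try:
            from nltk.tokenize import word_tokenize

            return word_tokenize(text, language=language)
        except (ImportError, LookupError):
            # NLTK is not installed or the Punkt data is missing.
            pass
    # Naive fallback: never skips text, but keeps punctuation attached.
    return text.split()
```

Usage would be along the lines of `extract_words(page_text, book_lang)`, where both argument names are placeholders.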