andgineer / lexiflux

AI-powered foreign text reader for language learners (Django)

importing war_and_peace.ru.txt bugs #8

Closed andgineer closed 11 months ago

andgineer commented 11 months ago

1) It passed the tests, but when I tried it on Tolstoy, where the Russian text contains a lot of French, it took 10 fragments to decide that the text is Russian, even though only one French fragment was detected. It should stop after as few as 3 fragments.

DEBUG:root:Detected language: ru
DEBUG:root:Detected language: fr
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
DEBUG:root:Detected language: ru
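
A minimal sketch of the early stop requested above, not lexiflux's actual code: it counts one vote per fragment and returns as soon as at least `min_fragments` have been checked and one language holds a strict majority. The function name, the `detect` callable and both thresholds are hypothetical; any per-fragment detector that returns a language code can be plugged in.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple


def detect_language_early_stop(
    fragments: Iterable[str],
    detect: Callable[[str], str],  # hypothetical per-fragment detector
    min_fragments: int = 3,        # assumed threshold, taken from the issue text
    max_fragments: int = 10,       # assumed upper bound on fragments to inspect
) -> Tuple[Optional[str], int]:
    """Return (language, fragments_checked), stopping at a strict majority."""
    votes = Counter()
    checked = 0
    for fragment in fragments:
        if checked >= max_fragments:
            break
        votes[detect(fragment)] += 1
        checked += 1
        if checked >= min_fragments:
            language, count = votes.most_common(1)[0]
            # Strict majority: the leader has more than half of all votes so far.
            if count * 2 > checked:
                return language, checked
    if not votes:
        return None, checked
    # No majority within the limit: fall back to the most frequent language.
    return votes.most_common(1)[0][0], checked


# The fragments here are just the labels from the log above,
# and the stand-in detector returns them unchanged.
labels = ["ru", "fr", "ru", "ru", "ru"]
print(detect_language_early_stop(labels, detect=lambda label: label))
```

With the sequence from the log (ru, fr, ru, ...), this stops after the third fragment, because two of the three votes are already for Russian.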

2) It did not detect chapters. Partly this is because they are named "Часть" ("Part"), but there are also Roman-numbered subchapters. Why were they ignored?
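
One possible way to catch both kinds of heading is a line-based regular expression. This is only a sketch under assumptions, not the importer's real logic: the keyword list, the names `ROMAN`, `HEADING_RE` and `find_headings`, and the rule that a heading occupies a whole line are all hypothetical. A short ordinary line such as "MIX" would also match, so treat it as a heuristic.

```python
import re
from typing import List, Tuple

# Standard-form Roman numerals up to 3999; the lookahead used below
# prevents this pattern from matching an empty string.
ROMAN = r"M{0,3}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})"

# A heading is assumed to be a line holding either a keyword plus one label
# ("ЧАСТЬ ПЕРВАЯ", "Chapter 1") or a bare uppercase Roman numeral ("XIV.").
HEADING_RE = re.compile(
    r"^\s*(?:"
    r"(?i:часть|глава|part|chapter)\s+\w+\.?"   # keyword, case-insensitive
    r"|(?=[MDCLXVI])" + ROMAN + r"\.?"          # bare Roman numeral
    r")\s*$"
)


def find_headings(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, heading) pairs for lines that look like headings."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if HEADING_RE.match(line)
    ]


# Illustrative sample, not the real file contents.
sample = "ЧАСТЬ ПЕРВАЯ\n\nI\n\nEh bien, mon prince.\n\nII\n\nОбычный абзац текста.\n\nXIV.\n"
print(find_headings(sample))
```

Keeping the case-insensitive flag scoped to the keywords means lowercase words such as "mix" or "i" are not mistaken for numerals.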