Open EvGe22 opened 5 months ago
Given a text in Ukrainian, two methods provide two completely different results.
    from lingua import LanguageDetectorBuilder

    detector = LanguageDetectorBuilder.from_all_languages().build()
    string = "Що найбільше подобається читачам у жанрі \"Фентезі\"?"

    # Confidence values computed over the whole text
    print(detector.compute_language_confidence_values(string))
    # >>> [ConfidenceValue(language=Language.KAZAKH, value=1), ConfidenceValue(language=Language.AFRIKAANS, value=0), ConfidenceValue(language=Language.ALBANIAN, value=0), ...]

    # Per-segment detection for mixed-language text
    print(detector.detect_multiple_languages_of(string))
    # >>> [DetectionResult(start_index=0, end_index=51, word_count=7, language=Language.UKRAINIAN)]
The two methods use different algorithms, so their results can differ. I will try to improve them with each new release.
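Until the two methods converge, one possible workaround is to cross-check them and flag texts where they disagree. The following is only a sketch, not part of the library: the helper names (`top_confidence_language`, `dominant_segment_language`, `methods_agree`) are hypothetical, and the only thing it assumes about the library is that the results expose the `language`, `value`, and `word_count` attributes shown in the output above. To keep it runnable without the library installed, it uses stand-in objects with plain strings in place of `Language` members.

```python
from collections import Counter
from types import SimpleNamespace


def top_confidence_language(confidence_values):
    """Return the language with the highest confidence value, or None if empty."""
    if not confidence_values:
        return None
    return max(confidence_values, key=lambda cv: cv.value).language


def dominant_segment_language(detection_results):
    """Return the language that covers the most words across all segments, or None."""
    words_per_language = Counter()
    for result in detection_results:
        words_per_language[result.language] += result.word_count
    if not words_per_language:
        return None
    return words_per_language.most_common(1)[0][0]


def methods_agree(confidence_values, detection_results):
    """True only if both methods produced a result and point to the same language."""
    by_confidence = top_confidence_language(confidence_values)
    by_segments = dominant_segment_language(detection_results)
    return by_confidence is not None and by_confidence == by_segments


# Stand-ins mirroring the output reported in this issue (strings instead of
# Language enum members; these are NOT real library objects).
confidences = [
    SimpleNamespace(language="KAZAKH", value=1),
    SimpleNamespace(language="AFRIKAANS", value=0),
    SimpleNamespace(language="ALBANIAN", value=0),
]
segments = [
    SimpleNamespace(start_index=0, end_index=51, word_count=7, language="UKRAINIAN"),
]

print(methods_agree(confidences, segments))
```

With a real detector, the same helpers should accept the return values of `compute_language_confidence_values(string)` and `detect_multiple_languages_of(string)` directly, assuming those attributes are available as in the printed output. A disagreement does not say which method is right; it only marks the text for a closer look.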