Open JaViLuMa opened 4 years ago
Hello. I had a task to detect languages for certain sentences.
Let's say we have this sentence: ZANIMA ME CENA PREMIUM HIŠIC, BLIZU MORJA, IMAMO TUDI PSA. This is the output:
But if I convert it to sentence case (Zanima me cena hišic, blizu morja, imamo tudi psa.), the output is MUCH different:
I know this issue is easy to fix, but I don't think this behavior is, or ever was, intended.
Has anyone done anything better than: detect(TEXT_with_Capital_Letters.lower())?
I think lowercasing the string before feeding it into the algorithm will almost never degrade accuracy.
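For what it's worth, here is a minimal sketch of the lowercasing workaround discussed above. It assumes `detect` is whatever detection function you already call; it is passed in as a parameter so the snippet does not depend on any particular library. The names `normalize_case` and `detect_lowercased` are made up for illustration and are not part of any library.

```python
from typing import Callable


def normalize_case(text: str) -> str:
    """Lowercase the input so all-caps text looks like ordinary text.

    str.lower() is Unicode-aware, so characters such as 'Š' become 'š'.
    """
    return text.lower()


def detect_lowercased(text: str, detect: Callable[[str], str]) -> str:
    """Run a detector on the lowercased text.

    `detect` is an assumption: any callable that takes a string and
    returns a language code.
    """
    return detect(normalize_case(text))


# Usage (hypothetical): pass in the detect function you already use.
# lang = detect_lowercased("ZANIMA ME CENA PREMIUM HIŠIC, BLIZU MORJA, IMAMO TUDI PSA.", detect)
```

Lowercasing everything is the simplest option. It may cost a little signal in languages where capitalization is informative, so treat it as a workaround and not a fix.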