klemens-u closed 6 years ago
As stated in the README, uchardet has moved. This has not been the official repository for at least 2 years now. uchardet is now a Freedesktop project. Please open reports there: https://gitlab.freedesktop.org/uchardet/uchardet/issues
That being said, uchardet works statistically. It is basically impossible to detect a "language" or a charset from a single character; it could be just anything. At the byte level, a single character is only a few bytes, so the result can only be random (if it did return the right encoding, that would be the strange part!). So yes, uchardet needs a sentence, or at the very least several words, to have enough data to guess the right encoding.
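To illustrate why a single character carries no usable signal, here is a minimal Python sketch. It does not reproduce uchardet's internals; it only assumes the shell was UTF-8, so that `echo -n ä` sent the two bytes `C3 A4`, and it uses a hand-picked list of candidate encodings from Python's codec registry. The helper name `plausible_decodings` is made up for this example.

```python
# Assumption: the terminal was UTF-8, so `echo -n ä` produced the bytes C3 A4.
data = "\u00e4".encode("utf-8")  # "ä"

# Hand-picked candidates, named as in Python's codec registry (not uchardet's list).
candidates = ["utf-8", "tis_620", "iso8859_7", "iso8859_1", "cp1252"]


def plausible_decodings(raw, encodings):
    """Return {encoding: text} for every encoding that decodes `raw` without error."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding, so skip it.
            continue
    return results


decoded = plausible_decodings(data, candidates)
for name, text in decoded.items():
    # ascii() escapes non-ASCII characters so the output is safe on any console.
    print(f"{name}: {ascii(text)}")
```

Several of these encodings accept the same two bytes as valid text (in TIS-620, for instance, they are two Thai letters). With so little input there is nothing to weigh one reading against another, which is consistent with the seemingly arbitrary answers in the report.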
So I will close this report. Feel free to open a new one on the Freedesktop GitLab, if relevant, keeping in mind that uchardet will never be able to detect the encoding of a single character (and technically no system ever will, except by chance).
Test in shell:
echo -n ä | uchardet -> TIS-620
echo -n ö | uchardet -> TIS-620
echo -n ü | uchardet -> ISO-8859-7
Upper case works OK (Ä, Ö, Ü), and so does ß.
System: Ubuntu 16.04