CharsetDetector / UTF-unknown

Character set detector built in C# - .NET 5+, .NET Core 2+, .NET Standard 1+ & .NET 4+

Wrong detection of Win-1251 encoding #156

Open PavelFischerCoupa opened 2 years ago

PavelFischerCoupa commented 2 years ago

Hello,

We have found an issue. The library returns the wrong encoding when the text is encoded in Win-1251 and is entirely in upper case. If the same text is in lower case, detection works perfectly.

Here are the files: correct.txt and wrong.txt.
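
A minimal reproduction could look like the sketch below. It assumes the two attachments are saved next to the executable under their original names, and that presumably correct.txt holds the lower-case text and wrong.txt the upper-case one. It uses `CharsetDetector.DetectFromFile` and the `Detected` property as shown in the library's README. The `Repro` class name is only illustrative.

```csharp
using System;
using UtfUnknown;

class Repro
{
    static void Main()
    {
        // Assumed local copies of the two attachments from this issue.
        foreach (var path in new[] { "correct.txt", "wrong.txt" })
        {
            DetectionResult result = CharsetDetector.DetectFromFile(path);

            // Detected is null when the detector cannot settle on an encoding.
            DetectionDetail best = result.Detected;

            Console.WriteLine(best == null
                ? $"{path}: no encoding detected"
                : $"{path}: {best.EncodingName} (confidence {best.Confidence})");
        }
    }
}
```

Both files are expected to be reported as windows-1251. With the behaviour described above, only the lower-case file would be.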