CharsetDetector / UTF-unknown

Character set detector built in C# - .NET 5+, .NET Core 2+, .NET Standard 1+ & .NET 4+

Detect encoding from string? #166

Open foxi69 opened 7 months ago

foxi69 commented 7 months ago

Can I get the proper encoding from text like this? "Español"

304NotModified commented 7 months ago

Well, the encoding of a string is always UTF-16.

.NET uses UTF-16 to encode the text in a string. A char instance represents a 16-bit code unit.

From https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction
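
To illustrate the point, here is a minimal sketch (the variable names are only illustrative) showing that a .NET string is a sequence of UTF-16 code units, and that another encoding only comes into play once the string is converted to bytes:

```csharp
using System;
using System.Text;

// "Español", with the ñ written as an escape so the source file's own encoding does not matter.
string text = "Espa\u00F1ol";

// Length counts UTF-16 code units (char values), not bytes.
int codeUnits = text.Length;

// An encoding is only chosen when the string is turned into bytes.
int utf16Bytes = Encoding.Unicode.GetByteCount(text); // 2 bytes per code unit
int utf8Bytes = Encoding.UTF8.GetByteCount(text);     // 1 byte per ASCII letter, 2 for ñ

Console.WriteLine($"{codeUnits} code units, {utf16Bytes} UTF-16 bytes, {utf8Bytes} UTF-8 bytes");
```

A charset detector therefore needs the original bytes as input. Once the text is already a string, the information about the source encoding is gone.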

foxi69 commented 7 months ago

Okay, and can I convert this to the correct character set? It could be "Español"

Btw, it came from the Xabe.FFmpeg SubtitleStream.Title property.
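
One possible way to repair such a string, sketched under an assumption the thread does not confirm: the title was originally UTF-8 and was decoded as ISO-8859-1 (Latin-1) somewhere along the way. `RepairMojibake` is a hypothetical helper, not part of UTF-unknown or Xabe.FFmpeg. It re-encodes the string to the bytes it presumably came from and decodes those bytes as UTF-8:

```csharp
using System;
using System.Text;

// Hypothetical helper: assumes the text is UTF-8 that was wrongly decoded as ISO-8859-1.
static string RepairMojibake(string text)
{
    // Strict encodings throw instead of silently substituting '?' or U+FFFD.
    Encoding latin1 = Encoding.GetEncoding(
        "iso-8859-1",
        EncoderFallback.ExceptionFallback,
        DecoderFallback.ExceptionFallback);
    Encoding strictUtf8 = new UTF8Encoding(
        encoderShouldEmitUTF8Identifier: false,
        throwOnInvalidBytes: true);

    try
    {
        // Recover the presumed original bytes, then decode them as UTF-8.
        byte[] originalBytes = latin1.GetBytes(text);
        return strictUtf8.GetString(originalBytes);
    }
    catch (ArgumentException)
    {
        // Both fallback exceptions derive from ArgumentException:
        // the assumption does not hold for this input, so leave it unchanged.
        return text;
    }
}

Console.WriteLine(RepairMojibake("Español"));
```

If the wrong decoding step used Windows-1252 rather than ISO-8859-1, characters in the 0x80 to 0x9F range map differently and the sketch would need that code page instead (on .NET Core and .NET 5+ this requires registering the provider from the System.Text.Encoding.CodePages package). Where possible, fixing the decoding at the point where the bytes are first read is more reliable than repairing the string afterwards.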