clsid2 closed this 11 months ago
It's failing the UTF-8 validity check.
You can check it with the isutf8 command-line tool:
isutf8 libass_failing_sample.srt
libass_failing_sample.srt: line 3, char 24, byte 57: After a first byte between C2 and DF, expecting a 2nd byte between 80 and BF
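The same kind of check can be sketched in a few lines of Python. This is only an illustration of strict UTF-8 validation, not what isutf8 or MPC-HC actually run; the function name first_invalid_utf8_offset is made up for this example, and it reports a 0-based byte offset rather than isutf8's line/char/byte triple.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the 0-based byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")  # strict decoding raises on malformed input
    except UnicodeDecodeError as err:
        return err.start
    return None


# 0xC2 opens a two-byte sequence, so the next byte must be in 0x80..0xBF.
good = b"\xc2\xa9 2024"  # C2 A9 is a valid two-byte sequence
bad = b"ab\xc2 cd"       # C2 followed by 0x20 (space) is malformed

print(first_invalid_utf8_offset(good))
print(first_invalid_utf8_offset(bad))
```

A file that was written in a legacy codepage but is read as UTF-8 typically fails in exactly this way, at the first non-ASCII byte.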
libass_failing_sample_utf8bom_bomremoved.srt.txt
If I resave the BOM version without the BOM, it becomes valid, but its bytes look nothing like those of the UTF-8 one.
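To compare two such files byte for byte, it helps to drop the BOM first so that only the payload differs. A minimal sketch, assuming the files are read as raw bytes; strip_utf8_bom is a hypothetical helper name, not something from MPC-HC or libass.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + "1\n00:00:01,000 --> 00:00:02,000\n".encode("utf-8")
without_bom = strip_utf8_bom(with_bom)

# After stripping, the remaining bytes should match a BOM-less UTF-8 save exactly.
print(without_bom == "1\n00:00:01,000 --> 00:00:02,000\n".encode("utf-8"))
```

If the stripped bytes still differ from the known-good UTF-8 file, the difference comes from the conversion step and not from the BOM.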
I see the problem now.
https://github.com/clsid2/mpc-hc/pull/2303
It was using the charset instead of the codepage.
Incidentally, HANGEUL_CHARSET and HANGUL_CHARSET are the same charset. I'm not sure which name is preferred, but listing both is unnecessary. I also wonder about OEM, SYMBOL and MAC. These three have no obvious codepage to convert to, but I doubt they are useful anyway.
There are many, many codepages out there. I doubt it, but I wonder if anyone needs a codepage that doesn't correspond to one of these charsets.
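The charset-versus-codepage mix-up can be illustrated with a small lookup table. This is a sketch under assumptions, not the code from the pull request: the charset values are the GDI constants from wingdi.h, the function names are invented for this example, and Python's cpNNNN codecs stand in for the Windows conversion routines.

```python
# GDI charset identifiers (values from wingdi.h) mapped to Windows codepages.
CHARSET_TO_CODEPAGE = {
    0: 1252,    # ANSI_CHARSET
    128: 932,   # SHIFTJIS_CHARSET
    129: 949,   # HANGUL_CHARSET (HANGEUL_CHARSET has the same value)
    130: 1361,  # JOHAB_CHARSET
    134: 936,   # GB2312_CHARSET
    136: 950,   # CHINESEBIG5_CHARSET
    161: 1253,  # GREEK_CHARSET
    162: 1254,  # TURKISH_CHARSET
    163: 1258,  # VIETNAMESE_CHARSET
    177: 1255,  # HEBREW_CHARSET
    178: 1256,  # ARABIC_CHARSET
    186: 1257,  # BALTIC_CHARSET
    204: 1251,  # RUSSIAN_CHARSET
    222: 874,   # THAI_CHARSET
    238: 1250,  # EASTEUROPE_CHARSET
}
# SYMBOL_CHARSET (2), MAC_CHARSET (77) and OEM_CHARSET (255) are deliberately
# left out here because they have no obvious fixed codepage.


def charset_to_codepage(charset: int):
    """Return the codepage for a GDI charset value, or None if there is no mapping."""
    return CHARSET_TO_CODEPAGE.get(charset)


def convert_to_utf8(data: bytes, charset: int) -> bytes:
    """Decode using the codepage that belongs to the charset, then re-encode as UTF-8."""
    codepage = charset_to_codepage(charset)
    if codepage is None:
        raise ValueError(f"no codepage known for charset {charset}")
    return data.decode(f"cp{codepage}").encode("utf-8")


korean = "한글".encode("cp949")
print(convert_to_utf8(korean, 129) == "한글".encode("utf-8"))
```

Passing the charset value itself (for example 129) where a codepage (949) is expected selects the wrong conversion or none at all, which matches the kind of invalid output described above.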
Yeah, I think those ones are not really needed or used.
@adipose The problem is with ConvertCPToUTF8.
libass_failing_sample.zip
I limited the use of libass to UTF-8 in my last commit, so undo that check while testing.