Open WonderRat opened 8 years ago
Have you tried switching your terminal to a Truetype font (e.g. Lucida Console, or Consolas)? That should correct the output in the ID3v2 case.
Information contained in ID3v1 is treated by id3 as encoded in ISO-8859-1 (a subset of cp1252).
ISO-8859-1 and cp1252 don't have Cyrillic letters, so Russian strings in ID3v1 are written in cp1251 (yes, these are old MP3s, but they still turn up). Why not treat ID3v1 as ANSI? English users will not suffer from that: their ANSI code page will be 1252. WinAPI has the alias CP_ACP (0x0) for that; the real code page depends on the locale settings. https://msdn.microsoft.com/en-us/library/dd374130%28v=vs.85%29.aspx My players and tag editors treat them as ANSI.
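To illustrate the difference, here is a minimal, portable sketch of what happens to the same ID3v1 bytes when they are decoded as ISO-8859-1 versus cp1251. The helper names (`latin1_to_utf8`, `cp1251_to_utf8`, `append_utf8`) are hypothetical and not taken from id3's source, and the cp1251 mapping covers only the Cyrillic letters. On Windows the real conversion would more likely go through `MultiByteToWideChar` with `CP_ACP`; that is an assumption about a possible fix, not a description of how id3 works.

```cpp
#include <cassert>
#include <string>

// Append one Unicode code point (BMP only) to a UTF-8 string.
static void append_utf8(std::string& out, unsigned cp) {
    assert(cp < 0x10000);  // this sketch never produces code points above the BMP
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: ISO-8859-1 maps every byte to the code point of the same value.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) append_utf8(out, b);
    return out;
}

// Hypothetical helper: cp1251, Cyrillic letters only.
// Other bytes >= 0x80 are replaced with '?' to keep the sketch short.
std::string cp1251_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) {
        if (b < 0x80)       append_utf8(out, b);                   // ASCII
        else if (b >= 0xC0) append_utf8(out, 0x0410 + (b - 0xC0)); // U+0410..U+044F
        else if (b == 0xA8) append_utf8(out, 0x0401);              // U+0401 (capital IO)
        else if (b == 0xB8) append_utf8(out, 0x0451);              // U+0451 (small io)
        else                append_utf8(out, '?');
    }
    return out;
}
```

Fed the cp1251 bytes `CF F0 E8 E2 E5 F2`, `cp1251_to_utf8` gives the Russian word "Привет", while `latin1_to_utf8` turns the same bytes into accented Latin letters, which is the kind of nonsense reported here.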
> Have you tried switching your terminal to a Truetype font
It works, but I like my raster font (a modified 8x16, not the one in the screenshot). I don't like Lucida Console or Consolas as a console font, and I don't need to display all Unicode symbols in the console. I thought Windows console programs should use the OEM code page in the first place (because it is the default), like DOS programs. Maybe add a recoding option on the command line?
In the end, I want id3 to work on Windows as it does on Linux/BSD:
C:\> id3 file.mp3
File: file.mp3
Metadata: ID3v2.3
Title: Something from Japan
Artist: 日本語
Until then, using the ANSI codepage makes more sense to me: if I redirect the output of id3 to a file, I expect to be able to read it using notepad. Commandline arguments are encoded in the ANSI codepage, as is the filesystem, etc. The OEM codepage to me is a relic from the Win3.x/Win9x days (which relied on DOS for its console); AFAICT it is only really necessary if you use the console full-screen.
So, I am going to finish the Unicode-build first; then we'll see how that functions in a console with a non-Truetype font. But supporting that is really low on my priority list.
Windows XP SP3
Russian text in ID3v1 tags is encoded in CP1251, but id3 shows nonsense (I expect output in 866, the Russian OEM code page):
I suspect the problem is in charconv.cpp, in "template<> conv<>::data conv::decode(const char* s, size_t len)".
Strings from ID3v2 (Russian text in Unicode) are printed in the wrong code page:
It's 1251 shown as 866.
If I change the console code page to 1251 and recode the output from 1251 to 866, then the text is correct:
http://i.imgur.com/6Fe7LO2.png
samples.zip
russian2_1251.txt, russian2_correct_866.txt - redirected output
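For reference, the 1251-to-866 recoding mentioned above can be sketched as a simple byte mapping. This is only an illustration under stated assumptions: the function name `cp1251_to_cp866` is hypothetical, it handles ASCII plus the Cyrillic letters only, and it is not how id3 or any particular recoding tool does it.

```cpp
#include <string>

// Hypothetical helper: recode a cp1251 byte string to cp866.
// Covers ASCII and the Cyrillic letters; anything else becomes '?'.
std::string cp1251_to_cp866(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char b : in) {
        unsigned char r;
        if (b < 0x80)       r = b;         // ASCII is identical in both code pages
        else if (b >= 0xF0) r = b - 0x10;  // 0xF0..0xFF -> 0xE0..0xEF (second half of lowercase)
        else if (b >= 0xC0) r = b - 0x40;  // 0xC0..0xEF -> 0x80..0xAF (uppercase, first half of lowercase)
        else if (b == 0xA8) r = 0xF0;      // capital IO
        else if (b == 0xB8) r = 0xF1;      // small io
        else                r = '?';       // not covered by this sketch
        out += static_cast<char>(r);
    }
    return out;
}
```

The split into two ranges is needed because cp866 keeps its box-drawing characters at 0xB0..0xDF, so the Cyrillic lowercase letters are not contiguous there as they are in cp1251.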