LMMS / lmms

Cross-platform music production software
https://lmms.io
GNU General Public License v2.0

Garbled track names after importing Shift-JIS encoded MIDI files #6056

Open xTibor opened 3 years ago

xTibor commented 3 years ago

Steps to reproduce

There are several test files on this random fansite, for example db-op.mid, jp-country.mid, kor-ed2.mid, and slayer-ed.mid. Import any of them into LMMS.

Expected behavior

I would expect either some heuristics to detect these Shift-JIS encoded files and automagically convert their track names to Unicode, or a combobox at the MIDI import dialog to select this encoding manually.
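The heuristic requested above could look roughly like the following. This is a minimal sketch in Python, not LMMS code: the function name `decode_track_name` and the fallback order (strict UTF-8, then Shift-JIS, then Latin-1) are assumptions chosen for illustration. It only guesses; a byte string that happens to be valid in an earlier codec will be decoded with that codec even if the author meant something else.

```python
from typing import Tuple


def decode_track_name(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used) for raw track-name bytes.

    Hypothetical helper: the candidate list and its order are assumptions.
    """
    # Try strict decoders in order; multi-byte Shift-JIS text is usually
    # not valid UTF-8, so UTF-8 goes first.
    for encoding in ("utf-8", "shift_jis"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a code point, so this never fails.
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    print(decode_track_name("ピアノ".encode("shift_jis")))
    print(decode_track_name(b"Piano"))
```

A combobox in the import dialog would amount to letting the user override the candidate list with a single chosen encoding.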

Actual behavior

Shift-JIS encoded MIDI track names are interpreted as Unicode, which garbles them.
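The effect can be reproduced outside LMMS. The sketch below is in Python, and the sample name "ピアノ" is made up for illustration. Which wrong decoding LMMS actually applies is an assumption, so both UTF-8 with replacement characters and Latin-1 are shown.

```python
# A made-up track name, encoded the way a Shift-JIS MIDI file would store it.
original = "ピアノ"
raw = original.encode("shift_jis")

# Wrong decoder 1: UTF-8, with invalid bytes replaced by U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# Wrong decoder 2: Latin-1, which accepts any byte but yields the wrong characters.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))

# The Latin-1 misreading is lossless, so the original can still be recovered.
recovered = as_latin1.encode("latin-1").decode("shift_jis")
print(recovered)
```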

Screenshot

(image: 20210615_211813)

Affected LMMS versions

AppImage, Version 1.2.2 (Linux/x86_64, Qt 5.9.7, GCC 5.4.0 20160609)
Git, fadf8c611ef37cf51fa0dd5c0d188a314c51004b

Veratil commented 3 years ago

There's no way to reliably detect which encoding text is saved in without extra information: either metadata in the file or outside information. As far as I'm aware, MIDI doesn't record which encoding track names use. If it does, though, I'm willing to add that to MidiImport. As for outside information, I don't think we want to add an "expect this encoding" option to the import dialog.
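For context, a track name in a Standard MIDI File is a meta event: the bytes FF 03, a variable-length byte count, then the text bytes. The event layout has no field for a character set. A minimal Python sketch of reading one such event follows; the helper names `read_varlen` and `read_track_name` are hypothetical and this is not how MidiImport is written.

```python
from typing import Tuple


def read_varlen(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a MIDI variable-length quantity; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # Each byte contributes its low 7 bits; a set high bit means "more follows".
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def read_track_name(event: bytes) -> bytes:
    """Return the raw name bytes of a meta event that starts with FF 03."""
    if event[:2] != b"\xff\x03":
        raise ValueError("not a track-name meta event")
    length, pos = read_varlen(event, 2)
    # Only a length and bytes are stored; nothing says how to decode them.
    return event[pos:pos + length]


if __name__ == "__main__":
    name = "ピアノ".encode("shift_jis")
    event = b"\xff\x03" + bytes([len(name)]) + name
    print(read_track_name(event))
```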

Also, QString assumes UTF-8, and only UTF-8, when it is constructed from raw bytes. We could try QTextCodec, but its automatic detection only covers UTF encodings. For anything else you have to name the codec manually, and even then there's no way to know whether the text decoded correctly.
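The "no way to know" point can be illustrated by decoding the same bytes with several codecs and seeing that more than one succeeds. This is a Python sketch standing in for a manual codec choice; the helper `decodable_as`, the candidate codec list, and the sample name are all assumptions for illustration.

```python
from typing import Dict, Iterable


def decodable_as(raw: bytes, codecs: Iterable[str]) -> Dict[str, str]:
    """Map each codec that decodes `raw` without error to its result."""
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This codec rejects the bytes, so it can be ruled out.
            pass
    return results


if __name__ == "__main__":
    raw = "ピアノ".encode("shift_jis")
    for codec, text in decodable_as(raw, ["utf-8", "shift_jis", "latin-1"]).items():
        print(codec, repr(text))
```

A decode that succeeds only rules nothing out: here both Shift-JIS and Latin-1 accept the bytes, and only a human reader can tell which result is the intended name.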

I'll take a look at the MIDI files when I can and see whether anything else in them indicates which encoding the text is saved in, but I'm not certain we can fix this.

For some extra reading: https://en.wikipedia.org/wiki/Charset_detection