clementine-player / Clementine

:tangerine: Clementine Music Player
https://www.clementine-player.org/
GNU General Public License v3.0

Allow specifying encoding for both ID3v1 & ID3v2 tags for the cases when the octet series is not valid UTF-8 #1365

Open Clementine-Issue-Importer opened 10 years ago

Clementine-Issue-Importer commented 10 years ago

From Taras.Puchko on January 26, 2011 22:00:39

Since charset detection cannot be made to always work reliably, I'd like to have an option to disable it and specify the encoding manually, at least for the cases when the octet series is not valid UTF-8 or UTF-16.

For instance, Clementine 0.6 cannot detect cp1251 in the ID3v2.3.0 tag of the following file: http://www.sofiarotaru.com/download_music/06-track6.mp3. This file has the title "Трек 6" in ID3v2.3.0 encoded using cp1251, which is against the spec, but there are many files like that, and (in this case) the byte sequence can easily be detected as not valid UTF-8 or UTF-16.
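A minimal Python sketch of the validity check described above, for illustration only. It is not Clementine's detection code; the helper name `is_valid_utf8` is hypothetical, and the sample bytes are simply "Трек 6" encoded as cp1251 rather than data read from the linked file.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# "Трек 6" as a cp1251-writing tagger would store it (assumed sample data).
title_bytes = "Трек 6".encode("cp1251")

# The cp1251 bytes start with 0xD2 0xF0: 0xD2 opens a two-byte UTF-8
# sequence, but 0xF0 is not a continuation byte, so strict decoding fails.
print(is_valid_utf8(title_bytes))
print(is_valid_utf8("Трек 6".encode("utf-8")))
```

A check like this only says that the bytes are not UTF-8. It does not say which legacy code page they are in, which is why a manual setting is still needed.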

Original issue: http://code.google.com/p/clementine-player/issues/detail?id=1365

Clementine-Issue-Importer commented 10 years ago

From misisko1 on March 10, 2011 12:55:11

I agree with Taras.Puchko; I'm having the same problem. I have some MP3s (ID3 v2.3 or v2.4) with cp1251 encoding, and special characters such as 'š', 'č', 'ž' are decoded as UTF-8. This makes my library totally untidy.

Clementine-Issue-Importer commented 10 years ago

From oleg.voropaev on March 28, 2011 20:27:50

cp1251 support is especially needed for online radio streams, because taglib is not used for streams, so a patched taglib-rcc won't help with this issue.

Clementine-Issue-Importer commented 10 years ago

From misisko1 on March 31, 2011 00:31:13

I guess this issue can be closed. The MP3 tag encoding is now correct, but I didn't notice in which revision it was fixed (2950 - 3090?).

Clementine-Issue-Importer commented 10 years ago

From Taras.Puchko on March 31, 2011 03:19:16

What I want is an option to override the encoding for the cases when the standard says it's ISO-8859-1. This can happen for both ID3v1 & ID3v2 (when the encoding byte is 0).

There should also be an option to enable/disable the charset detection.

BTW, Clementine 0.7 still cannot detect Windows-1251 for http://www.sofiarotaru.com/download_music/06-track6.mp3
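The override requested above can be sketched in Python as follows. This is an illustration under stated assumptions, not Clementine's or taglib's code: `decode_text_frame` and its `latin1_override` parameter are hypothetical names, and the input is assumed to be the raw payload of an ID3v2 text frame, whose first byte is the encoding marker (0 = ISO-8859-1, 1 = UTF-16 with BOM, 2 = UTF-16BE, 3 = UTF-8; the last two are ID3v2.4 only).

```python
from typing import Optional

# Encoding marker byte -> Python codec, per the ID3v2 text frame layout.
ID3V2_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1 according to the spec
    1: "utf-16",     # UTF-16 with BOM
    2: "utf-16-be",  # UTF-16BE without BOM (ID3v2.4)
    3: "utf-8",      # UTF-8 (ID3v2.4)
}


def decode_text_frame(payload: bytes,
                      latin1_override: Optional[str] = None) -> str:
    """Decode a text frame payload, optionally overriding ISO-8859-1."""
    if not payload:
        return ""
    encoding_byte, body = payload[0], payload[1:]
    codec = ID3V2_ENCODINGS.get(encoding_byte)
    if codec is None:
        raise ValueError(f"unknown encoding byte: {encoding_byte}")
    # Apply the user's choice only where the spec says ISO-8859-1;
    # frames explicitly marked as Unicode are left alone.
    if encoding_byte == 0 and latin1_override:
        codec = latin1_override
    return body.decode(codec).rstrip("\x00")


# Hypothetical frame: encoding byte 0 followed by cp1251 bytes.
frame = b"\x00" + "Трек 6".encode("cp1251")
print(decode_text_frame(frame))                            # mojibake
print(decode_text_frame(frame, latin1_override="cp1251"))  # intended title
```

ID3v1 fields carry no encoding marker at all, so the same override would apply to every ID3v1 field.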

Clementine-Issue-Importer commented 10 years ago

From Taras.Puchko on May 05, 2011 09:15:59

This issue can be closed.

I've solved this problem by installing libtag1-rusxmms on Ubuntu and selecting the Ukrainian language as the system language or creating a file ~/.rcc/xmms.xml with the following contents:

uk

Clementine-Issue-Importer commented 10 years ago

From oleg.voropaev on May 05, 2011 09:22:01

No, it is not solved, because taglib is not used for streams.

Clementine-Issue-Importer commented 10 years ago

From oleg.voropaev on May 05, 2011 09:22:39

Example: http://mp3.nashe.ru:80/nashe-192

Clementine-Issue-Importer commented 10 years ago

From MurzNN on January 06, 2012 12:19:41

I confirm the problem with Russian streams; rusxmms didn't solve the problem.

Clementine-Issue-Importer commented 10 years ago

From stepchik27 on February 22, 2012 22:28:00

Has anyone solved the problem with tag encoding (Russian streams)?

The stream name looks like "????? FM" (it should be "Бизнес ФМ").
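For streams, a user-selected charset would have to be applied to the stream metadata itself, since taglib is not involved. The Python sketch below assumes the station sends SHOUTcast-style ICY metadata blocks of the form `StreamTitle='...';` padded with NUL bytes; whether the streams mentioned in this thread do so is an assumption. The function name `stream_title` and the sample block are hypothetical.

```python
import re
from typing import Optional

# Matches the title field of an ICY metadata block (raw bytes).
_STREAM_TITLE = re.compile(rb"StreamTitle='(.*?)';", re.DOTALL)


def stream_title(metadata: bytes, charset: str) -> Optional[str]:
    """Extract StreamTitle and decode it with a user-selected charset."""
    match = _STREAM_TITLE.search(metadata)
    if match is None:
        return None
    # Undecodable bytes become U+FFFD instead of raising.
    return match.group(1).decode(charset, errors="replace")


# Hypothetical metadata block carrying a cp1251-encoded title.
block = (b"StreamTitle='" + "Бизнес ФМ".encode("cp1251")
         + b"';StreamUrl='';" + b"\x00" * 5)
print(stream_title(block, charset="cp1251"))
```

The non-greedy match stops at the first `';`, so a title that itself contains that sequence would be truncated; a real parser would need to handle that case.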

Clementine-Issue-Importer commented 10 years ago

From basil.peace on May 18, 2012 23:58:56

The problem with tag encoding still exists in the current Windows version, 1.0.1. In my case, cp1251 in tags wasn't detected, although it is the system default ANSI (non-Unicode) encoding.

In my opinion, at least the following rule should apply: if the tag being loaded isn't valid UTF-8, it should be treated as being in the system encoding. Of course, manual configuration would also be good.
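The rule proposed above could look roughly like this in Python. It is a sketch of the idea, not existing Clementine behaviour: `decode_tag` and its `fallback` parameter are hypothetical, and the system encoding is taken from `locale.getpreferredencoding(False)`, which on a Russian-locale Windows system would typically report cp1251.

```python
import locale
from typing import Optional


def decode_tag(data: bytes, fallback: Optional[str] = None) -> str:
    """Try strict UTF-8 first, then a manual or system fallback encoding."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # A manually configured encoding wins; otherwise use the system default.
    codec = fallback or locale.getpreferredencoding(False)
    # errors="replace" keeps the call from raising on undecodable bytes.
    return data.decode(codec, errors="replace")


raw = "Бизнес ФМ".encode("cp1251")
print(decode_tag(raw, fallback="cp1251"))  # manual configuration
print(decode_tag(raw))                     # result depends on the system locale
```

On systems whose locale encoding is already UTF-8, which is common on Linux, the system fallback cannot help, so the manual setting remains the more reliable of the two.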