Sonerezh / sonerezh

A self-hosted, web-based application to stream your music, everywhere.
https://www.sonerezh.bzh
GNU Affero General Public License v3.0

Changing Charset. #320

triDcontrols closed this issue 7 years ago

triDcontrols commented 7 years ago

I'm trying to add Russian music to Sonerezh. It imports just fine and I can play it back, but the text is scrambled (see screenshot). I know this is a charset issue, but I don't know whether it is an nginx, PHP, or Sonerezh issue, or where I can adjust the charset.

Running Sonerezh on Ubuntu 16.04 LTS, with the latest version of nginx, PHP 7, and MariaDB.

Screenshot: https://ibb.co/cGqxkF

gs11 commented 7 years ago

Is there any public domain track with Russian metadata you could share for testing purposes?

triDcontrols commented 7 years ago

Yes, here's a Dropbox link to one of the songs.

https://www.dropbox.com/s/lniyab4z60gdl58/%D0%93%D0%BE%D1%81%D0%BF%D0%BE%D0%B4%D0%B8%20%D0%91%D0%BE%D0%B6%D0%B5%20%D0%BF%D0%BE%D0%BC%D0%B8%D0%BB%D1%83%D0%B9.mp3?dl=0

gs11 commented 7 years ago

Thanks, the characters look fine here.

However, the import process says "Metadata are unreadable or empty. Trying to import anyway...". Also, the mp3 file only contains ID3v1 tags without any Russian characters, just "no artist/no title/Audio Track 09". I'm guessing the title is somehow parsed from the filename, but there's something strange going on with the import process:

Database source_path: /data/Music/sonerezh/ Боже помилуй.mp3
Real source path: /data/Music/sonerezh/Господи Боже помилуй.mp3

triDcontrols commented 7 years ago

Hmm, interesting. Try this one: the link below is the song from my original post. The file name is not recognized in Sonerezh but is recognized in Mac OS X. However, when I use "Get Info" in Mac OS X, it can't decipher the artist info either. https://ibb.co/hup5s5

https://www.dropbox.com/s/cjzgpbscvz6arfe/01-%D0%A2%D0%B2%D0%BE%D1%8F%20%D0%BB%D1%8E%D0%B1%D0%BE%D0%B2%D1%8C.mp3?dl=0

MightyCreak commented 7 years ago

Database source_path: /data/Music/sonerezh/ Боже помилуй.mp3
Real source path: /data/Music/sonerezh/Господи Боже помилуй.mp3

Feels like a regular strlen has been used on a UTF8 string...
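
To make that suspicion concrete, here is a minimal Python sketch of how a byte-oriented length differs from a character count for the file name above. It is only an illustration of the general pitfall: it is not Sonerezh's actual import code (which is PHP), and it does not claim to reproduce the exact truncation seen in the database.

```python
# File name taken from the thread; everything else here is illustrative.
name = "Господи Боже помилуй.mp3"
raw = name.encode("utf-8")

char_len = len(name)  # number of code points
byte_len = len(raw)   # what a byte-oriented strlen would report

# Each Cyrillic letter takes 2 bytes in UTF-8, so the two counts disagree.
print(char_len, byte_len)

# Using a character count (7, the length of the first word) as a byte offset
# cuts through the middle of a 2-byte sequence.
truncated = raw[:7].decode("utf-8", errors="replace")
print(repr(truncated))
```

Any code that mixes the two kinds of offsets on the same string can drop or corrupt part of a path in this way.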

gs11 commented 7 years ago

No editor/player of mine seems to be able to show the tags correctly. However, some digging in the files reveals that the ID3v2 tag encoding is Windows-1251. I don't know the tag specification well enough, but is there any charset indicator in the file for the parser to know which one to use?
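
On the charset indicator question: ID3v2 text frames begin with a single encoding byte, and Windows-1251 is not among the defined values. The sketch below is a hedged illustration, not Sonerezh's parser. The helper name `decode_text_frame` and the sample payloads are hypothetical, and it assumes (without having been verified against the files in this thread) that the Windows-1251 bytes were stored under the ISO-8859-1 marker.

```python
# Encoding byte values defined for ID3v2 text frames
# (0x02 and 0x03 exist only in ID3v2.4).
ID3V2_TEXT_ENCODINGS = {
    0x00: "latin-1",    # ISO-8859-1
    0x01: "utf-16",     # UTF-16 with BOM
    0x02: "utf-16-be",  # UTF-16BE without BOM
    0x03: "utf-8",
}


def decode_text_frame(payload: bytes) -> str:
    """Decode a text frame body: one encoding byte followed by the text."""
    codec = ID3V2_TEXT_ENCODINGS[payload[0]]
    return payload[1:].decode(codec).rstrip("\x00")


title = "Твоя любовь"  # taken from the second file name in the thread

# Hypothetical mislabelled frame: Windows-1251 bytes marked as ISO-8859-1.
mislabelled = b"\x00" + title.encode("cp1251")
# Hypothetical well-formed frame: UTF-8 bytes marked as UTF-8.
proper = b"\x03" + title.encode("utf-8")

print(decode_text_frame(mislabelled))  # scrambled Latin characters
print(decode_text_frame(proper))       # readable Cyrillic
```

A parser that trusts the encoding byte has no way to tell that the first payload is really Windows-1251, which would explain scrambled text in every player.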

gs11 commented 7 years ago

Looking further into the issue, it seems that using Windows-1251 encoding for ID3 tags is not valid: https://stackoverflow.com/questions/16941269/gibberish-result-reading-unicode-tags-using-mp3agic-in-java/20000353

I'd recommend converting these tracks to use Unicode tags instead. There seem to be a number of tools that provide this functionality.
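
The core of such a conversion can be sketched in a few lines of Python. This is only the string-level step, assuming the tag text was Windows-1251 that got read as ISO-8859-1. The function name `repair_cp1251` and the sample string are hypothetical, and a real tool would also have to rewrite the tag frames in the file.

```python
def repair_cp1251(mojibake: str) -> str:
    """Undo Windows-1251 text that was mis-decoded as ISO-8859-1."""
    # Re-encoding as Latin-1 recovers the original bytes unchanged,
    # which can then be decoded with the codec that was really used.
    return mojibake.encode("latin-1").decode("cp1251")


garbled = "Òâîÿ ëþáîâü"  # hypothetical scrambled title as a player might show it
fixed = repair_cp1251(garbled)
utf8_bytes = fixed.encode("utf-8")  # what a Unicode tag would then store

print(fixed)
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character, so no information is lost by the wrong decode.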

triDcontrols commented 7 years ago

Thanks for the help. I found a few utilities and will give them a go tonight. I just tried it on the 2nd song: using foobar2000 with the foo-chocon addon, I was able to select the 1251 encoding and change it to UTF-8. It now displays just fine, and both Windows and Mac show the file correctly. Thanks for the help, gs11. This can be closed.