owncloud / music

:notes: Music app for ownCloud
GNU Affero General Public License v3.0

Non-ASCII characters and Encoding Issue #882

Closed: wonderfulShrineMaidenOfParadise closed this issue 3 years ago

wonderfulShrineMaidenOfParadise commented 3 years ago

Describe the bug A clear and concise description of what the bug is.

To Reproduce Steps to reproduce the behavior:

  1. Add any music with non-ASCII characters. For example, 永夜抄 ~ Eastern Night..
  2. Open the library. The theme is shown as ‰i–鏴@` Eastern Night..

Expected behavior There should be a theme shown as 永夜抄 ~ Eastern Night.

Screenshots Screenshot_20210824_040835

Server (please complete the following information):

paulijar commented 3 years ago

Hi. The application should be able to handle any UTF-8 data, as demonstrated by the following screenshot from my test library. Are you sure that your file actually uses UTF-8 encoding for its metadata? If you can share a sample of such a problematic file, I can take a look. You can find my email address on my GitHub profile.

image

wonderfulShrineMaidenOfParadise commented 3 years ago

Okay. I have mailed you with the theme 永夜抄 ~ Eastern Night..wav attached.

paulijar commented 3 years ago

Thanks for the file. Indeed, it seems that the file contains metadata encoded in a non-standard way. The file stores the title, artist name, and album name in its RIFF headers, but those headers are supposed to be encoded using ISO 8859-1, which can only represent Latin-based scripts. This file uses some other encoding there. It doesn't seem to be UTF-8 either, but even if it were, the file would still be violating the standard.
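To illustrate why bytes written in another encoding come out garbled when a reader follows the ISO 8859-1 convention, here is a minimal Python sketch. It assumes, purely for illustration, that the title was written as Shift_JIS; the thread does not establish which encoding the file actually uses. The helper `fits_latin1` is a hypothetical name.

```python
# Hypothetical illustration: Shift_JIS is an ASSUMED encoding, not a confirmed one.
title = "永夜抄"

# A writer that ignores the ISO 8859-1 convention stores these raw bytes.
raw = title.encode("shift_jis")

# A reader that follows the convention interprets every byte as ISO 8859-1.
# This never raises, because ISO 8859-1 maps all 256 byte values to some
# character, but the resulting string is unrelated to the intended text.
shown = raw.decode("latin-1")


def fits_latin1(text: str) -> bool:
    """Return True if every character of `text` exists in ISO 8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(shown == title)                # the displayed text differs from the title
print(fits_latin1(title))            # Japanese text cannot be stored as ISO 8859-1
print(fits_latin1("Eastern Night"))  # plain Latin text can
```

The garbling is reversible only if the reader knows (or guesses) the encoding that was really used, which is why a standards-following parser cannot recover such titles on its own.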

I also viewed the file with the Mp3tag software, and this is how it showed up there: image

However, this isn't quite the whole story. I noticed that it's possible to write proper Japanese tags to the WAV file with the Mp3tag application. When this is done, it uses ID3v2.3-type tags for the UTF-8-encoded data while simultaneously filling the RIFF headers with substitute strings like "??? ~ Eastern Night". That is, anything not representable in ISO 8859-1 is replaced with a question mark there. This is reasonable, but unfortunately it didn't work well with the Music app: the app found both the RIFF headers and the ID3v2.3 tags but chose to show the name "??? ~ Eastern Night" from the RIFF header instead of the more appropriately filled ID3v2.3 tag. I still need to investigate whether there is a saner way to arbitrate between the different tag formats.
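The question-mark substitution described above can be reproduced in Python, and one possible arbitration rule can be sketched alongside it. The functions `to_riff_text` and `pick_title`, and the rule itself (prefer the ID3v2 value whenever one is present), are assumptions made for illustration; they are not the actual logic of the Music app, getID3, or Mp3tag.

```python
from typing import Optional


def to_riff_text(text: str) -> str:
    """Mimic writing text to a field limited to ISO 8859-1:
    every character outside that range becomes '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


def pick_title(riff_title: Optional[str], id3v2_title: Optional[str]) -> Optional[str]:
    """Hypothetical rule: use the ID3v2 title when it is present and
    non-empty, otherwise fall back to the RIFF title."""
    if id3v2_title:
        return id3v2_title
    return riff_title


full = "永夜抄 ~ Eastern Night"
riff = to_riff_text(full)

print(riff)                    # ??? ~ Eastern Night
print(pick_title(riff, full))  # 永夜抄 ~ Eastern Night
print(pick_title(riff, None))  # ??? ~ Eastern Night
```

A rule this simple would mishandle files whose ID3v2 tag is stale while the RIFF header is current, so it should be read as a starting point rather than a recommendation.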

wonderfulShrineMaidenOfParadise commented 3 years ago

Now I know why it happens. Thank you for answering👍

paulijar commented 3 years ago

The new Music v1.3.2 release contains a fixed version of the getID3 library in which WAV files with non-Latin characters work better. Files like the one in the opening post (*) still cannot be supported, because that is basically not possible while following the standards. However, WAV files containing ID3v2 tags should now be shown correctly with any Unicode content. The ID3v2 tags may be written, for example, with the Mp3tag application.

(*) A file with metadata stored in the RIFF headers using some encoding other than ISO 8859-1