MusicPlayerDaemon / MPD

Music Player Daemon
https://www.musicpd.org/
GNU General Public License v2.0
2.12k stars 340 forks source link

Multi-value MusicBrainz ID tags in MP3 files are not handled correctly #687

Closed elomatreb closed 4 years ago

elomatreb commented 4 years ago

Bug report

Describe the bug

The MusicBrainz tags containing artist IDs (MBIDs), both for the album and for individual tracks, can contain multiple values. In MP3 files, these are treated incorrectly.

Expected Behavior

In a FLAC file, the tag returns multiple values correctly (partial output of currentsong):

[...]
MUSICBRAINZ_ALBUMARTISTID: 1f9df192-a621-4f54-8850-2c5373b7eac9
MUSICBRAINZ_ALBUMARTISTID: dea28aa9-1086-4ffa-8739-0ccc759de1ce
MUSICBRAINZ_ALBUMARTISTID: d2ced2f1-6b58-47cf-ae87-5943e2ab6d99
[...]

Actual Behavior

In an MP3 file tagged with ID3v2.3, the values are joined with / and treated as one long value:

[...]
MUSICBRAINZ_ALBUMARTISTID: 1f9df192-a621-4f54-8850-2c5373b7eac9/dea28aa9-1086-4ffa-8739-0ccc759de1ce/d2ced2f1-6b58-47cf-ae87-5943e2ab6d99
[...]

In an MP3 file tagged with ID3v2.4, the values are separated with null-bytes, and the tag value is truncated before the first terminator:

[...]
MUSICBRAINZ_ALBUMARTISTID: 1f9df192-a621-4f54-8850-2c5373b7eac9
[...]

This last behavior makes this behavior especially frustrating, because at least the long /-separated value can be parsed on the client side.

Version

Music Player Daemon 0.21.16 (0.21.16)
Copyright 2003-2007 Warren Dukes <warren.dukes@gmail.com>
Copyright 2008-2018 Max Kellermann <max.kellermann@gmail.com>
This is free software; see the source for copying conditions.  There is NO
warranty; not even MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Database plugins:
 simple proxy upnp

Storage plugins:
 local smbclient udisks nfs curl

Neighbor plugins:
 smbclient upnp udisks

Decoders plugins:
 [mad] mp3 mp2
 [mpg123] mp3
 [vorbis] ogg oga
 [oggflac] ogg oga
 [flac] flac
 [opus] opus ogg oga
 [sndfile] wav aiff aif au snd paf iff svx sf voc w64 pvf xi htk caf sd2
 [audiofile] wav au aiff aif
 [dsdiff] dff
 [dsf] dsf
 [hybrid_dsd] m4a
 [faad] aac
 [mpcdec] mpc
 [wavpack] wv
 [modplug] 669 amf ams dbm dfm dsm far it med mdl mod mtm mt2 okt s3m stm ult umx xm
 [mikmod] amf dsm far gdm imf it med mod mtm s3m stm stx ult uni xm
 [wildmidi] mid
 [fluidsynth] mid
 [ffmpeg] 16sv 3g2 3gp 4xm 8svx aa3 aac ac3 adx afc aif aifc aiff al alaw amr anim apc ape asf atrac au aud avi avm2 avs bap bfi c93 cak cin cmv cpk daud dct divx dts dv dvd dxa eac3 film flac flc fli fll flx flv g726 gsm gxf iss m1v m2v m2t m2ts m4a m4b m4v mad mj2 mjpeg mjpg mka mkv mlp mm mmf mov mp+ mp1 mp2 mp3 mp4 mpc mpeg mpg mpga mpp mpu mve mvi mxf nc nsv nut nuv oga ogm ogv ogx oma ogg omg opus psp pva qcp qt r3d ra ram rl2 rm rmvb roq rpl rvc shn smk snd sol son spx str swf tak tgi tgq tgv thp ts tsp tta xa xvid uv uv2 vb vid vob voc vp6 vmd wav webm wma wmv wsaud wsvga wv wve
 [gme] ay gbs gym hes kss nsf nsfe sap spc vgm vgz
 [pcm]

Filters:
 libsamplerate soxr

Tag plugins:
 id3tag

Output plugins:
 shout null fifo pipe alsa ao oss openal solaris pulse jack httpd recorder

Encoder plugins:
 null vorbis opus lame twolame wave flac

Archive plugins:
 [bz2] bz2
 [zzip] zip
 [iso] iso

Input plugins:
 file archive alsa tidal qobuz curl ffmpeg smbclient nfs mms cdio_paranoia

Playlist plugins:
 extm3u m3u pls xspf asx rss soundcloud flac cue embcue

Protocols:
 file:// alsa:// tidal:// qobuz:// http:// https:// gopher:// rtp:// rtsp:// rtmp:// rtmpt:// rtmps:// smb:// nfs:// mms:// mmsh:// mmst:// mmsu:// cdda://

Other features:
 avahi dbus udisks epoll icu inotify ipv6 systemd tcp un

I can provide the sample files I used to generate the above results if necessary, but since they are actual recordings I wanted to avoid uploading them for now.

MaxKellermann commented 4 years ago

A sample file is necessary. Maybe you can copy the ID3 tag to another (empty?) mp3 file - I don't need your music payload, just an arbitrary file with the tag that reproduces the problem.

elomatreb commented 4 years ago

GitHub refuses to let me upload audio files for some reason, so I uploaded to them to my server for now: https://ole.bertr.am/files/demos.tar

MaxKellermann commented 4 years ago

Show me the part of the ID3v2.x specification which describes that multiple values for one TXXX are separated by slash or null byte. I believe your files are bad, and MPD's behavior is correct, until I'm proven wrong.

MaxKellermann commented 4 years ago

See http://id3.org/id3v2.4.0-frames:

4.2. Text information frames \<Header for 'Text information frame', ID: "T000" - "TZZZ", excluding "TXXX" described in 4.2.6.>

Section 4.2 allows multiple values for T frames, but that section explicitly excludes TXXX. Now section 4.2.6 says:

4.2.6. User defined text information frame Value \<text string according to encoding>

Note the singular "Value" and "text string". Only one value allowed according to the specification.

In http://id3.org/id3v2.3.0 only a few text frames (such as TCOM) are allowed to have multiple values, separated with a slash. However, section 4.2.2 says that TXXX has one value.

Unless I'm missing something, your files are really bad. Therefore, I close this bug report, until proven wrong.

elomatreb commented 4 years ago

These files were tagged with Picard, the flagship MusicBrainz tagger.

In any case I can live with things being a single value if the spec forces them to be, but is there a possibility of not cutting off the value on v2.4 files? This would at least allow me to deal with this issue on the client side.

MaxKellermann commented 4 years ago

If you believe Picard is correct, show me the part of the specification - maybe my interpretation was wrong. If you believe Picard is not correct, report the bug to the Picard project. But what does not impress me at all is saying "the flagship this and that". That doesn't matter. What matters is who's buggy - MPD or Picard.

draconx commented 4 years ago

Related Picard ticket: https://tickets.metabrainz.org/browse/PICARD-990

I think it's pretty clear that the default behaviour of Picard is writing multi-valued TXXX tags that are not compliant with the ID3v2.4 spec. But it is not obvious to me how Picard should be changed.

As an MPD and MusicBrainz user I just want to get multiple MUSICBRAINZ_ARTISTID metadata values for a particular track at the MPD protocol level (so that the various database search commands work as expected). Most of my files are flac files and this is easily achieved today with vorbis comments and works very well.

So my question is, how should a user tag their MP3 files so that MPD recognizes multiple artist MBIDs (and multiple work MBIDs)? This does not appear to be possible by any method, which would seem to imply that there is no way this issue can be fixed in Picard (or any other tagger for that matter).

elomatreb commented 4 years ago

I ended up just maintaining a small client-side database of MusicBrainz metadata, populated from the web service using the track/recording ID fields. This makes database updates slow but at least it allows me to work around this.

ijabz commented 4 years ago

Section 4.2 explicitly excludes TXXX because Text Information Frames contain only encoding and value, whereas User Defined Text Information frames contain encoding, description and value so they need their own section.

It is true that in the ID3v24 spec Text Information Frames says

text string(s) according to encoding

and User Defined Test Information Frames say

text string according to encoding

and I must admit I hadnt noticed this subtle difference before. But it also says

There may be more than one "TXXX" frame in each tag, but only one with the same description.

so if you should not have multiple frames with the same name to represent multiple values , then surely use of null value sis the sensible approach as used elsewhere.

But whether you are right or wrong about the authors original intentions this specification is over twenty years old and not being updated, seeing as he went from v22 to v23 and then v24 its reasonable to think he had a few mistakes left in v24.

I don't know of any other tool that allows multiple values in Text frames but not user defined frames -so with this approach you are just creating incompatbility between applications. My own jaudiotagger lib support multiple values, it is quite widely used and noone has ever filed a bug about this against it. I would ask that you add support for this to better support other applications, in turn making your application more useful.