thomasvs / morituri

For those about to RIP - a Unix CD ripper preferring accuracy over speed
GNU General Public License v3.0
301 stars 42 forks source link

morituri uses wrong metadata #48

Open tuomo opened 11 years ago

tuomo commented 11 years ago

There is a serious error on how morituri fetches the tracklist for a release. I tracked it down to morituri/common/mbngs.py. There are lines like the following:

trackCredit = _Credit(t['recording']['artist-credit'])
track.title = t['recording']['title']

Like the string 'recording' suggests, morituri seems to use the metadata of the associated recordings, not the tracklist itself. Those are often different. Indeed, in case of classical music, the recording artist is supposed to be different from the track artist, see https://musicbrainz.org/doc/Style/Classical/Recording/Artist.

In cases where the recording title differs from the track title, the correct track title seems to be available as t['title']. I don't immediately see, how to get the track artist easily.

I first wanted to fix this myself, but now I'm giving up for now as this whole metadata lookup thing seems to be a bit of a mess: python-musicbrainz2 is listed as a requirement, but as far as I see it's not used; instead morituri is shipped with an old version of python-musicbrainz-ngs that it uses. Or did I get something wrong?

Seriously, am I the first one to try to use morituri to rip classical music? How can a bug like this go unnoticed for such a long time?

Freso commented 10 years ago

There's also codebits like (from _Credit):

    def getName(self):
        return self.joiner(lambda i: i.get('artist').get('name', None))

Which circumvents the already prepared release['artist-credit-phrase'] and release['medium-list']['track-list'][...]['recording']['artist-credit-phrase'] readily available. I'm not sure whether the latter are sourced from the recording or the medium's tracklist though. The tracklist itself, result['medium-list'][...]['track-list'] does not contain any artist information, but might be where the track titles should be sourced from.

thomasvs commented 10 years ago

I personally don't use morituri on classical music, so yes, it's possible that there are bugs in here for classical music.

Some guidance in expected behaviour (bonus points if expressed through currently failing unit tests for me to fix) would be very much appreciated.

lipidity commented 10 years ago

A single recording can appear in multiple 'releases' under different names (eg, in different languages). Morituri should use the name of the track (which is specified per 'release') rather than the name of the recording (which is the same everywhere).

I will investigate this more and see if I can provide some code.

(@tuomo please correct me if I've misunderstood)

lipidity commented 10 years ago

I added a unit test here: https://github.com/thomasvs/morituri/pull/58 I think in the artist-credit structures, 'name' should be used, with 'artist.name' only as a fallback. Likewise for track titles, 'title' should be used if present, with 'recording.title' only as a fallback. This should be a good start.