beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.76k stars 1.82k forks source link

Discogs plugin replacing Feat. or Ft. with a comma #4401

Closed kennyboy1978 closed 1 year ago

kennyboy1978 commented 2 years ago

I try and use the Discogs plugin for my collection as I think it is more accurate than MusicBrainz. I've recently noticed that if Feat., Ft. or Vs. appears in the artist name on Discogs, it is replaced by a comma in between artists.

https://www.discogs.com/release/21846394-Vinylgroover-Feat-MC-Freestyle-Bassface

The output I want is:

Vinylgroover Feat. Dionne - Got Me Burning

But instead it's written as:

Vinylgroover, Dionne - Got Me Burning

Similarly:

https://www.discogs.com/release/22025074-Neophyte-Ft-Alee-Original-Gangster

Neophyte Ft. Alee - Original Gangster

Is written as:

Neophyte, Alee - Original Gangster

https://www.discogs.com/release/1694463-Scott-Brown-Vs-DJ-Rab-S-Now-Is-The-Time

Scott Brown Vs DJ Rab S - Now Is The Time

Scott Brown, DJ Rab S - Now Is The Time

Using Musicbrainz however, writes the format that I prefer. This would point to the way the plugin pulls the information, rather than something in my config file?

RollingStar commented 2 years ago

How I would diagnose this:

I know plugins are stored in beetsplug:

https://github.com/beetbox/beets/tree/master/beetsplug

I see discogs.py ( I can also search for 'discogs' if i dont know the name of the plugin)

https://github.com/beetbox/beets/blob/master/beetsplug/discogs.py

I search "feat" - no results "ft" - nothing

My next step would be to see how the discogs plugin is getting info from discogs. possibly with an "API" call. for musicbrainz (For example) you can look at what the program is seeing- ex.

https://musicbrainz.org/doc/MusicBrainz_API/Examples

edit:

Looks like it uses discogs_client, an outside library. That's as far as I'm taking this but others are welcome to continue the research.

https://github.com/beetbox/beets/blob/472c3ab7975e5933f8111a2182f177a0fe5642d5/beetsplug/discogs.py#L286

You could also try increasing the logging level of python/beets/discogs client to the level "debug". I don't know offhand how to do that.

sampsyo commented 2 years ago

This Discogs API, at least in the way we're using it, seems to return a list of artist names (e.g., Vinylgroover and Dionne, separately instead of a single string Vinylgroover Feat. Dionne). They are joined together into a single string here: https://github.com/beetbox/beets/blob/472c3ab7975e5933f8111a2182f177a0fe5642d5/beetsplug/discogs.py#L568-L570

It's possible Discogs also reports the single-string version of multi-artist music. If so, then we should probably use that! But it would require verifying that this information is actually available somewhere via the API…

stale[bot] commented 2 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

JOJ0 commented 2 years ago

If we look at the plain api response of aforementioned release: https://api.discogs.com/releases/21846394, we see that the first artist contains "join": "Feat", the second artist does not contain that:

"artists": [
    {"name": "Vinylgroover", "anv": "", "join": "Feat", "role": "", "tracks": "", "id": 11433, "resource_url": "https://api.discogs.com/artists/11433"},
    {"name": "MC Freestyle", "anv": "", "join": "", "role": "", "tracks": "", "id": 336113, "resource_url": "https://api.discogs.com/artists/336113"}
],

In python3-discogs-client we could access the "join" information like this already and make use of it in beets:

>>> import discogs_client; d = discogs_client.Client('my_user_agent/1.0', user_token='REDACTED')
>>> release = d.release(21846394)
>>> print(release.artists[0].data["join"])
Feat

We also see that there is a separate object in the response that has it joined together already:

"artists_sort": "Vinylgroover Feat MC Freestyle",

I openend a pull request to implement this into python3-discogs-client: https://github.com/joalla/discogs_client/pull/117, once merged we could access it like this:

>>> import discogs_client; d = discogs_client.Client('my_user_agent/1.0', user_token='REDACTED')
>>> release = d.release(21846394)
>>> print(release.artists_sort)
Vinylgroover Feat MC Freestyle
>>> 
JOJ0 commented 2 years ago

@sampsyo do you agree that the best option would be to use the new p3dc attribute artists_sort and let the beets user configure whether artist names should simply be joined using , as it is now, or the artits_sort field should be used. This would also require to increase the version dependance to p3dc in beets' setup.py (I'm sure we'll merge and release this tiny new feature in a jiff). I see beets uses extras_require in setup.py but without a version of python3-discogs-client. In any case this requirement should be noted in beets docs to "Using Discogs as the source plugin"

The alternative would be to use the "join" info from the artist data already as described above.

I would open a beets PR once we decide what's the best way to proceed. Thanks for your input in advance!

sampsyo commented 1 year ago

Aha, good question! I like the idea of using the join attributes that are available (even if that requires a version bump in the dependency). artist_sort is close to right, but I imagine it does other stuff to artist names to make them sortable—such as "Beatles, The".

JOJ0 commented 1 year ago

Ah good thinking, thanks! I double checked and yes indeed "Beatles, The" is what's containend in artists_sort for a good ol' Beatles release. Alright so then it's settled, I'll use the join attribute and note the version bump in the docs and in setup.py.

JOJ0 commented 1 year ago

Fun fact @sampsyo, the original PR implementing a first version of Discogs source plugin, already respected the join field in the Discogs API result:

https://github.com/beetbox/beets/pull/283/files#diff-c9f257e23ad1ebab4dafc681c7e0a22d3f7432eb8664e06cd7cf84ce5fc426d9R114-R115

At some point the get_artist method was moved out into a staticmethod in the MetadataSourcePlugin abstract class, which doesn't have this functionality anymore - it always returns a comma separated string of artists. I suppose that applies to other MetadataSourcePlugins as well, eg. I tested with the Spotify plugin: When selecting a release found on there, it faces the same "problem": Artists are combinend into a string like eg. "Artist1, Artist2". BTW FYI @arsaboo

I'm wondering if this functionality should be added to this "core" method again or if it is very Discogs specific and should be worked around in the Discogs plugin itself? An optional argument perhaps called 'join_key` could be added:

https://github.com/beetbox/beets/blob/472c3ab7975e5933f8111a2182f177a0fe5642d5/beets/plugins.py#L658-L691

Update: This is where respecting join got lost back when MetadataSourcePlugin was introduced and Spotify, Deezer and Discogs plugins where refactored, have a look at the diff of discogs.py :-) https://github.com/beetbox/beets/pull/3371/files

JOJ0 commented 1 year ago

The alternative, in case we'd like to keep MetadataSourcePlugin.get_artist() as-is, would be to do something like this within the Discogs plugin:

        # Why not use artists_sort, get_artist handles moving "the" to front!
        artist, artist_id = MetadataSourcePlugin.get_artist(
            [{'name': result.artists_sort, 'id': result.artists[0].id}]
        )

But still I think, since stuff was moved out of plugins and generalized and things where forgotten, implementing into the "core" logic seems to be a good idea.

JOJ0 commented 1 year ago

Example import with last "workaround" solution:

(beets)    ~/git/beets   fix_discogs_multi_artist ●  beet import ~/Sync/beets_test_music/DJ\ Marky\ -\ LK

/home/jojo/Sync/beets_test_music/DJ Marky - LK (2 items)
Finding tags for album "DJ Marky & XRS, Stamina MC - LK 'Carolina Carol Bela'".
Candidates:
1. DJ Marky & XRS Featuring Stamina MC - LK 'Carolina Carol Bela' (71.8%) (id, artist, tracks) (Discogs, Vinyl, 2002, UK, V Recordings, V035)
...
...

You see this was previously tagged with the old behaviour - artists simply connected with ", " - and now is a single string.

RollingStar commented 1 year ago

Note: Musicbrainz supplies join keys iirc. Could be a useful crossreference on a search.

JOJ0 commented 1 year ago

By any chance @RollingStar you a maintainer and want to review my PR?

RollingStar commented 1 year ago

Sorry, not this time. I end up playing in people's sandboxes but I don't know how all the beets code works.

JOJ0 commented 1 year ago

@kennyboy1978 by any chance is it possible for you to test the new feature in the PR you find linked above. It might help getting this fix merged. I can help with installing beets from that branch, in case that's desired. Hope that helps!

kennyboy1978 commented 1 year ago

Sorry for not replying sooner.

I stumbled my way through installation and this works as expected.

Artists with "Feat, Ft, Vs and & " are all picked up now.

Thank you very much!