beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.7k stars 1.81k forks source link

fetchart: Use MusicBrainz' Bandcamp/Discogs/... association to fetch art #4707

Closed JOJ0 closed 1 year ago

JOJ0 commented 1 year ago

Discussed in https://github.com/beetbox/beets/discussions/4676

Originally posted by **evur** February 12, 2023 There are instances where covers are only available from one of these and not from the currently implemented sources. The Discogs ID can be retrieved from the MusicBrainz data. This allows for finding a cover precisely should Discogs have it and MusicBrainz (via CoverArtArchive) not.
sampsyo commented 1 year ago

Hi! I think we already have Discogs covered in #429. And just to be technical about it, MusicBrainz itself doesn't have cover art—it hosts stuff at CoverArtArchive, which fetchart does already support. (I think, following the discussion from https://github.com/beetbox/beets/discussions/4676, that the goal may be to use MB's Discogs association to fetch art?)

arsaboo commented 1 year ago

There are other sources that provide cover art - Spotify and JioSaavn (a plugin that I added). Both of them provide the image url for the cover art that can be easily retrieved.

sampsyo commented 1 year ago

I guess the question is: is the idea that the cover art would be retrieved without the fetchart plugin? Or are we talking about simply adding additional data sources to fetchart itself? Or a third option I haven't considered?

arsaboo commented 1 year ago

At the very least, we should be able to pass an image URL (including from any user defined plugin) that embedart can use to retrieve and embed cover art. For official plugins (e.g., Spotify), it is fine to add the data source to fetchart itself.

BTW, I do have a related open PR that allows passing an image URL to the embedart plugin. It may not address the problem, but will be a step in that direction.

JOJ0 commented 1 year ago

I tried to clarify the title of the issue and improved the description (CoverArtArchive).

Does this reflect what the actual feature request is? @snejus @arsaboo ?

arsaboo commented 1 year ago

Actually, it should be applicable to any service that provides album art. For example, Spotify also provides album art. I am assuming once we implement it for one service, it can be extended to others.

snejus commented 1 year ago

From what I gathered, there seems to be two main ideas floating around

1. Fetching art in a more datasource-agnostic manner

For example, we use album data from MusicBrainz but we are able to fallback to other data sources when MusicBrainz does not provide cover data.

I think this is a great idea. I've experimented with something similar in this dev branch of beetcamp, where I attempted to query discogs for some of the data missing on bandcamp.

I've been thinking about this from a more general perspective - as a user, I would love the ability to combine data coming from various sources instead of having to pick just one of them.

2. Either make fetchart easier to extend dynamically or consider a cover art URL coming from an external data source a legitimate cover art source

I wish I was aware of this issue less 😁 beetcamp comes with a cover art source, however it relies on some nasty monkey-patching in order to work

class BandcampPlugin(...):
    ...
    def loaded(self) -> None:
        """Add our own artsource to the fetchart plugin."""
        for plugin in plugins.find_plugins():
            if isinstance(plugin, fetchart.FetchArtPlugin):
                fetchart.ART_SOURCES[self.data_source] = BandcampAlbumArt
                fetchart.SOURCE_NAMES[BandcampAlbumArt] = self.data_source
                fetchart.SOURCES_ALL.append(self.data_source)
                bandcamp_fetchart = BandcampAlbumArt(self._log, self.config)
                plugin.sources = [bandcamp_fetchart, *plugin.sources]
                break

Ideally, I'd want to drop BandcampAlbumArt, and replace it with a field in the album data, - say cover_art_url, which may get inspected by fetchart which then decides what to do about it.

sampsyo commented 1 year ago

Thanks for expanding on this a little bit!

The way that fetchart currently works is that it also has several backends, and it tries each one (in order), taking the first "valid" image that one is able to supply. To state the perhaps-obvious, I can see three ways to get cover images from Bandcamp:

  1. The Bandcamp autotagger source plugin retrieves the image—and fetchart isn't involved at all.
  2. We add a new Bandcamp source to the fetchart plugin. This works by looking to see if there is a Bandcamp album ID associated with the newly-imported album and then, if so, looking up the associated image from the Bandcamp API. Critically, the Bandcamp plugin can provide this ID, but it need not actually be involved at all: things would still work if the user did something like beet modify bandcamp_album_id=foo followed by beet fetchart.
  3. We somehow provide an import-pipeline-only way for the Bandcamp plugin to communicate directly with the fetchart plugin. That is, the Bandcamp plugin determines a candidate image URL, which it then supplies to the fetchart plugin, which can only see this information during the import process. There is no way to do the same thing later with a beet fetchart command. This would require building some new way for import pipeline stages to communicate ephemeral data with each other, apart from the metadata on music files themselves.

I'd be interested to hear others' thoughts on the trade-offs here. I kinda think the most natural option is 2 above: just add a new Bandcamp backend to the fetchart plugin, possibly informed by the IDs that the Bandcamp source plugin provides. The main drawback here would be that the fetchart plugin may need to duplicate some basic API communication stuff that the Bandcamp plugin also includes. But it's not clear to me yet how extensive that duplication would be.

JOJ0 commented 1 year ago

I like option 2 :-)

arsaboo commented 1 year ago

The problem with option 2 is that we will have to add that for every source. For official plugins, Option 2 is fine, but we need a generic approach. I like the cover_art_url approach suggested by @snejus earlier.

sampsyo commented 1 year ago

I could be wrong, but I think the cover_art_url option is an instance of Option 3 in my list? If I'm understanding it correctly, it would have the downside of being an import-only thing. Which could be OK—I just want to make it clear that this would depend on happening during the import process and wouldn't work "after the fact." (Unless I'm missing something.)

snejus commented 1 year ago

@sampsyo what about storing cover_art_url as a flexible attribute help making this more 'permanent'? I'd imagine fetchart would be able to query this field at any point, not just the import stage?

This would also nicely cover the issue regarding album vs items cover photos (for example, on Bandcamp in rare cases an individual song within an album may have a different cover photo from the album). We may consider allowing the user to choose whether to apply the album or the item cover art, if it exists?

sampsyo commented 1 year ago

Sure; I can see that working. The "communication channel" described in Option 3 would become that flexible attribute.

This does create a somewhat weird possibility where the cover_art_url becomes out of date: e.g., someone re-imports the album with a different data source that doesn't provide a cover art URL and now the metadata disagrees with the image. But perhaps that is too niche of a problem to care about!

snejus commented 1 year ago

Good point. I feel like consistency is very important here, and thus we would ideally have a single source of truth. How complicated would it be to extend fetchart to update such field in those cases when traditional sources are being used?

arsaboo commented 1 year ago

@sampsyo Even if the user reimports with a different source, the cover art will not be wrong (if it is not updated). Users can always update the cover_art_url field. We just need an update function to refresh the albumart using the cover_art_url field.

arsaboo commented 1 year ago

Shouldn't it just be another function (say fetch_cover_art_url) here?

arsaboo commented 1 year ago

It turned out to be a lot easier than I thought. @sampsyo @JOJ0 @snejus can you please check #4778 that adds the ability to use cover_art_url flex attr as an album art source. Any plugin can populate that field. I tested it out with my JioSaavn plugin and it works perfectly.