naggie / crates

2013: Crates, a media database with immutability, federation and filesystem mapping, playlists for DJing
MIT License

How do we deal with broken albums??? #17

Closed naggie closed 9 years ago

naggie commented 9 years ago

"Fast jungle music" from Hospital (no less) identifies as about 15 separate albums. It appears the album artist field is not used.

iTunes has the same problem!

Really, Hospital Records should get their act together; but I suspect we'll see this again.

@jimjibone any ideas?

Some of mine:

  1. Study headers and come up with a better alternative subset
  2. Use a hash of the cover art (won't work for all compilations)

naggie commented 9 years ago

The issue: "Album artist" was defaulting to "Artist" whenever "Album artist" was not present. I suspect this would cause a problem if two albums have the same name without "Album artist" set.

For now I've removed the "Artist" default -- it's now "Various artists".

A possible solution is to hash the cover art if the "Album artist" is not set and use that to sort. Not sure how I will do that yet.
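
A minimal sketch of that idea, not the code that ended up in crates. It assumes the embedded cover art is already available as raw bytes; `cover_art_hash` and `album_sort_key` are hypothetical names, and SHA-1 is just one reasonable choice of digest.

```python
import hashlib
from typing import Optional, Tuple

VARIOUS_ARTISTS = "Various artists"


def cover_art_hash(cover_art: Optional[bytes]) -> Optional[str]:
    """Return a hex SHA-1 digest of the raw cover art bytes, or None if there is no art."""
    if not cover_art:
        return None
    return hashlib.sha1(cover_art).hexdigest()


def album_sort_key(
    album: str, album_artist: Optional[str], cover_art: Optional[bytes]
) -> Tuple[str, str]:
    """Key used to group tracks into albums (hypothetical helper).

    Prefer the album artist when it is set. Otherwise fall back to the cover
    art hash, and finally to "Various artists" when neither is available.
    """
    if album_artist:
        return (album, album_artist)
    digest = cover_art_hash(cover_art)
    return (album, digest or VARIOUS_ARTISTS)
```

Note that a byte-level hash only matches when every track embeds exactly the same image file; re-encoded or resized art would hash differently.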

naggie commented 9 years ago

It looks like we'll have to do a few tricks in order to solve this problem.

As far as I can see:

  1. Store bought, one-off or otherwise, the data contained within the headers is not coherent enough for us -- which is why MusicBrainz & AcoustID exist. However, MusicBrainz won't always work for unlabelled music.
  2. Single AudioFiles are consistent enough for search-based browsing (vosbox did it well)

Aside from manual curation (which we want to minimise), I think that we should index AudioFiles as-is (we can replace/deprecate to improve) and then derive a separate album (etc.) from this.

Thoughts?

naggie commented 9 years ago

Oh, and these "Broken albums" tend to be compilations of various artists -- some set the album artist to the individual artist.

naggie commented 9 years ago

Going to make a classmethod on Album that tries to match or create an album:

Try desperately to match an existing album, and only as a last resort create a new one:

  1. Try to match by MBID if it's there (probably isn't, yet)
  2. Look for an exact match of Album/Album artist if there is one (replace with better cover art if possible)
  3. Try matching by cover art hash if there is one (and replace "Album artist" with "Various artists")
  4. Finally, make a new one.
  5. None
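
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the actual crates code: the in-memory `ALBUMS` list stands in for whatever storage the project uses, the field names and the `match_or_create` name are hypothetical, and the "replace with better cover art" part of step 2 is left out.

```python
from dataclasses import dataclass
from typing import List, Optional

VARIOUS_ARTISTS = "Various artists"


@dataclass
class Album:
    name: str
    album_artist: str = VARIOUS_ARTISTS
    mbid: Optional[str] = None        # MusicBrainz ID, if known
    cover_hash: Optional[str] = None  # digest of the cover art, if any

    @classmethod
    def match_or_create(
        cls,
        name: str,
        album_artist: Optional[str] = None,
        mbid: Optional[str] = None,
        cover_hash: Optional[str] = None,
    ) -> "Album":
        # 1. Match by MusicBrainz ID when the track carries one.
        if mbid:
            for album in ALBUMS:
                if album.mbid == mbid:
                    return album
        # 2. Exact match on album name and album artist.
        if album_artist:
            for album in ALBUMS:
                if album.name == name and album.album_artist == album_artist:
                    return album
        # 3. Match by cover art hash; tracks that agree only on cover art
        #    are treated as a compilation, so the artist becomes "Various artists".
        if cover_hash:
            for album in ALBUMS:
                if album.cover_hash == cover_hash:
                    album.album_artist = VARIOUS_ARTISTS
                    return album
        # 4. Last resort: create a new album.
        album = cls(name, album_artist or VARIOUS_ARTISTS, mbid, cover_hash)
        ALBUMS.append(album)
        return album


# Hypothetical in-memory store standing in for the real database.
ALBUMS: List[Album] = []
```

With this ordering, two tracks from the same compilation that carry different per-track album artists but identical cover art end up on one album labelled "Various artists".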

Also, possible filter on the Album name: Using a regex, remove substrings like:

....that's assuming it's not useful to split by physical disk. It might be useful.
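
The example substrings did not survive in the comment above, so the pattern below is purely an assumption based on the mention of physical disks: it strips a trailing marker such as "(Disc 1)", "[CD 2]" or "- Disk 3". `DISC_SUFFIX` and `normalise_album_name` are hypothetical names.

```python
import re

# Assumed pattern: an optional "-", "(" or "[", then "disc", "disk" or "cd",
# a number, and an optional closing bracket, all at the end of the name.
DISC_SUFFIX = re.compile(
    r"\s*[-(\[]?\s*\b(?:disc|disk|cd)\s*\d+\s*[)\]]?\s*$",
    re.IGNORECASE,
)


def normalise_album_name(name: str) -> str:
    """Strip a trailing disc marker so multi-disc sets share one album name."""
    return DISC_SUFFIX.sub("", name).strip()
```

If splitting by physical disk turns out to be useful, the number could be captured and stored separately instead of being thrown away.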

naggie commented 9 years ago

Implemented.


Ha! It worked