beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

Split track metadata from file metadata #1640

Open pprkut opened 8 years ago

pprkut commented 8 years ago

Currently beets treats every track as exactly one physical file. That is, track metadata and file metadata end up in one table in the database and are used together in the code (more or less). Examples of track metadata include artist name, track title, album name, MusicBrainz IDs, etc., whereas examples of file metadata include format, sample rate, location, etc.

From my perspective it would be beneficial for a number of features to split file metadata from track metadata, but at the core it all revolves around "duplicates".

Not every duplicate is a duplicate. Maybe someone wants to keep both Vorbis/MP3/AAC and FLAC versions of releases in his collection. That means we have different file metadata for every file (obviously), but the track metadata is the same. Keeping it separate could mean we only need to fetch remote metadata (echonest, musicbrainz, etc) once and write it to multiple files. (I think currently we fetch it for every track, even if we've already fetched it before. I might be wrong though.) It also means more logical name collision handling. Right now if I have a FLAC and a Vorbis incarnation of a release in my collection, album disambiguation kicks in as it sees them as potentially conflicting, although there really shouldn't be a conflict (unless I'm missing something).

With multiple versions of a track available, it would be really neat to then play with "suitability", i.e. creating playlists based on what suits the intended use case best. For example, preferring FLAC over lossy when playing audio on the local computer, but preferring lossy over FLAC when streaming, or disregarding FLAC completely when copying to an MP3 player that doesn't support it. I'm sure some people can come up with other use cases :-)
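To make the "suitability" idea concrete, here is a minimal sketch of picking one file per track based on the use case. Nothing in it is existing beets code: `FileVersion`, `PREFERENCES`, `pick_version` and the use-case names are all hypothetical, and the preference orders are just the examples from the paragraph above.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical record for one physical file of a track; not a beets class.
@dataclass(frozen=True)
class FileVersion:
    path: str
    format: str  # e.g. "FLAC", "MP3", "OGG"


# Hypothetical preference lists per use case, most preferred format first.
# A format missing from a list is treated as unsuitable for that use case.
PREFERENCES = {
    "local": ["FLAC", "OGG", "MP3"],      # prefer lossless on the local machine
    "streaming": ["OGG", "MP3", "FLAC"],  # prefer lossy when streaming
    "mp3_player": ["MP3"],                # device that cannot play FLAC
}


def pick_version(versions: List[FileVersion], use_case: str) -> Optional[FileVersion]:
    """Return the most suitable file for use_case, or None if none qualifies."""
    order = PREFERENCES[use_case]
    candidates = [v for v in versions if v.format in order]
    if not candidates:
        return None
    # The lowest index in the preference list wins.
    return min(candidates, key=lambda v: order.index(v.format))
```

With a FLAC and an MP3 version of the same track, `pick_version(versions, "local")` would select the FLAC file and `pick_version(versions, "streaming")` the MP3 file; a track that only exists as FLAC would yield `None` for `"mp3_player"`.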

There's also a nice side effect for potential future video support. Sometimes it's handy to keep not only the video file itself, but also an audio version of the track at hand. Again the track metadata doesn't change, but splitting off the file metadata would eventually allow us to keep a video file next to an audio file, and you could again play around with when to prefer which incarnation of the track.

sampsyo commented 8 years ago

Interesting! The complement to this would be allowing multiple tracks corresponding to the same file, as in #136.

This would be fun to explore. It would be a big change, though, that would have to start with rethinking the architecture.

twrightsman commented 7 years ago

It may be useful to follow the architecture of MusicBrainz since they are already doing the hard conceptual thinking on how to best represent musical metadata. The only thing beets would have to do on top of that is manage the actual music files.

sampsyo commented 7 years ago

Hi, @twrightsman—what specifically would you want to adopt from the MB data model?

twrightsman commented 7 years ago

I actually misspoke earlier: beets already does a great job making Items synonymous with recordings and Albums synonymous with releases. My suggestion is that beets has to take the MB data model one step further since it is actually managing the audio data in addition to the metadata. To propose an implementation like @pprkut suggested, the Item object would have to be decoupled from File objects, for example. A given Item object can have many File objects associated with it and these File objects are what store data about the file format, location, bit rate, etc.

Now, the question is how would this work with existing commands/plugins? Unfortunately, I feel like this would have to be addressed per command/plugin because what was only one object before (Item) is now conceptually two (Item or File). For example, beet ls would only need to list Items and not necessarily all of their File objects (although a user might want a flag to list all File objects tied to each track). beet ls -p would need to either have a configuration option to favor a specific format, so that only one path is printed per Item, or maybe just print the paths for all File objects in the database. Does this sound reasonable so far?
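A rough sketch of the Item/File decoupling and the two possible `beet ls -p` behaviours described above. This is illustrative only: the `Item` and `File` classes here are simplified stand-ins and not the actual beets classes, `list_paths` and `preferred_format` are hypothetical names, and falling back to the first file when the preferred format is missing is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Simplified stand-in for per-file data (location, format, ...).
@dataclass
class File:
    path: str
    format: str


# Simplified stand-in for track-level data; one Item can own many Files.
@dataclass
class Item:
    title: str
    files: List[File] = field(default_factory=list)


def list_paths(items: List[Item], preferred_format: Optional[str] = None) -> List[str]:
    """Collect paths for a listing.

    Without preferred_format, every File path of every Item is returned.
    With preferred_format, at most one path per Item is returned: the file
    in that format if present, otherwise (assumption) the first file.
    """
    paths = []
    for item in items:
        if preferred_format is None:
            paths.extend(f.path for f in item.files)
            continue
        if not item.files:
            continue  # an Item without any File contributes no path
        chosen = next(
            (f for f in item.files if f.format == preferred_format),
            item.files[0],
        )
        paths.append(chosen.path)
    return paths
```

For an Item with a FLAC and an MP3 file, `list_paths([item])` would return both paths, while `list_paths([item], "MP3")` would return only the MP3 path.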

sampsyo commented 7 years ago

Got it; thanks! Yes, you are definitely on the right track. In my view, the main challenge here is keeping things simple: namely, avoiding the complexity of a file/track separation when it's not necessary.

pprkut commented 7 years ago

@sampsyo and I discussed the implementation of this briefly on IRC a while back. I still have the notes and general plan here, I just needed to get some other things sorted out first before I could tackle this effectively. Almost there though :) The idea was to split this up into smaller steps. First, extend the documentation we have on the database side of things so that changes to that part become easier in general. Next, implement the item/file split on the storage layer only, without exposing the switch to the user yet. That way we already store file information separately, but don't allow multiple files per item yet. Allowing that would then be the last step, although there are probably many places to touch here, so this last step is probably going to be split up further as well.
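The intermediate "storage layer only" step could look roughly like the sketch below, using Python's `sqlite3` module. The schema is heavily simplified and assumed for illustration; it is not the real beets schema, and `create_legacy_schema` and `split_file_metadata` are hypothetical helpers. The `UNIQUE` constraint on `item_id` is one way to keep exactly one file per item until the final step lifts that restriction.

```python
import sqlite3


def create_legacy_schema(conn: sqlite3.Connection) -> None:
    """Create a simplified single-table layout (assumed, not the real schema)."""
    conn.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT, artist TEXT,"                # track metadata
        " path TEXT, format TEXT, samplerate INTEGER"  # file metadata
        ")"
    )


def split_file_metadata(conn: sqlite3.Connection) -> None:
    """Copy the file-level columns of `items` into a separate `files` table.

    UNIQUE(item_id) enforces one file per item for now; dropping it later
    would allow several files per item.
    """
    conn.executescript(
        """
        CREATE TABLE files (
            id INTEGER PRIMARY KEY,
            item_id INTEGER NOT NULL UNIQUE REFERENCES items(id),
            path TEXT NOT NULL,
            format TEXT,
            samplerate INTEGER
        );
        INSERT INTO files (item_id, path, format, samplerate)
            SELECT id, path, format, samplerate FROM items;
        """
    )
```

After the split, code that needs both kinds of metadata would join the two tables on `files.item_id = items.id`, so user-visible behaviour can stay the same while the data is already stored separately.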

xthursdayx commented 5 years ago

@pprkut I know that this issue is now listed as closed, but I was wondering if you'd made any progress addressing it? Specifically in relation to keeping multiple formats of the same track in a library (e.g. MP3 and FLAC). Thanks!

pprkut commented 5 years ago

As far as I can see the issue is still open :-)

I didn't make any progress on this, unfortunately, as performance work was more important on my side for now. But it is very high up on my priority list of features to work on. I'll get to this eventually, unless someone else beets (sorry, couldn't resist ;-) ) me to it.

xthursdayx commented 5 years ago

> As far as I can see the issue is still open :-)

Oops, I was looking at that "Closed" flag on the issue #3036 :-|

Thanks for the update; I wish I knew Python well enough to be any help!

exislow commented 2 years ago

What is the status of this issue? Is this on the roadmap?