beetbox / mediafile

elegant audio file tagging
http://mediafile.readthedocs.io/
MIT License
97 stars 24 forks source link

Convert mb_artistid and mb_albumartistid to lists #46

Closed JuniorJPDJ closed 3 years ago

JuniorJPDJ commented 3 years ago

Resolves #43.

JuniorJPDJ commented 3 years ago

Rebased on top of master.

sampsyo commented 3 years ago

Alright, so this is the tricky one. On the surface, it's a simple change; unfortunately, the interaction with reality is complicated.

Probably most importantly (and as discussed elsewhere), changing the type of these existing fields would break current software that depends on these fields—most importantly, it would break beets (and beets plugins). This change would be tricky to get right everywhere, especially considering third-party plugins that might rely on the string behavior. Providing a second interface to get list-style access to the same data would be awesome.

Second, there is a broader issue here about how to deal with multi-value tags in MediaFile. It would be really great to come up with some sort of consistent way to expose list-formatted data. However, there are many complicating factors:

So generally speaking, there is a deeper problem to be solved here about how to expose both list-shaped and non-listy versions of data—for different use cases and different situations. I honestly don't know what to do about it. But I confess some discomfort with introducing lists of data without more clarity on what the grand plan is.

JuniorJPDJ commented 3 years ago

Actual problem with those values is that files are already tagged with this fields as multi-valued by most of software and I'm not sure why those are NOT lists at the first place. Musicbrainz itself specifies those tags as multi-valued as you need to write information about all artist IDs to the same tag. mediafile seems to expose only one of all values in lists (first one AFAIR) in case of most of formats and some concatenated strings in case of ID3v2.3 (as shown in #43) - this behaviour is bad and inconsistent.

Most fields can be made into lists for most formats, but not all. So a blanket policy of making things lists would end up with weird exceptions.

Is there any format beyond ID3v<=2.3 that doesn't support those? This was the only format I had problems with when testing this tbh and those two fields have already applied solution for problem with ID3v<=2.3.

Some fields don't feel like they would meaningfully want to be lists, but exactly which is a frustratingly human judgment call. (Should multiple titles be supported? Multiple release dates?)

I don't get what you are talking about now - fields you mentioned are not lists and are not supposed to be lists. Only few of fields are made to be lists and part of those were already implemented before I tried to deal with multi-valued fields ;O

In sake of backwards compatibility I probably could rename those fields to mb_artistids and mb_albumartistids and provide backward-compatible versions of those in similar fashion like genre and genres are created, but IMO it still would be bad (better than present implementation tho).

I'm not sure what will happen if you for example overwrite first value using single-value tag when list would contain more than one value. Will it remove other values? Leave other values untouched and just change first? Second option creates loads of bugs in end software.

But I confess some discomfort with introducing lists of data without more clarity on what the grand plan is.

There's no grand plan ;D I just wanted to fix bug I found as I gonna use mediafile in my software (and probably try to convince Funkwhale devs to use it too).

And yes, this is a bug - mediafile was interpreting files tagged by other software (in case of my tests - Picard) in inconsistent and not intended way, probably also destroying already tagged files (I haven't checked it tho).

EDIT: Just checked if it is destroying tagged files and it's not! It's not that bad! It seems not to touch tags which are not changed by user. I'll probably need to check the same in case of first value from list tag version when implemented.

>>> mf = MediaFile('/home/juniorjpdj/temp/Szpadyzor crew.flac')
>>> mf.mb_artistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.save()
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ARTISTID']
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> mf.mb_artistid = "test"
>>> mf.mb_artistid
'test'
>>> mf.save()
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ARTISTID']
['test']
>>> mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ALBUMARTISTID']
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
JuniorJPDJ commented 3 years ago

I just renamed lists to plural forms and added .single_field() fields on singular fields. It looks like even compatibility with behaviour shown above is kept!

>>> mf.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid = '164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b']

I also rolled back tests as my changes are not needed anymore as main tags work as before.

What do you think about it now?

EDIT: it also works for ID3v2.3 so it fixes weird behaviour:

>>> from mediafile import MediaFile
>>> f = MediaFile('/home/juniorjpdj/temp/ID3v230_Szpadyzor crew.mp3')
>>> f.mb_artistid
'164ed128-525d-4798-883f-be13502de86b'
>>> f.mb_artistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157',
 '680a88be-9f13-4487-add6-e0367841194d',
 'e5f3eaf9-f76b-4fee-ba4f-3208e53587c1']
>>> f.mb_albumartistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
>>> f.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'
>>> f.mb_artistids
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157',
 '680a88be-9f13-4487-add6-e0367841194d',
 'e5f3eaf9-f76b-4fee-ba4f-3208e53587c1']
f.mb_artistid = "kek"
f.mb_artistids
['kek']
JuniorJPDJ commented 3 years ago

.