beetbox / mediafile

elegant audio file tagging
http://mediafile.readthedocs.io/
MIT License
100 stars 25 forks source link

mb_artistid and mb_albumartistid should be lists #43

Closed JuniorJPDJ closed 3 years ago

JuniorJPDJ commented 3 years ago

Hey! When trying to fix my PR (#42) I found error I can't fix myself.

mb_artistid and mb_albumartistid should be lists! mediafile's version:

>>> mf = MediaFile('/home/juniorjpdj/temp/Szpadyzor crew.flac')
>>> mf.mb_artistid
'164ed128-525d-4798-883f-be13502de86b'
>>> mf.mb_albumartistid
'164ed128-525d-4798-883f-be13502de86b'

Picard is tagging it like that (mutagen dump): FLAC: 'musicbrainz_artistid': ['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157'], MP3 (ID3v2.3): 'TXXX:MusicBrainz Artist Id': TXXX(encoding=<Encoding.UTF16: 1>, desc='MusicBrainz Artist Id', text=['ca780457-5bf6-4a7d-95fa-0d0d4920f26f/258a67d1-172d-4d16-8d6c-09efde60896f/af262d86-542d-4864-8527-083065166d6c']),

Fixing FLAC parsing is easy I can't make it work in MP3 case.

Current code is returning only the first author in case of FLAC and '/' separated string in case of MP3.

Helping fixing this will also help me solve some problems with #42 ;D

JuniorJPDJ commented 3 years ago
mutagen.File('/home/juniorjpdj/temp/Szpadyzor crew.flac')['MUSICBRAINZ_ARTISTID']
Out[39]: 
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
mutagen.File('/home/juniorjpdj/temp/ID3v240_Szpadyzor crew.mp3')['TXXX:MusicBrainz Album Artist Id']
Out[40]: TXXX(encoding=<Encoding.UTF8: 3>, desc='MusicBrainz Album Artist Id', text=['164ed128-525d-4798-883f-be13502de86b', 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157'])
mutagen.File('/home/juniorjpdj/temp/ID3v230_Szpadyzor crew.mp3')['TXXX:MusicBrainz Album Artist Id']
Out[41]: TXXX(encoding=<Encoding.UTF16: 1>, desc='MusicBrainz Album Artist Id', text=['164ed128-525d-4798-883f-be13502de86b/ded6b91b-26d8-4bc0-8aa9-fab9b54cd157'])

Main problem is with ID3v2.3 separator not being parsed by mutagen?

JuniorJPDJ commented 3 years ago

I successfully got results like this with own class for DescList ID3 on MP3:

from mediafile import MediaFile
MediaFile('/home/juniorjpdj/temp/ID3v240_Szpadyzor crew.mp3').mb_albumartistid
Out[3]: 
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
MediaFile('/home/juniorjpdj/temp/ID3v230_Szpadyzor crew.mp3').mb_albumartistid
Out[4]: ['164ed128-525d-4798-883f-be13502de86b/ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']
MediaFile('/home/juniorjpdj/temp/Szpadyzor crew.flac').mb_albumartistid
Out[5]: 
['164ed128-525d-4798-883f-be13502de86b',
 'ded6b91b-26d8-4bc0-8aa9-fab9b54cd157']

but I still don't know how to handle separator in ID3v2.3 ;/

sampsyo commented 3 years ago

Ah, interesting! Is there any chance that we just need to do the splitting ourselves for ID3v2.3? Or else we could just give up—and say that multiple items are not supported pre-2.4?

JuniorJPDJ commented 3 years ago

For this fields splitting shouldn't be worst idea actually, data in array elements shouldn't contain any /. It can't be used as general solution tho as other fields could contain / in element. Eg. artists field could contain 'AC/DC' which would be handled as 2 separate artists if splitting was made.

We could add some magic bool argument to MP3ListDescStorageStyle from #42 which would decide if we should allow splitting this particular field or not. What do you think about this idea? Of course splitting should be enabled only on ID3v2.3, not on later, not on older tags.

I'm just still wondering if there's a way to make mutagen do it for us. Is #21 suggesting it's possible?

sampsyo commented 3 years ago

Indeed—it would only be appropriate to this hack "on our end" for these ID fields, and not in general.

I took a look at #21 and, unless I'm misreading something, it sounds like the Mutagen docs say that v23_sep only affects saving, not reading… although I could be misunderstanding. It might be good to grep through Mutagen for v23_sep to see if that is in fact the case.

JuniorJPDJ commented 3 years ago

You seem to be right. I'll do this in #42 then ;D

JuniorJPDJ commented 3 years ago

I'll just hardcode / separator ATM but it should be changed when handling #21

JuniorJPDJ commented 3 years ago

It's gonna get it's own PR - I removed fix from #42

JuniorJPDJ commented 3 years ago

The IDs-as-lists change could be problematic, however, and needs some careful consideration.

Well, ATM it's anyway returning only first id from list eg. when tagged by Picard which is badly broken behaviour.

Maybe we could go with something similar to genre and genres? It seems workaroundish to me, but I get your point with backwards compatibility.

However mediafile is still not v1.0 tho so maybe it should be changed now and not later? ;p If you are following semver spec - you are allowed to do any API changes in minor releases prior to v1.0 anyway.

JuniorJPDJ commented 3 years ago

Rebased PR is ready to review.