Closed skunark closed 10 years ago
I'm fine with this in principle, but Mutagen doesn't currently support AIFF. That will need to be solved first. Also, more fundamentally: is there a standard for metadata in AIFF files? If so, what is it (ID3 or something else)?
It's also worth asking: why store music in AIFF format when lossless compression like FLAC is available?
I believe the AIFF uses ID3vx, but here is a pointer to the standard http://muratnkonar.com/aiff/index.html. I've recently learned that latest iTunes has a bug formatting the file which I was hoping would be the tags and beets would come to the rescue.
As for AIFF over FLAC, I can come up with a few usability reasons for AIFF:
Reasons for FLAC:
Reasons for ALAC:
Thanks for reasoning through that!
That page doesn't make any mention of ID3 or other metadata standards, as far as I can tell. And some cursory Googling suggests that AIFF and WAV don't incorporate any space for metadata information. Do you have software that does keep metadata in AIFFs?
If, as it appears, AIFF does not support metadata, we can consider adding support for tracking this kind of file in the database, even if we can't write any tags in them.
iTunes, VLC, mediainfo and some id3* tools all can read the metadata from AIFF files. As an end-user, I've always thought AIFF supported metadata and was the only advantage over a WAV file.
I think supporting WAV and any format in a database would also be useful for the purists and might have value for those not wanting modify the files after the rip.
Good to know; thanks. This will take some investigation to determine how those other applications store tags in AIFFs.
Support for metadata-less formats could be a separate ticket. (But note that we already support not modifying files -- the import.write config option will let you leave your files pristine.)
On Apr 17, 2013, at 12:03 PM, skunark notifications@github.com wrote:
iTunes, VLC, mediainfo and some id3* tools all can read the metadata from AIFF files. As an end-user, I've always thought AIFF supported metadata and was the only advantage over a WAV file.
I think supporting WAV and any format in a database would also be useful for the purists and might have value for those not wanting modify the files after the rip.
— Reply to this email directly or view it on GitHub.
I was curious myself, so I did some searching around. It seems that AIFF uses ID3, according to Beatport and this. Seems like it's a bit buggy and incomplete. (Various unsupported ID3 tags in AIFF readers that work in mp3s.) No mention on what version of ID3, but I'd assume v2. The ID3 website also includes a blurb from iTunes: http://id3.org/iTunes%20Normalization%20settings
Thanks! Good find. If that's the case, we'll need to (a) figure out how other software is embedding ID3 tags into AIFF files, and (b) add support to Mutagen for doing so.
Not that it's particularly relevant, but just to get on the ol' soapbox for a moment: ID3 is a truly terrible metadata format. Its unnecessary complexity the cause of many headaches in beets (and Mutagen, evidently). We're stuck with it by historical accident, and it would be a shame if we helped perpetuate the format when simpler, faster, more flexible standards exist. So if there's any choice here (i.e., interoperability with other software is not so important), it would be great if we could choose anything other than ID3 to keep supporting.
I'm going to try and do some looking around for this and see if there are any python libraries out there for reading and writing ID3 tags to AIFF.
Unfortunately for DJs I think AIFF is the only lossless format that the Pioneer Rekordbox software will write metadata too. Though I could be wrong. But this is now the one thing holding me back for using beets exclusively for my DJ collection.
Great; thanks for looking into it. FWIW, it would be nice to reuse as much of Mutagen as possible (possibly landing a patch against Mutagen itself) to avoid separate codepaths for ID3 formatting in AIFF and other formats.
Also, it's worth reiterating that ID3-on-AIFF is nonstandard: it's possible that every program that supports this does it in a slightly different way. So if you have them, sample AIFF files with tags added by Rekordbox could help elucidate the situation.
Alright. I've started to dig into this a little bit. I think you're right that it would be a good idea to land a patch against Mutagen itself.
From what I understand, the AIFF audio format uses the Interchange File Format, which is made up of 'chunks' of data. Each chunk has a header identifier (a four character identifier), the size of the chunk, and then the data itself. If I'm not mistaken, chunks are arbitrary, though it seems there are some standard types.
This document has a lot of good information
So I think this is how it works, every AIFF audio file will always start with a FORM
chunk which from what i can tell just contains the AIFF
identifier and then all of the sub chunks.
Here are the sub chunks:
COMM
chunk (common) which is metadata about the audio file. This contains things like number of channels and sample rate.SSND
chunk which contains all of the actual sound data.The important thing to note, is that we can just skip through these chunks by following the skipping forward each size to the next chunk until we get to the ID3 chunk. In here is our ID3 data, which should be able to be parsed just like a MP3s ID3 data.
I found a implementation of the chunk skipping here
I've started looking at the Mutagen code. It's a little bit not what I would have expected though, the logic for getting to the ID3 data is in the ID3 class, instead of being abstracted to each file type.
Here's an example AIFF file
Initial Key : 4B
Genre : Hardcore
Title : Get Ready
Artist : Hoodzie
As for ID3 not being a standard. I can't tell? According to wikipedia the AIFF format is the only format that widely uses ID3 tags. Also the wikipedia page for the AIFF format list the ID3 chunk, so it seems to be common enough?
It does seem like you can store some metadata about the AIFF file in the first COMM
chunk, in fact I did see that in one of my files. But I imagine most programs probably look for the ID3 tags.
Cool! Thanks for investigating. It sounds like a chunk containing ID3 data is the way to go.
In Mutagen, there's an id3 module containing the generic ID3 manipulation stuff and then specific per-format modules (e.g., an MP3 class) that reuse that ID3 stuff. I think a sound strategy would be to implement a new module that, like MP3, inherits from ID3FileType
. I'm not sure how, but we'll then need to tell Mutagen how to find and manipulate the ID3 chunk.
Make sense?
Here is my early attempt at a patch.
There's a problem right now where (on line 163 in the patch) the ID3 propery on the AIFF class isn't getting set to the AIFFID3 class and is instead reverting to the base ID3 class. If I move the AIFFID3 class into the AIFF class it works ok.
I'll tinker with it some more later tonight and try and add test cases and any needed documentation.
Other problems:
delete
works properly letHi again everyone!
I've just finished up adding test cases for my patch against Mutagen. I think it should fully support reading and writing ID3 tags to AIFF files now. Here are all of the cases that it handles:
ID3
header size field and FORM
header size field.ID3
chunk. Again it handles updating the FORM
header size field after inserting the new chunk.Now my question for anyone who might be able to answer (@sampsyo?). What is the proper process for submitting this patch for code review. I'm very used to GitHubs pull requests, but it looks like the project is officially hosted on Google Code (and mirrored on BitBucket). Should I fork the repository on BitBucket and make a pull request there. Or is there something else I can do?
Here is the patch. Hopefully we can find someone to put it where comments can be made line by line.
Looking forward to seeing AIFF support in beets! :smile:
Wow, lots of good stuff here.
Now, a favor :) iTunes oddly populates the ID3 chunk, pun intended, which violates the AIFF spec which requires chunks to alway be even (all according the MPD developers). It would be amazing, if beets would correct this, typically you can see this issue after ripping one or two albums. If the file size is an odd number then you have an ID3 tag issue. Music Player Daemon isn’t a huge fan of AIFF with these odd chunks or ALAC with random glitches between songs that would make any person and house pet cry. I now maintain a second flac music database just for MPD.
I can pay in beer :) (root beer if you are under 21)
Thanks,
Jim
On Apr 24, 2014, at 10:33 PM, Evan Purkhiser notifications@github.com wrote:
Hi again everyone!
I've just finished up adding test cases for my patch against Mutagen. I think it should fully support reading and writing ID3 tags to AIFF files now. Here are all of the cases that it handles:
Reading ID3 tags from the ID3 chunk of course! Updating ID3 tags in the ID3 chunk. It will expand the chunk if nessicary and update the ID3 header size field and FORM header size field. Adding ID3 tags to a AIFF file that has no ID3 chunk. Again it handles updating the FORM header size field after inserting the new chunk. Removing ID3 tags from the AIFF file, done by completely deleting the ID3 chunk. Now my question for anyone who might be able to answer (@sampsyo?). What is the proper process for submitting this patch for code review. I'm very used to GitHubs pull requests, but it looks like the project is officially hosted on Google Code (and mirrored on BitBucket). Should I fork the repository on BitBucket and make a pull request there. Or is there something else I can do?
Here is the patch. Hopefully we can find someone to put it where comments can be made line by line.
Looking forward to seeing AIFF support in beets!
— Reply to this email directly or view it on GitHub.
Do you have an example AIFF file from iTunes that is broken like you describe? That's silly that iTunes doesn't follow the spec since the spec was published by Apple!
Here is the example I provide for MPD, I can provide more if needed. Typically if you use MPD, just ripping one or two albums will do the trick. I’ve since moved on to python audio tools for FLAC duties with AccurateRip checks.
Troubled File: https://docs.google.com/file/d/0Bx20GGdQw_yQb0pZbl9TdDdRU0U/edit?pli=1
MPD bug report: http://bugs.musicpd.org/view.php?id=3743
Jim
On Apr 24, 2014, at 10:51 PM, Evan Purkhiser notifications@github.com wrote:
Do you have an example AIFF file from iTunes that is broken like you describe? That's silly that iTunes doesn't follow the spec since the spec was published by Apple!
— Reply to this email directly or view it on GitHub.
Awesome! Thanks. The file is indeed missing the last pad byte required by the spec.
The good news is that it's missing the byte right at the end of the file if it was missing it before one of the chunks then it would be much more difficult to fix. Assuming iTunes only forgets to pad the byte at the end then it should be trivial to write a single null byte to the end of the file if it's uneven. It could probably even be a simple plugin to for beets.
This would probably make sense to have a new issue opened for it once AIFF support makes it's way into beets =)
This is totally awesome, @EvanPurkhiser! This seems fully functional to me, especially since it's got a substantial and very readable test suite.
We should coordinate with @lazka now. I believe the best way will be to open a pull request on BitBucket, which fortunately works mostly the same as GitHub (fork, push, press the PR button). I'd be happy to help with review and such; hopefully we can wrap this up quickly!
@skunark would you elaborate further about python audio tools for ripping?
Mutagen 1.23 has been released with AIFF support. All that's left is for it to be integrated into beets!
I've also pulled this into MutagenX 1.23 (available on PyPI). Most of the code was already pretty compatible, so nice job on that! Main change was that I added a few functions to validate chunk names a little more strictly.
I would be stoked if beets would import AIFF files.