beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License
12.9k stars 1.82k forks source link

AIFF support #250

Closed skunark closed 10 years ago

skunark commented 11 years ago

I would be stoked if beets would import AIFF files.

sampsyo commented 11 years ago

I'm fine with this in principle, but Mutagen doesn't currently support AIFF. That will need to be solved first. Also, more fundamentally: is there a standard for metadata in AIFF files? If so, what is it (ID3 or something else)?

It's also worth asking: why store music in AIFF format when lossless compression like FLAC is available?

skunark commented 11 years ago

I believe the AIFF uses ID3vx, but here is a pointer to the standard http://muratnkonar.com/aiff/index.html. I've recently learned that latest iTunes has a bug formatting the file which I was hoping would be the tags and beets would come to the rescue.

As for AIFF over FLAC, I can come up with a few usability reasons for AIFF:

Reasons for FLAC:

Reasons for ALAC:

sampsyo commented 11 years ago

Thanks for reasoning through that!

That page doesn't make any mention of ID3 or other metadata standards, as far as I can tell. And some cursory Googling suggests that AIFF and WAV don't incorporate any space for metadata information. Do you have software that does keep metadata in AIFFs?

If, as it appears, AIFF does not support metadata, we can consider adding support for tracking this kind of file in the database, even if we can't write any tags in them.

skunark commented 11 years ago

iTunes, VLC, mediainfo and some id3* tools all can read the metadata from AIFF files. As an end-user, I've always thought AIFF supported metadata and was the only advantage over a WAV file.

I think supporting WAV and any format in a database would also be useful for the purists and might have value for those not wanting modify the files after the rip.

sampsyo commented 11 years ago

Good to know; thanks. This will take some investigation to determine how those other applications store tags in AIFFs.

Support for metadata-less formats could be a separate ticket. (But note that we already support not modifying files -- the import.write config option will let you leave your files pristine.)

On Apr 17, 2013, at 12:03 PM, skunark notifications@github.com wrote:

iTunes, VLC, mediainfo and some id3* tools all can read the metadata from AIFF files. As an end-user, I've always thought AIFF supported metadata and was the only advantage over a WAV file.

I think supporting WAV and any format in a database would also be useful for the purists and might have value for those not wanting modify the files after the rip.

— Reply to this email directly or view it on GitHub.

nogweii commented 11 years ago

I was curious myself, so I did some searching around. It seems that AIFF uses ID3, according to Beatport and this. Seems like it's a bit buggy and incomplete. (Various unsupported ID3 tags in AIFF readers that work in mp3s.) No mention on what version of ID3, but I'd assume v2. The ID3 website also includes a blurb from iTunes: http://id3.org/iTunes%20Normalization%20settings

sampsyo commented 11 years ago

Thanks! Good find. If that's the case, we'll need to (a) figure out how other software is embedding ID3 tags into AIFF files, and (b) add support to Mutagen for doing so.

Not that it's particularly relevant, but just to get on the ol' soapbox for a moment: ID3 is a truly terrible metadata format. Its unnecessary complexity the cause of many headaches in beets (and Mutagen, evidently). We're stuck with it by historical accident, and it would be a shame if we helped perpetuate the format when simpler, faster, more flexible standards exist. So if there's any choice here (i.e., interoperability with other software is not so important), it would be great if we could choose anything other than ID3 to keep supporting.

evanpurkhiser commented 10 years ago

I'm going to try and do some looking around for this and see if there are any python libraries out there for reading and writing ID3 tags to AIFF.

Unfortunately for DJs I think AIFF is the only lossless format that the Pioneer Rekordbox software will write metadata too. Though I could be wrong. But this is now the one thing holding me back for using beets exclusively for my DJ collection.

sampsyo commented 10 years ago

Great; thanks for looking into it. FWIW, it would be nice to reuse as much of Mutagen as possible (possibly landing a patch against Mutagen itself) to avoid separate codepaths for ID3 formatting in AIFF and other formats.

Also, it's worth reiterating that ID3-on-AIFF is nonstandard: it's possible that every program that supports this does it in a slightly different way. So if you have them, sample AIFF files with tags added by Rekordbox could help elucidate the situation.

evanpurkhiser commented 10 years ago

Alright. I've started to dig into this a little bit. I think you're right that it would be a good idea to land a patch against Mutagen itself.

From what I understand, the AIFF audio format uses the Interchange File Format, which is made up of 'chunks' of data. Each chunk has a header identifier (a four character identifier), the size of the chunk, and then the data itself. If I'm not mistaken, chunks are arbitrary, though it seems there are some standard types.

This document has a lot of good information

So I think this is how it works, every AIFF audio file will always start with a FORM chunk which from what i can tell just contains the AIFF identifier and then all of the sub chunks.

Here are the sub chunks:

The important thing to note, is that we can just skip through these chunks by following the skipping forward each size to the next chunk until we get to the ID3 chunk. In here is our ID3 data, which should be able to be parsed just like a MP3s ID3 data.

I found a implementation of the chunk skipping here

I've started looking at the Mutagen code. It's a little bit not what I would have expected though, the logic for getting to the ID3 data is in the ID3 class, instead of being abstracted to each file type.

Here's an example AIFF file

Initial Key      : 4B
Genre            : Hardcore
Title            : Get Ready
Artist           : Hoodzie

As for ID3 not being a standard. I can't tell? According to wikipedia the AIFF format is the only format that widely uses ID3 tags. Also the wikipedia page for the AIFF format list the ID3 chunk, so it seems to be common enough?

It does seem like you can store some metadata about the AIFF file in the first COMM chunk, in fact I did see that in one of my files. But I imagine most programs probably look for the ID3 tags.

sampsyo commented 10 years ago

Cool! Thanks for investigating. It sounds like a chunk containing ID3 data is the way to go.

In Mutagen, there's an id3 module containing the generic ID3 manipulation stuff and then specific per-format modules (e.g., an MP3 class) that reuse that ID3 stuff. I think a sound strategy would be to implement a new module that, like MP3, inherits from ID3FileType. I'm not sure how, but we'll then need to tell Mutagen how to find and manipulate the ID3 chunk.

Make sense?

evanpurkhiser commented 10 years ago

Here is my early attempt at a patch.

There's a problem right now where (on line 163 in the patch) the ID3 propery on the AIFF class isn't getting set to the AIFFID3 class and is instead reverting to the base ID3 class. If I move the AIFFID3 class into the AIFF class it works ok.

I'll tinker with it some more later tonight and try and add test cases and any needed documentation.

Other problems:

evanpurkhiser commented 10 years ago

Hi again everyone!

I've just finished up adding test cases for my patch against Mutagen. I think it should fully support reading and writing ID3 tags to AIFF files now. Here are all of the cases that it handles:

Now my question for anyone who might be able to answer (@sampsyo?). What is the proper process for submitting this patch for code review. I'm very used to GitHubs pull requests, but it looks like the project is officially hosted on Google Code (and mirrored on BitBucket). Should I fork the repository on BitBucket and make a pull request there. Or is there something else I can do?

Here is the patch. Hopefully we can find someone to put it where comments can be made line by line.

Looking forward to seeing AIFF support in beets! :smile:

skunark commented 10 years ago

Wow, lots of good stuff here.

Now, a favor :) iTunes oddly populates the ID3 chunk, pun intended, which violates the AIFF spec which requires chunks to alway be even (all according the MPD developers). It would be amazing, if beets would correct this, typically you can see this issue after ripping one or two albums. If the file size is an odd number then you have an ID3 tag issue. Music Player Daemon isn’t a huge fan of AIFF with these odd chunks or ALAC with random glitches between songs that would make any person and house pet cry. I now maintain a second flac music database just for MPD.

I can pay in beer :) (root beer if you are under 21)

Thanks,

Jim

On Apr 24, 2014, at 10:33 PM, Evan Purkhiser notifications@github.com wrote:

Hi again everyone!

I've just finished up adding test cases for my patch against Mutagen. I think it should fully support reading and writing ID3 tags to AIFF files now. Here are all of the cases that it handles:

Reading ID3 tags from the ID3 chunk of course! Updating ID3 tags in the ID3 chunk. It will expand the chunk if nessicary and update the ID3 header size field and FORM header size field. Adding ID3 tags to a AIFF file that has no ID3 chunk. Again it handles updating the FORM header size field after inserting the new chunk. Removing ID3 tags from the AIFF file, done by completely deleting the ID3 chunk. Now my question for anyone who might be able to answer (@sampsyo?). What is the proper process for submitting this patch for code review. I'm very used to GitHubs pull requests, but it looks like the project is officially hosted on Google Code (and mirrored on BitBucket). Should I fork the repository on BitBucket and make a pull request there. Or is there something else I can do?

Here is the patch. Hopefully we can find someone to put it where comments can be made line by line.

Looking forward to seeing AIFF support in beets!

— Reply to this email directly or view it on GitHub.

evanpurkhiser commented 10 years ago

Do you have an example AIFF file from iTunes that is broken like you describe? That's silly that iTunes doesn't follow the spec since the spec was published by Apple!

skunark commented 10 years ago

Here is the example I provide for MPD, I can provide more if needed. Typically if you use MPD, just ripping one or two albums will do the trick. I’ve since moved on to python audio tools for FLAC duties with AccurateRip checks.

Troubled File: https://docs.google.com/file/d/0Bx20GGdQw_yQb0pZbl9TdDdRU0U/edit?pli=1

MPD bug report: http://bugs.musicpd.org/view.php?id=3743

Jim

On Apr 24, 2014, at 10:51 PM, Evan Purkhiser notifications@github.com wrote:

Do you have an example AIFF file from iTunes that is broken like you describe? That's silly that iTunes doesn't follow the spec since the spec was published by Apple!

— Reply to this email directly or view it on GitHub.

evanpurkhiser commented 10 years ago

Awesome! Thanks. The file is indeed missing the last pad byte required by the spec.

The good news is that it's missing the byte right at the end of the file if it was missing it before one of the chunks then it would be much more difficult to fix. Assuming iTunes only forgets to pad the byte at the end then it should be trivial to write a single null byte to the end of the file if it's uneven. It could probably even be a simple plugin to for beets.

This would probably make sense to have a new issue opened for it once AIFF support makes it's way into beets =)

sampsyo commented 10 years ago

This is totally awesome, @EvanPurkhiser! This seems fully functional to me, especially since it's got a substantial and very readable test suite.

We should coordinate with @lazka now. I believe the best way will be to open a pull request on BitBucket, which fortunately works mostly the same as GitHub (fork, push, press the PR button). I'd be happy to help with review and such; hopefully we can wrap this up quickly!

evanpurkhiser commented 10 years ago

Pull request opened =)

andrewgdunn commented 10 years ago

@skunark would you elaborate further about python audio tools for ripping?

evanpurkhiser commented 10 years ago

Mutagen 1.23 has been released with AIFF support. All that's left is for it to be integrated into beets!

LordSputnik commented 10 years ago

I've also pulled this into MutagenX 1.23 (available on PyPI). Most of the code was already pretty compatible, so nice job on that! Main change was that I added a few functions to validate chunk names a little more strictly.