mackron / dr_libs

Audio decoding libraries for C/C++, each in a single source file.
Other
1.24k stars 205 forks source link

dr_wav: Support for full metadata read/write #170

Closed SamWindell closed 3 years ago

SamWindell commented 3 years ago

This patch adds support for some extra wav metadata. These chunks are often useful for audio-editing software, software synthesizers and samplers. They are read and stored in the same fashion as the smpl was.

In addition to this, each of these new chunks (and the smpl chunk) now have a bool to say if the chunk was found and correctly read. Previously there was no way to tell if the chunk was valid or not.

I'd like to be able to get the LIST chunk, (with its subchunks) with dr_wav too. https://sites.google.com/site/musicgapi/technical-documents/wav-file-format

But it will require string allocations probably, or use fixed buffers (which is a perhaps a little limiting). Any ideas on how to go about that?

mackron commented 3 years ago

Thanks for this! I'll review this ASAP. Just a quick question - I saw you've limited the cue count to a maximum of 4. Is that a realistic scenario? (I don't do anything with cue points.) Certainly in the past I've rejected the idea of explicitly supporting cue points because I was concerned at the requirement for a heap allocation in the initialization routine.

SamWindell commented 3 years ago

Yeah, I was just thinking that the user would set their own define for the max cue points. But thinking about it further it's probably not going to be a good-enough solution. It could be anywhere from 1 to 50 in extreme cases. The smpl loop isn't ideal either though, it's almost always 1 but occasionally 2 or 3.

Perhaps this is all a bit ad-hoc for handling these chunks. Maybe a whole new system should be made. An extra step would be to have this new system allow for writing of chunks too.

Something like an opt-in system where the user says what types of chunks they are interested in. Then a single allocation is done and filled with all the asked-for chunk data and saved for them to access later on. What do you think? I might have a go at this.

mackron commented 3 years ago

Thanks for all your work on this. Is this ready for review yet or are you still working on things? Are you able to give me a high level overview of the changes?

SamWindell commented 3 years ago

No worries, I wanted these features for a project of mine too. I think there were a couple more minor things I want to take a look at first. I'll get back to you in a few days with an overview.

SamWindell commented 3 years ago

Ok here we go. In total, these changes add support for all the common types of wav metadata to be read and written. This is only for standard wav format and rf64 wav. Honestly, I haven't delved into WAV64 at all.

For reading metadata, there are new drwav_init functions that end with _with_metadata. These functions take a u32 int which is a bit-field used to represent the types of metadata that you are interested in. So by 'bitwise-or'ing together the drwav_metadata_type constants you can specify that you only care about smpl and cue metadata for example; this saves a time and memory.

The reading works by first iterating over all the chunks in the wav file in a 'counting' mode - counting the number of bytes the parser will need to malloc for the metadata. This data is then allocated in a single block. It then resets the cursor back to the start and iterates over the chunks again in a 'reading' mode, filling up the memory with the data from the chunks. This 2-pass approach I don't think is too slow and it has the advantage of only needing a single allocation which is easy to manage for the user.

The rough structure of this malloced block is that the drwav_metadata objects come first, and then after it are the various other required bits of data such as null-terminated strings. The drwav_metadata struct contains a union of all possible metadata types so can be easily handled as an array.

After the init_with_metadata call, the user just reads the metadata from a field in the drwav object. I found that I wanted this metadata to persist longer than drwav, so I added an API for take_ownership. If you don't take ownership of it, it is freed in drwav_unint.

Writing is pretty simple. There is a new function, drwav_init_write_with_metadata. You pass an array of metadata to it. It writes all metadata items into the file. It doesn't change the data and doesn't free anything.

A couple of unrelated things: apologies, a whitespace edit got committed into this. My editor removed the trailing whitespace from all lines. I also found a bug I believe - one that I am pretty sure is not related to these metadata changes. The bug is to do with calculating the size of a riff file. a066b8b. This fix should change the already existing utility function, drwav_target_write_size_bytes. Do you have any tests you can run as well?

I'm happy to make changes based on your review.

mackron commented 3 years ago

Thanks a lot for this work and your writeup - I appreciate the amount of work that went into this. I've only taken a cursory look so far, but there's a few styling things that will need to change. Just little things like open curly braces on functions and structs not being on a separate line, and "numMetadata" naming instead of "metadataCount" which is the convention used in all of dr_libs.

I'm happy with the _with_metadata() idea and it's consistent with dr_flac. Just a thought, though - I wonder if we really need the onChunk callback in this case considering the existence of drwav_unknown_metadata which, I'm assuming, acts as the catch-all fallback? If we were to remove that, the only different between the _ex and non-_ex versions is the flags parameter, in which case I reckon we could just not bother about the _ex version entirely and move the flags parameter over to the non-ex version. (I still want to keep the _ex version as-is for the base drwav_init() function, though.) Your thoughts on that?

I'm also happy with the malloc(), so long as it's just one bulk allocation (which it looks like you've done) and it's restricted only to _with_metadata(). It looks like drwav_init__private() is checking pWav->allowedMetadataTypes for non-0 which is the guard for this, I'm assuming? So the idea is that drwav_init() will just set this to 0, thereby skipping the metadata parsing stuff and in turn the malloc()? The 2-pass approach is fine and is my preference.

No worries about the whitespace stuff - I should probably do a whitespace strip of all my libraries anyway. Also no worries about the W64 stuff. That's just a niche thing that can be dealt with when the need arises. There's also RF64 which is the same deal.

There's one thing I'm still undecided on, though, and that's the metadata type filtering. Is the performance stuff an actual real-world problem with measurable and significant consequences, or is it a handwavey hypothetical problem? I'm not entirely sure, but I usually err towards the latter with that kind of stuff. I don't completely hate the idea or anything, it's just that API simplicity is always at the top (or near the top) of my priorities.

Just a quick error a noticed in my scan. The drwav_init_with_metadata() declaration in the header section is missing the drwav_uint64 allowedMetadataTypes parameter (last parameter).

As I go through this in detail I'll go ahead and push changes to your branch - I assume you're fine with that? And I assume you're fine with all of this going into public domain?

SamWindell commented 3 years ago

Thanks a lot for this work and your writeup - I appreciate the amount of work that went into this. I've only taken a cursory look so far, but there's a few styling things that will need to change. Just little things like open curly braces on functions and structs not being on a separate line, and "numMetadata" naming instead of "metadataCount" which is the convention used in all of the dr_libs.

No worries, thanks for writing the dr_libs!

I'm happy with the _with_metadata() idea and it's consistent with dr_flac. Just a thought, though - I wonder if we really need the onChunk callback in this case considering the existence of drwav_unknown_metadata which, I'm assuming, acts as the catch-all fallback? If we were to remove that, the only different between the _ex and non-_ex versions is the flags parameter, in which case I reckon we could just not bother about the _ex version entirely and move the flags parameter over to the non-ex version. (I still want to keep the _ex version as-is for the base drwav_init() function, though.) Your thoughts on that?

You're right - let's simplify that a little and drop the _ex version. The unknown metadata type is a good alternative to onChunk. Is the flags parameter just used for specifying sequential mode? If so, then it actually won't serve any purpose because the metadata parser can't work if it's not allowed to seek backwards through the file.

I'm also happy with the malloc(), so long as it's just one bulk allocation (which it looks like you've done) and it's restricted only to _with_metadata(). It looks like drwav_init__private() is checking pWav->allowedMetadataTypes for non-0 which is the guard for this, I'm assuming? So the idea is that drwav_init() will just set this to 0, thereby skipping the metadata parsing stuff and in turn the malloc()? The 2-pass approach is fine and is my preference.

All correct!

There's one thing I'm still undecided on, though, and that's the metadata type filtering. Is the performance stuff an actual real-world problem with measurable and significant consequences, or is it a handwavey hypothetical problem? I'm not entirely sure, but I usually err towards the latter with that kind of stuff. I don't completely hate the idea or anything, it's just that API simplicity is always at the top (or near the top) of my priorities.

Yeah we could do away with it and it would simplify the API. I'm happy for that to happen. The performance aspect is indeed hypothetical, there are a few large-ish chunks out there but I doubt it will make any difference at all. I guess one nice thing about filtering the types is that you can specify which types you understand and care about and then after you have done processing a wav file, you can pass that same metadata back to the write function, without having to filter out the things you don't care about yourself. Perhaps a niche case though! I believe we will still need to keep an enum of all possible metadata types somewhere though. It will be needed for determining the metadata union type.

Just a quick error a noticed in my scan. The drwav_init_with_metadata() declaration in the header section is missing the drwav_uint64 allowedMetadataTypes parameter (last parameter).

Yep sounds like something I'd missed.

As I go through this in detail I'll go ahead and push changes to your branch - I assume you're fine with that? And I assume you're fine with all of this going into public domain?

Yep to both! Let me know if you need anything else.

mackron commented 3 years ago

OK, cool, it looks like we agree on most things. I'm still undecided on the filtering thing, but I'll make a decision. Will review and update this branch as soon as I get a chance.

SamWindell commented 3 years ago

BTW, just fixed a little compile error to do with the alignof macro when compiling the header as C++ with Clang.

mackron commented 3 years ago

I haven't yet had a chance to look at this, I'm sorry (my mind is focused on some miniaudio stuff right now). However, that DRWAV_ALIGNOF macro is not something I'll be accepting into the code base. I know it's not your fault or anything, but that C++ code is absolutely disgusting and is something I refuse to maintain and accept into my code base. What's that macro used for, and why is it necessary? I would rather come up with a solution to remove it entirely.

SamWindell commented 3 years ago

No rush on my part, I'm happy using my own fork for the time being.

Fair enough. I'd personally just use the C++11 or C11 alignof operator, but I understand this library needs to work on older compilers too. The macro is used to ensure the structs are correctly aligned within the single malloc block for the metadata. The block contains strings that can be any length as well as structs so I'm pretty sure some sort of alignment is needed. I guess the allocations for strings could be rounded up to a multiple of 16 bytes? I can't say for sure if that would cut it though.

mackron commented 3 years ago

I've finally got around to reviewing this one. I've synchronized this with the dev branch which had some conflicts. Hopefully I didn't accidentally break anything with that. I've made the following changes:

There will be changes here that will affect your own code base so just be careful about synchronizing until you've got a proper chance to take a look. I'll probably push a few more little updates to this branch soon.

Thanks for your work on this.

SamWindell commented 3 years ago

Great stuff! Thanks for reviewing it.

I'll add this new version into my project probably on the weekend and have a test.

SamWindell commented 3 years ago

All seems to be working well. I've added your changes to my project and run my tests. They certainly don't cover every aspect of testing the metadata, but it's something.

I found a bug and fixed it. It wasn't related to your changes.

mackron commented 3 years ago

Thanks for giving that a test. It's no problem if we can't test every case - the community can let us know of any issues. This changes a public facing API so I'll be bumping the minor version number. I've created a branch for version 0.13 called dev-wav-0.13 which is where this PR will be merged into. It shouldn't be too long before I release this proper.

mackron commented 3 years ago

I've merged this work into the dev branch. The next release of dr_wav will be version 0.13.0, and it'll include this. Indeed, this is probably going to be the only new thing in that release. Thanks again for your work on this!