udoprog / audio

A crate for working with audio in Rust
Apache License 2.0

Abstracting over sample format? #34

Open HEnquist opened 11 months ago

HEnquist commented 11 months ago

I frequently get questions about how to use my resampling library to resample audio data that is in integer format, often i16 but it varies. My resampler (like most other non-trivial signal processing) must work on float samples. So I need an abstraction layer that lets me read and write float samples to and from input and output buffers, no matter what layout and sample format the actual data buffer is using. Using audio buffers for input and output solves this for the layout, but it doesn't help with the sample format. Are there plans to support this in audio? Is it even possible with the current design?

I have experimented with implementing a solution for this, which ended up as this: https://github.com/HEnquist/audioadapter-rs This solves the problem, but with the obvious downside that it's a completely different solution than the audio buffers. It does however implement simple wrappers for audio buffers, so a project using audioadapter-rs would be able to also use audio buffers.

Initially I was planning on using audio buffers, but since it only solves half my problem I'm not sure about that any more. So what are the plans? I also see that there hasn't been much activity here lately which is a little worrying.
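As a rough illustration of the kind of abstraction layer asked for above, here is a minimal sketch of a view that reads and writes f32 values directly on interleaved i16 storage, without copying to an intermediate buffer. The type `InterleavedI16` and its methods are hypothetical names invented for this sketch; they are not part of audio, rubato, or audioadapter-rs, and the scaling by 32768 is one common convention rather than the only one.

```rust
/// Hypothetical float view over interleaved i16 data (illustration only).
pub struct InterleavedI16<'a> {
    data: &'a mut [i16],
    channels: usize,
}

impl<'a> InterleavedI16<'a> {
    /// Wrap a slice holding `channels` interleaved channels.
    pub fn new(data: &'a mut [i16], channels: usize) -> Self {
        assert!(channels > 0 && data.len() % channels == 0);
        Self { data, channels }
    }

    /// Number of frames (samples per channel) in the buffer.
    pub fn frames(&self) -> usize {
        self.data.len() / self.channels
    }

    /// Read one sample as f32 in [-1.0, 1.0), or None if out of bounds.
    pub fn read(&self, channel: usize, frame: usize) -> Option<f32> {
        if channel >= self.channels {
            return None;
        }
        let raw = *self.data.get(frame * self.channels + channel)?;
        Some(raw as f32 / 32768.0)
    }

    /// Write one f32 sample, clamping to the i16 range.
    /// Returns false if the position is out of bounds.
    pub fn write(&mut self, channel: usize, frame: usize, value: f32) -> bool {
        if channel >= self.channels {
            return false;
        }
        let scaled = (value * 32768.0).round().clamp(-32768.0, 32767.0);
        match self.data.get_mut(frame * self.channels + channel) {
            Some(slot) => {
                *slot = scaled as i16;
                true
            }
            None => false,
        }
    }
}
```

A processing routine written against `read` and `write` only ever sees floats, while the caller keeps its data as i16. This covers a single layout and a single sample format; generalizing over both is the open question in this thread.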

udoprog commented 11 months ago

So the audio crate can't help you abstract over all supported sample formats, because you have sample-format-specific accelerators for your use case which this crate can't implement. However, you can probably add more internal implementations to provide a broader overlap. And if there is an overlap in traits, you can use rubato with any sample implementing audio::Sample and vice versa.

This crate is careful to put as few bounds as possible on audio buffers (note that they're usually either T: Copy or T: Sample, which have broad implementations). If there is an interoperability problem (beyond the crate being in alpha release) I'd like to hear about it.

HEnquist commented 11 months ago

Rubato will never support anything other than floats internally. I don't think it's unique in that sense; I rather think this is a pretty common limitation for things doing non-trivial math. What I need is a way to seamlessly convert whatever format the samples are stored in to floats (and then from floats to whatever the user needs afterwards). Things get extra annoying when the original format is something like 24-bit integers. They may be stored without padding as three bytes, or with padding as four. Both are very common. I suppose [u8; 3] can be used as a sample in audio, but how do you do any sort of math with it?

Of course it's always possible to copy and convert samples to a new buffer before using them, but this is exactly what I'm hoping to avoid.
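To make the 24-bit case concrete, here is a small sketch of converting between f32 and the two storage variants mentioned above. The function names are hypothetical, and two assumptions are baked in: the bytes are little-endian, and in the padded variant the padding byte is the last (most significant) one. Real data may differ on both points.

```rust
/// Scale factor for signed 24-bit samples: 2^23.
const I24_SCALE: f32 = 8_388_608.0;

/// Decode a packed little-endian 24-bit sample (three bytes, no padding).
pub fn i24_le_packed_to_f32(bytes: [u8; 3]) -> f32 {
    // Put the three bytes in the top of an i32, then shift back down
    // arithmetically so the sign bit is extended.
    let value = i32::from_le_bytes([0, bytes[0], bytes[1], bytes[2]]) >> 8;
    value as f32 / I24_SCALE
}

/// Decode a little-endian 24-bit sample stored in four bytes.
/// Assumes the fourth byte is padding; its content is ignored.
pub fn i24_le_padded_to_f32(bytes: [u8; 4]) -> f32 {
    // Shift the padding byte out, then sign-extend on the way back.
    let value = (i32::from_le_bytes(bytes) << 8) >> 8;
    value as f32 / I24_SCALE
}

/// Encode an f32 as a packed little-endian 24-bit sample, clamping on overflow.
pub fn f32_to_i24_le_packed(value: f32) -> [u8; 3] {
    let scaled = (value * I24_SCALE)
        .round()
        .clamp(-I24_SCALE, I24_SCALE - 1.0) as i32;
    let b = scaled.to_le_bytes();
    [b[0], b[1], b[2]]
}
```

Because each conversion touches a single sample, a wrapper can apply it on the fly while reading or writing, which avoids the copy to a separate float buffer.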

HEnquist commented 11 months ago

One way that could work would be to add a converting layer on top of audio. Then audio would take care of the data layout, and the converter would deal with converting the sample values. I have this conversion implemented for plain slices of integers here: https://github.com/HEnquist/audioadapter-rs/blob/master/src/integers.rs, and for slices of bytes here: https://github.com/HEnquist/audioadapter-rs/blob/master/src/bytes.rs I would like to implement something similar for audio buffers, both for buffers of plain integers and for buffers of byte arrays (for 24-bit ints etc). This would make the audio buffers considerably more useful IMO. I made a quick attempt but got stuck on the wrapping struct when I add a borrow of the audio buffer.

buf: &'a dyn Buf<Sample = [u8; $bytes]>,

This gives an error saying I also need to specify the associated types Channel and Iter. That would really limit things. Can something smarter be done?

HEnquist commented 11 months ago

I got the converting wrapper working, just needed to use an intermediate trait without associated types.

I tried using [u8; 3] as the sample type for an audio buffer, but that isn't supported. Is this something that could be added?
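The "intermediate trait without associated types" mentioned above might look roughly like the following sketch. Everything here is hypothetical: `Layout` is a simplified stand-in for a buffer trait with associated types (it is not the real audio::Buf), and `DynRead`, `Interleaved`, and `ToF32` are names made up for the illustration. The point is only that a blanket impl can erase the associated types, so the wrapper can hold a plain trait object.

```rust
use std::iter::{Copied, StepBy};
use std::slice::Iter;

/// Stand-in for a layout trait with associated types. Writing
/// `dyn Layout<Sample = i16>` is an error because `Frames` must also be named.
pub trait Layout {
    type Sample: Copy;
    type Frames: Iterator<Item = Self::Sample>;
    fn channel_count(&self) -> usize;
    fn frames(&self, channel: usize) -> Self::Frames;
}

/// Intermediate trait with no associated types, usable as `dyn DynRead<T>`.
pub trait DynRead<T> {
    fn channels(&self) -> usize;
    fn read(&self, channel: usize, frame: usize) -> Option<T>;
}

/// Every `Layout` is readable through the intermediate trait.
impl<L: Layout> DynRead<L::Sample> for L {
    fn channels(&self) -> usize {
        self.channel_count()
    }
    fn read(&self, channel: usize, frame: usize) -> Option<L::Sample> {
        if channel >= self.channel_count() {
            return None;
        }
        self.frames(channel).nth(frame)
    }
}

/// Simple interleaved i16 buffer used to exercise the traits.
pub struct Interleaved<'a> {
    pub data: &'a [i16],
    pub channels: usize,
}

impl<'a> Layout for Interleaved<'a> {
    type Sample = i16;
    type Frames = Copied<StepBy<Iter<'a, i16>>>;
    fn channel_count(&self) -> usize {
        self.channels
    }
    fn frames(&self, channel: usize) -> Self::Frames {
        let data: &'a [i16] = self.data;
        // step_by panics on a step of zero, so use at least one.
        data.get(channel..)
            .unwrap_or(&[])
            .iter()
            .step_by(self.channels.max(1))
            .copied()
    }
}

/// Converting wrapper that only needs the object-safe trait.
pub struct ToF32<'a> {
    buf: &'a dyn DynRead<i16>,
}

impl<'a> ToF32<'a> {
    pub fn new(buf: &'a dyn DynRead<i16>) -> Self {
        Self { buf }
    }
    /// Read one sample, scaled to [-1.0, 1.0).
    pub fn read(&self, channel: usize, frame: usize) -> Option<f32> {
        self.buf.read(channel, frame).map(|s| s as f32 / 32768.0)
    }
}
```

The trade-off in this sketch is that `read` goes through dynamic dispatch and, for this toy layout, walks an iterator per sample. A real implementation would likely index directly.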

udoprog commented 11 months ago

Definitely

HEnquist commented 11 months ago

See https://github.com/udoprog/audio/pull/35/files

HEnquist commented 11 months ago

Here is an example of using the converting wrapper of audioadapter-rs to do on the fly conversion from an i16 audio buffer to float:

        let data: [i16; 6] = [0, i16::MIN, 1 << 14, -(1 << 14), 1 << 13, -(1 << 13)];
        let buffer = wrap::interleaved(&data, 2);
        let converter: ConvertI16<&dyn Adapter<i16>, f32> =
            ConvertI16::new(&buffer as &dyn Adapter<i16>);
        assert_eq!(converter.read_sample(0, 0).unwrap(), 0.0);
        assert_eq!(converter.read_sample(1, 0).unwrap(), -1.0);

udoprog commented 11 months ago

We can definitely build out an equivalent Adapter / DynBuf trait which is object safe and forgoes all the things in Buf and the like that trait objects can't use (e.g. iterators returning associated types). Since we already have a Translate trait, that would seem like all that's needed for that use case, and maybe one or two convenience wrappers.

HEnquist commented 11 months ago

The Translate trait looks like it's built for translating between types. That's fine for simple numeric types, but things get tricky when a type can be more than one sample format, like [u8; 4]. That little array could be a whole bunch of different things, like a big-endian f32, a little-endian i32, or even an i24 with a padding byte. (This one is really common. There are even weirder things like i20 and i18 with 12 and 14 padding bits, but those are probably obsolete enough to ignore.) So there needs to be a way to specify what the [u8; 4] contains.
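One possible way to say what a [u8; 4] contains is an explicit format tag passed alongside the bytes, sketched below. The enum `FourByteFormat` and the function `decode` are hypothetical names for this illustration only and do not correspond to anything in audio or audioadapter-rs; the padded i24 variant assumes the padding is the last byte.

```rust
/// Hypothetical tag describing how four raw bytes should be interpreted.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FourByteFormat {
    /// IEEE 754 single-precision float, big-endian.
    F32Be,
    /// Signed 32-bit integer, little-endian.
    I32Le,
    /// Signed 24-bit integer, little-endian, followed by one padding byte.
    I24LePadded,
}

/// Decode the same four bytes into an f32 sample according to `format`.
pub fn decode(bytes: [u8; 4], format: FourByteFormat) -> f32 {
    match format {
        FourByteFormat::F32Be => f32::from_be_bytes(bytes),
        FourByteFormat::I32Le => {
            // Divide in f64 to avoid losing low bits before scaling.
            (i32::from_le_bytes(bytes) as f64 / 2_147_483_648.0) as f32
        }
        FourByteFormat::I24LePadded => {
            // Drop the padding byte and sign-extend the remaining 24 bits.
            let value = (i32::from_le_bytes(bytes) << 8) >> 8;
            value as f32 / 8_388_608.0
        }
    }
}
```

For instance, the bytes [0x3F, 0x00, 0x00, 0x00] decode to 0.5 as a big-endian f32, but to 63 / 2^31 as a little-endian i32 and to 63 / 2^23 as a padded i24, which is why the type alone cannot select the conversion. A tag could equally be a zero-sized marker type so the choice is made at compile time.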

HEnquist commented 11 months ago

Thanks for merging https://github.com/udoprog/audio/pull/35! Now I'm thinking of how to continue. Audio APIs often work with buffers of bytes, even when the sample format is something "easy" like i32. How about adding maybe a separate wrap_bytes! macro that transmutes the data from bytes to the actual format? It's not hard to do that before wrapping, but it could be nice to not have to do this sort of thing yourself:

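    // Create a view of the byte data as a slice of i32.
    // SAFETY: this is only sound if `byte_data` is aligned for i32.
    // Trailing bytes beyond a multiple of 4 are ignored, and the
    // samples are interpreted in native byte order.
    let data_view = unsafe {
        let ptr = byte_data.as_mut_ptr() as *mut i32;
        let len = byte_data.len();
        std::slice::from_raw_parts_mut(ptr, len / 4)
    };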
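A helper along these lines would probably want to check alignment and length before reinterpreting the bytes, since a `&mut [u8]` handed over by an audio API is not guaranteed to be aligned for i32. The sketch below shows two hypothetical helpers (neither is part of audio): `bytes_as_i32_mut` performs the checked reinterpretation in native byte order, and `read_i32_le` is a fully safe per-sample alternative with explicit endianness that works for any alignment.

```rust
/// View a byte buffer as i32 samples in native byte order.
/// Returns None if the buffer is misaligned for i32 or its length is not a
/// multiple of 4.
pub fn bytes_as_i32_mut(bytes: &mut [u8]) -> Option<&mut [i32]> {
    let len = bytes.len();
    let ptr = bytes.as_mut_ptr();
    if len % 4 != 0 || (ptr as usize) % std::mem::align_of::<i32>() != 0 {
        return None;
    }
    // SAFETY: the pointer is aligned for i32, the new slice covers exactly
    // the same `len` bytes, every bit pattern is a valid i32, and the
    // returned slice inherits the exclusive borrow of `bytes`.
    Some(unsafe { std::slice::from_raw_parts_mut(ptr.cast::<i32>(), len / 4) })
}

/// Read the `index`-th little-endian i32 sample without any unsafe code.
/// Works regardless of alignment; returns None if out of bounds.
pub fn read_i32_le(bytes: &[u8], index: usize) -> Option<i32> {
    let start = index.checked_mul(4)?;
    let end = start.checked_add(4)?;
    let chunk = bytes.get(start..end)?;
    Some(i32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}
```

The first helper gives a zero-copy view but can only handle the platform's native byte order; the second costs a small per-sample conversion but can decode a specific on-disk or on-wire format.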