Signalsmith-Audio / signalsmith-stretch

C++ polyphonic pitch/time library (GitHub mirror)
https://signalsmith-audio.co.uk/code/stretch/
MIT License

Sample ordering #2

Open lucmans opened 10 months ago

lucmans commented 10 months ago

Normally, when working with (a stream of) samples, the channels are interleaved (L1R1L2R2...LnRn). Your library expects all the samples of one channel to be contiguous (L1L2...LnR1R2...Rn).

In my experience, audio is mostly handled in the interleaved format, which forces devs using this lib to adapt their buffers (or how they access the contents). I would suggest adding an interleaved interface to your lib.
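For reference, the two layouts described above differ only in their index arithmetic. A minimal sketch, where both helper names are made up for illustration and are not part of the library:

```cpp
#include <cassert>
#include <cstddef>

// Position of one sample in an interleaved buffer (L1 R1 L2 R2 ... Ln Rn).
// Hypothetical helper, not part of signalsmith-stretch.
inline std::size_t interleaved_index(std::size_t channel, std::size_t frame,
                                     std::size_t n_channels) {
    assert(channel < n_channels);
    return frame * n_channels + channel;
}

// Position of the same sample when each channel is stored contiguously
// (L1 L2 ... Ln R1 R2 ... Rn). Hypothetical helper as well.
inline std::size_t planar_index(std::size_t channel, std::size_t frame,
                                std::size_t n_frames) {
    assert(frame < n_frames);
    return channel * n_frames + frame;
}
```

In a stereo buffer of 10 frames, the third sample of the right channel (channel 1, frame 2) sits at position 5 when interleaved and at position 12 when stored per channel.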

mdabbs commented 9 months ago

You could write a class that overloads the [] operator, takes the index passed to it, multiplies it by the channel count (i.e. doubles it for stereo), and then adds your channel index. I've thought about doing this, but since my samples are signed 16-bit ints, I chose not to go this way: the lib accesses the inputs and outputs multiple times, which means I would be scaling them back and forth multiple times. I chose the buffer-filling technique instead, because memory is cheaper than mults/divs and I already have to fill a buffer anyway, so I might as well scale each sample only once while filling it.
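The buffer-filling approach described above could look roughly like this. It is a sketch under assumptions: the function name is hypothetical, and the 1/32768 scale factor is one common convention for mapping 16-bit samples to floats, not something the library prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies interleaved signed 16-bit samples into one float buffer per channel,
// converting and scaling every sample exactly once.
// Hypothetical helper, not part of signalsmith-stretch.
std::vector<std::vector<float>> deinterleave_to_float(
        const std::int16_t* interleaved,
        std::size_t n_frames,
        std::size_t n_channels) {
    assert(n_channels > 0);
    constexpr float scale = 1.0f / 32768.0f;  // assumed convention: maps to [-1, 1)

    std::vector<std::vector<float>> channels(n_channels,
                                             std::vector<float>(n_frames));
    for (std::size_t frame = 0; frame < n_frames; ++frame) {
        for (std::size_t ch = 0; ch < n_channels; ++ch) {
            channels[ch][frame] = interleaved[frame * n_channels + ch] * scale;
        }
    }
    return channels;
}
```

The result is indexed as channels[channel][frame], so repeated reads of the same sample cost no further conversions. The trade-off is the extra memory for the per-channel copies.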

lucmans commented 9 months ago

If anyone is interested in a class which provides non-interleaved-style access to an interleaved buffer:

// exposes an interleaved sample buffer as a buffer where all samples of a channel are stored contiguously
class FakeDeinterlace {
    public:
        // helper class resolving the second subscript operator
        class ProxyDeinterlace {
            public:
                ProxyDeinterlace(const FakeDeinterlace& _p, const int _offset)
                    : p(_p), offset(_offset) {}

                float& operator[](const int index) {
                    return p.audio[(index * p.n_channels) + offset];
                }

            private:
                const FakeDeinterlace& p;
                const int offset;
        };

        FakeDeinterlace(float* const _audio, const int _n_channels)
            : audio(_audio), n_channels(_n_channels) {}

        ProxyDeinterlace operator[](const int channel) {
            return ProxyDeinterlace(*this, channel);
        }

    private:
        float* const audio;
        const int n_channels;
};

Note that it doesn't perform any bounds checking etc.

Sample code showing its behavior:

#include <cstdlib>  // EXIT_SUCCESS
#include <iostream>
#include <iterator>  // std::size()

int main(int argc, char *argv[]) {
    // channel-interleaved audio data
    // integer part represents the channel, the fractional part the index of the sample
    float audio[] = {
        1.0, 2.0,
        1.1, 2.1,
        1.2, 2.2,
        1.3, 2.3,
        1.4, 2.4,
        1.5, 2.5,
        1.6, 2.6,
        1.7, 2.7,
        1.8, 2.8,
        1.9, 2.9,
    };
    const int n_samples = std::size(audio);
    const int n_channels = 2;
    static_assert(n_samples % n_channels == 0, "not every channel has the same number of samples");
    const int n_samples_per_channel = n_samples / n_channels;

    const int offset = 3;  // simulate access in the middle of the sample data
    FakeDeinterlace fd(&audio[offset * n_channels], n_channels);
    for (int i = 0; i < n_channels; ++i) {
        for (int j = 0; j < n_samples_per_channel - offset; ++j) {
            std::cout << fd[i][j] << " ";
        }
        std::cout << std::endl;
    }

    return EXIT_SUCCESS;
}

The variable fd can also be passed to stretch.process().
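This presumably works because a consumer like process() only needs to index its buffers as [channel][sample]. The sketch below illustrates that with two hypothetical names: InterleavedView, a templated variant of the wrapper above that adds debug-build bounds checks via assert, and copy_with_gain, a stand-in consumer used so the example compiles without the library. Neither is part of signalsmith-stretch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bounds-checked variant of the wrapper above.
// The asserts are compiled out when NDEBUG is defined.
template <typename Sample>
class InterleavedView {
public:
    // View of a single channel: steps through the buffer with a fixed stride.
    class Channel {
    public:
        Channel(Sample* first, std::size_t stride, std::size_t n_frames)
            : first_(first), stride_(stride), n_frames_(n_frames) {}

        Sample& operator[](std::size_t frame) const {
            assert(frame < n_frames_);
            return first_[frame * stride_];
        }

    private:
        Sample* first_;
        std::size_t stride_;
        std::size_t n_frames_;
    };

    InterleavedView(Sample* audio, std::size_t n_channels, std::size_t n_frames)
        : audio_(audio), n_channels_(n_channels), n_frames_(n_frames) {}

    Channel operator[](std::size_t channel) const {
        assert(channel < n_channels_);
        return Channel(audio_ + channel, n_channels_, n_frames_);
    }

private:
    Sample* audio_;
    std::size_t n_channels_;
    std::size_t n_frames_;
};

// Stand-in consumer (hypothetical): touches its buffers only through
// [channel][frame] indexing, so any type offering that access can be passed.
template <typename Inputs, typename Outputs>
void copy_with_gain(Inputs&& inputs, Outputs&& outputs,
                    std::size_t n_channels, std::size_t n_frames, float gain) {
    for (std::size_t c = 0; c < n_channels; ++c) {
        for (std::size_t i = 0; i < n_frames; ++i) {
            outputs[c][i] = inputs[c][i] * gain;
        }
    }
}
```

Because the view takes the frame count up front, an out-of-range channel or frame index trips an assert in debug builds instead of silently reading past the buffer. Using InterleavedView<const float> for the input side keeps the source buffer read-only.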