Is there a way to output `Float32Array` instead of `UInt16Array`?

Yahweasel / libav.js

This is a compilation of the libraries associated with handling audio and video in ffmpeg—libavformat, libavcodec, libavfilter, libavutil, libswresample, and libswscale—for emscripten, and thus the web.

Is there a way to output `Float32Array` instead of `UInt16Array`? #12

maximedupre closed this issue 1 year ago

maximedupre commented 1 year ago

I'm using the decode function from this test.

It seems like ret.data (ret being a LibAVJS.Frame) is always a Uint16Array. I think this is what distorts the bass in my audio, possibly because low-frequency information is lost when converting from Uint16Array to Float32Array 😶

I'm using this snippet I found for conversion:

function int16ToFloat32V2(inputArray: Uint16Array): Float32Array {
    const float32Array = new Float32Array(inputArray.length);

    for (let i = 0; i < inputArray.length; i++) {
        float32Array[i] = inputArray[i] / 32768;
    }

    return float32Array;
}

Is there a way to tell the lib to output [Float32Array, Float32Array] for ret.data?

The lib seems to have this constant AV_SAMPLE_FMT_FLT: number;, but I'm unsure if it's useful or where to use it 😁

Thanks in advance for any guidance you might provide 😊🙏

Edit 1

I need Float32Array because apparently, that's what the Web Audio API expects (e.g. channel data for an AudioBuffer)

Yahweasel commented 1 year ago

The decoder isn't actually doing any conversion here, and that's the issue you're running into. You're getting a Uint16Array because your original data was 16-bit. Pass in, for instance, an Opus file and you'll get 32-bit floats; you'll get different data types depending on your input format.

To get libav to give you an arbitrary format of your choosing, you need to filter it. See, for instance, https://github.com/Yahweasel/libav.js/blob/4894d5459ad9d14bd8a05fdcfc21582ae788de1a/tests/test-filtering-audio-simple.js#L55 . If all you care about is the data format (and sample rate, number of channels, etc.), then the only filter you need is "anull", as the format conversion filter is added automatically. ff_init_filter_graph takes the input format specification and the output format specification, but every component of the input format specification is optional, so you can just specify what you need for the output and it will convert anything you throw at it.
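A minimal sketch of that approach, modeled on the linked test. The helper name convertToFloat and the LibAVLike and FilterSettings interfaces are hypothetical; they list only the members used here, and their shapes are assumptions inferred from the test rather than the library's own type definitions, so check them against the libav.js version in use.

```typescript
// Only the libav.js members this sketch touches. The shapes are assumptions
// based on the linked test, not a copy of the library's type definitions.
interface FilterSettings {
    sample_rate?: number;
    sample_fmt?: number;
    channel_layout?: number;
}

interface LibAVLike {
    AV_SAMPLE_FMT_FLT: number;
    ff_init_filter_graph(
        filters: string,
        input: FilterSettings,
        output: FilterSettings
    ): Promise<[number, number, number]>;
    ff_filter_multi(
        srcCtx: number,
        sinkCtx: number,
        framePtr: number,
        frames: unknown[],
        fin?: boolean
    ): Promise<unknown[]>;
}

// Hypothetical helper: run decoded frames through a pass-through ("anull")
// graph whose output format is interleaved 32-bit float.
async function convertToFloat(
    libav: LibAVLike,
    framePtr: number,       // an AVFrame allocated by the caller
    frames: unknown[],      // frames returned by the decoder
    input: FilterSettings,  // what the decoder produced (may be partial)
    sampleRate: number,
    channelLayout: number   // channel mask, e.g. 3 for stereo
): Promise<unknown[]> {
    const [, srcCtx, sinkCtx] = await libav.ff_init_filter_graph(
        "anull",
        input,
        {
            sample_rate: sampleRate,
            sample_fmt: libav.AV_SAMPLE_FMT_FLT,
            channel_layout: channelLayout,
        }
    );
    // `true` marks the end of input so the graph flushes remaining samples.
    return libav.ff_filter_multi(srcCtx, sinkCtx, framePtr, frames, true);
}
```

The linked test passes the decoder's own sample rate and sample format as the input settings, which is why the sketch takes them as a parameter. Freeing the filter graph afterwards is omitted here.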

Yahweasel commented 1 year ago

Oh, by the way, I think the reason you were losing information with your simple conversion is that your input samples were unsigned integers, so you would actually want (inputArray[i] - 32768) / 32768, but letting libav do the conversion in a filter is still usually the better way.
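For reference, a sketch of the manual conversion with that correction applied. Both function names are hypothetical. The first assumes the samples really are unsigned (offset binary, with 32768 as silence), as suggested above; the second covers the other possibility, that the bits are signed 16-bit samples that merely arrived in a Uint16Array. Which one applies depends on how the source data is actually encoded.

```typescript
// Unsigned 16-bit samples: 0 maps to -1.0, 32768 to 0.0, 65535 to just under 1.0.
function uint16ToFloat32(input: Uint16Array): Float32Array {
    const out = new Float32Array(input.length);
    for (let i = 0; i < input.length; i++) {
        out[i] = ((input[i] ?? 0) - 32768) / 32768;
    }
    return out;
}

// Signed 16-bit samples stored in a Uint16Array: reinterpret the same bytes
// as Int16Array (no copy), then scale to [-1.0, 1.0).
function int16BitsToFloat32(input: Uint16Array): Float32Array {
    const signed = new Int16Array(input.buffer, input.byteOffset, input.length);
    const out = new Float32Array(signed.length);
    for (let i = 0; i < signed.length; i++) {
        out[i] = (signed[i] ?? 0) / 32768;
    }
    return out;
}
```

Dividing the raw unsigned values by 32768 without the offset, as in the original snippet, yields values in [0, 2) rather than [-1, 1), which puts most of the signal outside the range the Web Audio API expects.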

maximedupre commented 1 year ago

@Yahweasel Thank you so much for the help 🥹! Using the test-filtering-audio-simple.js test that you referenced I was able to figure out how to convert my data to Float32Array 🙏

However, it was interleaved data instead of planar data, but isolating the two channels was easy enough 😁💪

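// `frames` holds the filtered frames (interleaved stereo Float32Array data);
// `buffer` is the destination AudioBuffer.
let offset = 0; // write position in `buffer`, in samples per channel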

for (const f of frames) {
    const leftChannelData = new Float32Array(f.data.length / 2);
    const rightChannelData = new Float32Array(f.data.length / 2);

    for (let i = 0; i < f.data.length; i += 2) {
        leftChannelData[i / 2] = f.data[i];
        rightChannelData[i / 2] = f.data[i + 1];
    }

    buffer.copyToChannel(leftChannelData, 0, offset);
    buffer.copyToChannel(rightChannelData, 1, offset);

    offset += f.data.length / 2;
}

Yahweasel commented 1 year ago

You can, in fact, have libav deinterleave as well. Use FLTP instead of FLT, and it'll be a Float32Array[] instead of a Float32Array.

By sheer coincidence, I recently had to write some code that does the same thing, so if you'd like another example, here's one: https://github.com/ennuicastr/rtennui/blob/f3b5167cba63fc546d318773752bfe13aa2325d1/src/audio-capture.ts#L364
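With planar output (FLTP), the copy loop from the earlier comment gets simpler, because each frame already carries one Float32Array per channel. A minimal sketch, assuming each frame's data is a Float32Array[] as described above; the interfaces ChannelSink and PlanarFrame and the function copyPlanarFrames are hypothetical names, with ChannelSink standing in for the one AudioBuffer method used.

```typescript
// Stand-in for the part of AudioBuffer this sketch needs.
interface ChannelSink {
    copyToChannel(source: Float32Array, channel: number, startInChannel?: number): void;
}

// Assumed shape of a frame filtered to a planar float format.
interface PlanarFrame {
    data: Float32Array[];
}

// Copy every plane of every frame into the matching channel of `buffer`,
// returning the number of samples written per channel.
function copyPlanarFrames(frames: PlanarFrame[], buffer: ChannelSink): number {
    let offset = 0;
    for (const f of frames) {
        f.data.forEach((plane, channel) => {
            buffer.copyToChannel(plane, channel, offset);
        });
        // All planes in a frame have the same length; use the first one.
        offset += f.data[0]?.length ?? 0;
    }
    return offset;
}
```

In a browser, an AudioBuffer can be passed directly as the second argument, since it provides copyToChannel with this signature.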