eshaz / wasm-audio-decoders

Browser and NodeJS Web Assembly audio decoder libraries that are highly optimized for size and performance.

partial content streaming & frameLength #85

Closed felixroos closed 1 year ago

felixroos commented 1 year ago

Hello and first of all thank you for creating this library, it is really easy to use and super fast!

I am running into a problem when trying to manually stream an mp3 file that is located on a server, using multiple fetch requests with equally sized Range headers. For each request, I am decoding the buffer using decode, then turning it into an audio buffer for playback on the web:

// fetch arrayBuffer inside range
let { arrayBuffer } = await fetchAudioBuffer(ac, audioUrl, start, end);
// decode array buffer and convert it to an audio buffer
const { samplesDecoded, sampleRate, audioBuffer } = await arrayBufferToAudioBuffer(arrayBuffer);
// estimate how many bytes of the file the decoded samples correspond to
const totalBytes = (bitRate / 8 / sampleRate) * samplesDecoded;

That all works well, except that samplesDecoded differs from request to request, although my Range headers have a fixed size (except maybe the last one, but let's ignore that for now). For example, if I request 40000 bytes per request, the value of samplesDecoded varies, and correspondingly, totalBytes oscillates around the 40000 mark.
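For reference, here is a minimal sketch of how equally sized, non-overlapping ranges could be computed. The helper name `byteRanges` and the 100000-byte file size are assumptions for illustration; they do not come from the code above. The one detail worth checking is that the end position of an HTTP Range header is inclusive, so a 40000-byte request starting at 0 is `bytes=0-39999`.

```javascript
// Hypothetical helper: split a file of `totalBytes` into inclusive byte ranges.
function byteRanges(totalBytes, chunkSize) {
  const ranges = [];
  for (let start = 0; start < totalBytes; start += chunkSize) {
    // HTTP Range end positions are inclusive, hence the -1.
    const end = Math.min(start + chunkSize, totalBytes) - 1;
    ranges.push({ start, end, header: `bytes=${start}-${end}` });
  }
  return ranges;
}

// Assumed example: a 100000-byte file fetched in 40000-byte chunks.
const ranges = byteRanges(100000, 40000);
for (const r of ranges) console.log(r.header);
```

Each `header` string would then be passed as the `Range` request header of the corresponding fetch. If consecutive ranges overlap or skip a byte, the decoder receives a corrupted stream.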

If I play the buffers sequentially, I get glitches, which are probably due to missing data at the transitions between chunks. I don't fully understand how the mp3 format works, but I've learned that the data is split into frames. I assume the problem is that my byte range does not align with the "grid" of frames: if start falls in the middle of a frame, the decode method probably begins with the next frame and throws away the end of the partial frame. The same probably happens with the last frame when end falls in the middle of a frame.
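To make the frame "grid" concrete, here is a small worked example. It assumes a constant-bitrate MPEG-1 Layer III file at 128 kbps and 44.1 kHz. Those values are illustrative assumptions, not properties of the file in question. In that format each frame carries 1152 samples per channel, and the frame length in bytes is floor(144 * bitRate / sampleRate) plus an optional padding byte.

```javascript
// MPEG-1 Layer III: each frame carries 1152 samples per channel.
const SAMPLES_PER_FRAME = 1152;

// Frame length in bytes for MPEG-1 Layer III; `padding` is 0 or 1.
function frameLengthBytes(bitRate, sampleRate, padding = 0) {
  return Math.floor((144 * bitRate) / sampleRate) + padding;
}

// Assumed example values (not taken from the file discussed here).
const bitRate = 128000;   // bits per second
const sampleRate = 44100; // Hz

const unpadded = frameLengthBytes(bitRate, sampleRate, 0);
const padded = frameLengthBytes(bitRate, sampleRate, 1);

// Average bytes per frame, and how many frames fit in a 40000-byte range.
const avgFrameBytes = ((SAMPLES_PER_FRAME / 8) * bitRate) / sampleRate;
const framesPerRange = 40000 / avgFrameBytes;

console.log({ unpadded, padded, framesPerRange });
```

Under these assumptions a 40000-byte range does not hold a whole number of frames, so the number of complete frames per range, and with it samplesDecoded, varies from one request to the next.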

So my questions are:

I can probably solve the problem by taking the frameLength from the first call to decode and adjusting the byte ranges of all subsequent requests accordingly.

Does that make sense? Is there another way to solve this?

Greetings!

eshaz commented 1 year ago

An MP3 stream cannot be split at frame boundaries with each chunk then decoded properly on its own. There is a bit reservoir that persists across frame boundaries that prevents this. See https://stackoverflow.com/a/56139965

Problems will occur if you try to decode MP3 frames out of order, try to seek using byte ranges, or create a new instance each time you want to decode.

However, you can decode arbitrarily sized chunks in sequence using decode. mpg123-decoder will persist the decode state across invocations of decode provided you use the same instance.
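A minimal sketch of that pattern follows. The function `decodeInSequence` and the `FakeDecoder` class are hypothetical and exist only so the example runs without the library. `FakeDecoder` pretends that every 4 bytes form one frame of 10 samples and carries leftover bytes into the next call, which is a loose stand-in for the state a real decoder keeps. With mpg123-decoder, the decoder passed in would be a single `MPEGDecoder` instance created once and reused for every chunk.

```javascript
// Decode chunks strictly in order with ONE decoder instance, so that
// decoder state carries over between calls.
// `decoder` is any object whose decode(Uint8Array) returns
// { channelData, samplesDecoded, sampleRate }.
function decodeInSequence(decoder, chunks) {
  const results = [];
  let totalSamples = 0;
  for (const chunk of chunks) {
    const decoded = decoder.decode(chunk);
    totalSamples += decoded.samplesDecoded;
    results.push(decoded);
  }
  return { results, totalSamples };
}

// Stand-in decoder for illustration only (NOT the real library):
// 4 bytes = one "frame" of 10 samples; leftover bytes are kept for the next call.
class FakeDecoder {
  constructor() {
    this.carry = 0;
  }
  decode(chunk) {
    const available = this.carry + chunk.length;
    const frames = Math.floor(available / 4);
    this.carry = available % 4;
    const samplesDecoded = frames * 10;
    return {
      channelData: [new Float32Array(samplesDecoded)],
      samplesDecoded,
      sampleRate: 44100,
    };
  }
}

// Two 6-byte chunks: neither aligns with the 4-byte "frames".
const demo = decodeInSequence(new FakeDecoder(), [
  new Uint8Array(6),
  new Uint8Array(6),
]);
console.log(demo.results.map((r) => r.samplesDecoded), demo.totalSamples);
```

In this toy model the two chunks yield different sample counts, but no data is lost because the same instance sees both. Creating a fresh decoder per chunk would drop the carried-over bytes at every boundary.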

If you're interested in seeking, i.e. decoding anything out of sequence, you might be able to get away with creating a new instance each time you seek. There will be a short gap at the beginning of the audio though, since MP3 takes a few frames to populate the bit reservoir before outputting audio.

felixroos commented 1 year ago

Thanks for the quick response

"Problems will occur if you try to decode MP3 frames out of order, try to seek using byte ranges, or create a new instance each time you want to decode."

I am decoding the chunks in order, so it should theoretically work.

I've just created a minimal reproduction here, with the main logic here. It's just a simple Node server that handles the partial content requests, plus a client that requests all the chunks and strings them together. It kind of works, but there are glitches at the buffer transitions.

Is there something I am doing wrong?

eshaz commented 1 year ago

I looked through your code and it looks like it should work just fine. The glitches might be caused by a problem with the timing when scheduling the sources to play.

I use these decoders in another library I maintain, icecast-metadata-player, which enables streaming audio playback from an Icecast stream. This is where I schedule the source buffers to start playing. You might need to account for the AudioContext duration offset like I'm doing here: https://github.com/eshaz/icecast-metadata-js/blob/7c234e44f9a361b92c83203b9e03b4177ecf7a21/src/icecast-metadata-player/src/players/WebAudioPlayer.js#L286-L303
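As a rough illustration of the timing issue, here is a generic sketch of back-to-back scheduling with the Web Audio API. It is not a copy of the linked code, and the class name `BufferScheduler` and the `leadTime` parameter are assumptions made for this example. The idea is to start every source at an explicit time on the AudioContext clock, each one exactly where the previous buffer ends, instead of starting it whenever decoding happens to finish.

```javascript
// Sketch of gapless scheduling; `ctx` is an AudioContext (or a compatible object).
class BufferScheduler {
  constructor(ctx, leadTime = 0.1) {
    this.ctx = ctx;
    this.leadTime = leadTime; // seconds of headroom when (re)starting playback
    this.nextStart = 0;       // context time at which the next buffer begins
  }

  // Schedules `audioBuffer` right after the previously scheduled one
  // and returns the context time at which it will start.
  schedule(audioBuffer) {
    // Nothing scheduled yet, or playback fell behind: start slightly in the future.
    if (this.nextStart < this.ctx.currentTime) {
      this.nextStart = this.ctx.currentTime + this.leadTime;
    }
    const source = this.ctx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(this.ctx.destination);

    const startAt = this.nextStart;
    source.start(startAt);
    // The next buffer begins exactly where this one ends.
    this.nextStart += audioBuffer.duration;
    return startAt;
  }
}
```

In a browser this would be used as `const scheduler = new BufferScheduler(new AudioContext())`, calling `scheduler.schedule(audioBuffer)` for each decoded chunk in order.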

felixroos commented 1 year ago

It now works with the offset calculation :)

source code: https://github.com/felixroos/mp3-streamer/

demo: https://mp3-streamer.netlify.app/

thank you for the help!!!