gpac / mp4box.js

JavaScript version of GPAC's MP4Box tool
https://gpac.github.io/mp4box.js/
BSD 3-Clause "New" or "Revised" License

[question] MP4 audio to AudioBuffer #299


bengfarrell commented 1 year ago

Hi all, does anyone have any examples of reading (and maybe queueing) samples from MP4Box and creating an AudioBuffer from them? Currently I have something that creates EncodedAudioChunks and it works great, but I'm having trouble getting WebCodecs on the audio side to play nicely with AudioBuffers coming from elsewhere, so I'd like to avoid using WebCodecs for audio. Unfortunately I fell short trying to process these samples without making the EncodedAudioChunks (other than decoding them all with WebCodecs). Thanks!

hughfenghen commented 1 year ago

FYI

// `data` is assumed to be the ArrayBuffer of a complete audio file,
// `audioCtx` an existing AudioContext
const audioBuf = await audioCtx.decodeAudioData(data)
// audioBuf.length is the frame count per channel; allocate two planes for 'f32-planar'
const audioDataBuf = new Float32Array(audioBuf.length * 2)
const chan0Buf = audioBuf.getChannelData(0)
audioDataBuf.set(chan0Buf, 0)
if (audioBuf.numberOfChannels >= 2) {
  audioDataBuf.set(audioBuf.getChannelData(1), chan0Buf.length)
} else {
  // mono source: duplicate channel 0 into the second plane
  audioDataBuf.set(chan0Buf, chan0Buf.length)
}

const ad = new AudioData({
  numberOfChannels: 2,
  numberOfFrames: audioBuf.length,
  sampleRate: audioBuf.sampleRate,
  timestamp: 0,
  format: 'f32-planar',
  data: audioDataBuf
})
bengfarrell commented 1 year ago

Thanks so much @hughfenghen! I've tried variations of this with no luck. I've heard from other folks it's more complicated than just doing what you outlined.

So the data I'm getting back from mp4box.js is a Uint8Array. The decodeAudioData method you've called out expects an ArrayBuffer. Of course I can pass mydata.buffer, but then it complains:

Uncaught (in promise) DOMException: Failed to execute 'decodeAudioData' on 'BaseAudioContext': Unable to decode audio data

It works perfectly as an EncodedAudioChunk, which can be decoded with the WebCodecs API, but not with the Web Audio API.

Perhaps there's a better way to get data where I wouldn't have this issue, but I'm not aware of it. My usage of mp4box grabs the audio as samples from the file like so:

this.file = MP4Box.createFile();
this.file.onError = console.error.bind(console);
this.file.onReady = this.onReady.bind(this);
this.file.onSamples = this.onSamples.bind(this);

And then the onSamples callback passes the audio data that I can't use outside of the WebCodecs API.
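
Roughly, the rest of the wiring looks like this (a simplified sketch rather than my exact code, assuming a single audio track; the MP4 bytes are fed in separately via file.appendBuffer with fileStart set):

this.file.onReady = (info) => {
  const audioTrack = info.audioTracks[0]
  // ask mp4box.js to deliver this track's samples in batches of 100
  this.file.setExtractionOptions(audioTrack.id, null, { nbSamples: 100 })
  this.file.start()
}

this.file.onSamples = (trackId, user, samples) => {
  for (const sample of samples) {
    // sample.data is a Uint8Array holding one encoded (e.g. AAC) frame;
    // this is what currently gets wrapped for WebCodecs (Chrome only)
    const chunk = new EncodedAudioChunk({
      type: sample.is_sync ? 'key' : 'delta',
      timestamp: (1e6 * sample.cts) / sample.timescale,
      duration: (1e6 * sample.duration) / sample.timescale,
      data: sample.data
    })
    // ...decode `chunk` with AudioDecoder, or find a non-WebCodecs path (the question here)
  }
}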

hughfenghen commented 1 year ago

@bengfarrell The previous example converts an AudioBuffer to AudioData, which shows that the underlying data of the two is similar; you can also convert an AudioData back to an AudioBuffer.

If you want an AudioBuffer: EncodedAudioChunk -> AudioDecoder.decode -> AudioData -> AudioBuffer.
The reverse of this process converts an AudioBuffer back into an EncodedAudioChunk.
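
For reference, the first leg (EncodedAudioChunk -> AudioData) with WebCodecs looks roughly like this; the codec string and parameters here are assumptions for an AAC-LC track, the real values come from the mp4box track info and its decoder config:

const decoder = new AudioDecoder({
  output: (audioData) => {
    // each decoded AudioData arrives here
  },
  error: console.error
})
decoder.configure({
  codec: 'mp4a.40.2',    // AAC-LC (assumption; must match the track)
  sampleRate: 44100,     // from the track info
  numberOfChannels: 2,
  // description: Uint8Array with the AudioSpecificConfig from the esds box, if needed
})
decoder.decode(chunk)    // `chunk` is an EncodedAudioChunk built from sample.data
await decoder.flush()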

Convert AudioData to AudioBuffer:

  1. extract the underlying data (ArrayBuffer) from the AudioData
  2. write that ArrayBuffer into a newly created AudioBuffer

Step 1

export function extractAudioDataBuf (ad: AudioData): ArrayBuffer {
  const bufs: ArrayBuffer[] = []
  let totalSize = 0
  // copy out each channel plane separately
  for (let i = 0; i < ad.numberOfChannels; i++) {
    const chanBufSize = ad.allocationSize({ planeIndex: i })
    totalSize += chanBufSize
    const chanBuf = new ArrayBuffer(chanBufSize)
    ad.copyTo(chanBuf, { planeIndex: i })
    bufs.push(chanBuf)
  }

  // concatenate the planes into one contiguous buffer
  const rs = new Uint8Array(totalSize)
  let offset = 0
  for (const buf of bufs) {
    rs.set(new Uint8Array(buf), offset)
    offset += buf.byteLength
  }

  return rs.buffer
}

I haven't written the code for the second step, but I think it should be possible; you need to refer to AudioBuffer.copyToChannel. A rough sketch follows.
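
Something like this, untested, assuming the AudioData was 'f32-planar' (check ad.format) and reusing the buffer produced by extractAudioDataBuf above:

function bufToAudioBuffer (
  buf: ArrayBuffer,          // output of extractAudioDataBuf
  numberOfChannels: number,  // ad.numberOfChannels
  numberOfFrames: number,    // ad.numberOfFrames
  sampleRate: number,        // ad.sampleRate
  ctx: AudioContext
): AudioBuffer {
  const abuf = ctx.createBuffer(numberOfChannels, numberOfFrames, sampleRate)
  for (let i = 0; i < numberOfChannels; i++) {
    // each channel plane is numberOfFrames consecutive f32 samples
    const plane = new Float32Array(buf, i * numberOfFrames * 4, numberOfFrames)
    abuf.copyToChannel(plane, i)
  }
  return abuf
}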

bengfarrell commented 1 year ago

@hughfenghen Thanks again - but I think I'm not quite getting across what I'm looking for. I want to go from the onSamples listener of MP4Box directly to something that the Web Audio API can consume. If I understand you correctly, I already have a solution like the one you just mentioned in place: I'm going from MP4Box, getting samples, wrapping them in EncodedAudioChunks and decoding successfully from there.

The problem is that Safari doesn't seem to support the audio part of the WebCodecs API, so that is a Chrome-only solution.

So to sum up - I'd like to:

  1. read the audio samples from an MP4 file with mp4box.js,
  2. skip EncodedAudioChunk and the WebCodecs audio APIs (no Safari support), and
  3. end up with something the Web Audio API can play.

To rephrase this last step: mp4box gives me encoded audio data. Without using EncodedAudioChunk, how can I make this data playable, knowing that AudioContext.decodeAudioData doesn't seem to work with this data directly?

Apologies if I'm missing something, but I do feel like you haven't quite understood my problem yet. Thanks much for your help and patience!

hughfenghen commented 1 year ago

@bengfarrell Ok, I understand.

You need to decode the MP4 audio samples, but you can't use the WebCodecs API.
Apparently BaseAudioContext.decodeAudioData can't decode the samples either, because:

This method only works on complete file data, not fragments of audio file data.

I guess there are two ways to solve your problem:

  1. add a corresponding header or metadata to the audio sample.data, then use BaseAudioContext.decodeAudioData (see the sketch below),
  2. use a third-party library to decode sample.data (AAC, Opus, etc.) to PCM data, then write it into an AudioBuffer.
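
For option 1, if the track is AAC-LC (an assumption; the sampling-frequency index and channel configuration must match the track's decoder config, and browser support for decoding a raw ADTS stream via decodeAudioData varies), wrapping each sample in a 7-byte ADTS header would look roughly like this:

function addADTSHeader (aacFrame: Uint8Array, freqIdx: number, chanCfg: number): Uint8Array {
  const frameLen = aacFrame.byteLength + 7                // ADTS frame length includes the header
  const out = new Uint8Array(frameLen)
  out[0] = 0xff                                           // syncword
  out[1] = 0xf1                                           // syncword / MPEG-4 / layer 0 / no CRC
  out[2] = (0x01 << 6) | (freqIdx << 2) | (chanCfg >> 2)  // profile = AAC LC, sampling freq index
  out[3] = ((chanCfg & 0x03) << 6) | (frameLen >> 11)     // channel config, frame length (high bits)
  out[4] = (frameLen >> 3) & 0xff                         // frame length (middle bits)
  out[5] = ((frameLen & 0x07) << 5) | 0x1f                // frame length (low bits), buffer fullness
  out[6] = 0xfc                                           // buffer fullness, one raw data block
  out.set(aacFrame, 7)                                    // payload follows the header
  return out
}

The wrapped frames would then be concatenated into a single buffer and passed to decodeAudioData as if they were a plain .aac file (freqIdx 4 corresponds to 44100 Hz).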

I am not an expert in this area; the above is only my guess.

bengfarrell commented 1 year ago

@hughfenghen Thanks for trying! Yeah, that's basically where I ended up. Not enough of an expert on audio encoding/data to know what header data to add.