chcunningham / wc-talk

MIT License

Example on PCM data #8

Closed onthegit closed 2 years ago

onthegit commented 2 years ago

Hello, thank you for the examples regarding the WebCodecs APIs.

I am trying to construct an AudioData object from raw PCM data (retrieved from a WAV file). I have the following, parsed from the header of the WAV file:

audioFormat: 1
bitsPerSample: 16
blockAlign: 4
byteRate: 176400
cbSize: 0
chunkId: "fmt "
chunkSize: 16
dwChannelMask: 0
headBitRate: 0
headEmphasis: 0
headFlags: 0
headLayer: 0
headMode: 0
headModeExt: 0
numChannels: 2
ptsHigh: 0
ptsLow: 0
sampleRate: 44100

And the actual Linear PCM data in ArrayBuffer.

How do I construct the AudioData object from the ArrayBuffer so that it can be passed for encoding to the "opus" codec?

const chunk = new AudioData({
  data: data,
  timestamp: 0,
  format: 'f32', // not sure what the format is
  numberOfChannels: 2,
  numberOfFrames: 1024, // again this number is arbitrary, do not know what to put here
  sampleRate: 44100,
})

How do I turn the rest of the ArrayBuffer into AudioData objects? The data.byteLength is 10406468.

AudioEncoder config:

{
  codec: 'opus',
  numberOfChannels: 2,
  sampleRate: 44100
}

Thanks.

guest271314 commented 2 years ago

An example of resampling 44100 Hz, 2-channel WAV input to 48000 Hz Float32Arrays using AudioContext (see https://github.com/w3c/webcodecs/issues/378), encoding to WebCodecs 'opus', and playing back the EncodedAudioChunks with Media Source Extensions:

<!DOCTYPE html>
<html>
  <head></head>
  <body>
    <input type="file" />
    <script>
      document.querySelector('input[type=file]').onchange = async (e) => {
        const wav = await e.target.files[0].arrayBuffer();
        const ac = new AudioContext({
          sampleRate: 48000,
        });
        let timestamp = 0,
          array = [],
          chunks = [],
          config = {
            numberOfChannels: 2,
            sampleRate: 48000,
            codec: 'opus',
          };
        const ab = await ac.decodeAudioData(wav);
        const left = ab.getChannelData(0);
        const right = ab.getChannelData(1);
        console.log(left, right);
        for (let i = 0; i < left.length; i += 480) {
          const data = new Float32Array(480 * 2);
          data.set(left.subarray(i, i + 480), 0);
          data.set(right.subarray(i, i + 480), 480);
          const frame = new AudioData({
            timestamp,
            data,
            sampleRate: 48000,
            format: 'f32-planar',
            numberOfChannels: 2,
            numberOfFrames: 480,
          });
          timestamp += frame.duration;
          array.push(frame);
        }
        console.log(array);
        const encoder = new AudioEncoder({
          error(e) {
            console.log(e);
          },
          output: async (chunk, metadata) => {
            if (metadata.decoderConfig) {
              config.description = metadata.decoderConfig.description;
            }
            chunks.push(chunk);
          },
        });
        console.log(await AudioEncoder.isConfigSupported(config));
        encoder.configure(config);
        for (const audioData of array) {
          encoder.encode(audioData);
        }
        await encoder.flush();
        console.log(chunks);
        const audio = new Audio();
        audio.controls = true;
        const events = [
          'loadedmetadata',
          'loadeddata',
          'canplay',
          'canplaythrough',
          'play',
          'playing',
          'pause',
          'waiting',
          'progress',
          'seeking',
          'seeked',
          'ended',
          'stalled',
          'timeupdate',
        ];
        for (const event of events) {
          audio.addEventListener(event, async (e) => {
            if (e.type === 'ended') {
            }
            if (e.type === 'loadedmetadata') {
              console.log(e.type);
              await audio.play();
            }
          });
        }
        document.body.appendChild(audio);

        const ms = new MediaSource();
        ms.addEventListener('sourceopen', async (e) => {
          console.log(e.type, config);
          URL.revokeObjectURL(audio.src);
          const sourceBuffer = ms.addSourceBuffer({
            audioConfig: config,
          });
          console.log(ms.activeSourceBuffers);
          sourceBuffer.onupdate = (e) => console.log(e.type);
          sourceBuffer.mode = 'sequence';
          for (const chunk of chunks) {
            await sourceBuffer.appendEncodedChunks(chunk);
          }
        });
        audio.src = URL.createObjectURL(ms);
      };
    </script>
  </body>
</html>

https://plnkr.co/edit/e2yskooQbI2gomFq?open=lib%2Fscript.js

guest271314 commented 2 years ago

@onthegit FWIW, WebCodecs 'opus' configuration and EncodedAudioChunks encoded into a single file, decoded to WAV and/or played back via Media Source Extensions. From tests, the encoded file is smaller than the Opus-in-WebM file produced by Chromium's MediaRecorder implementation for the same duration of media (1 or 2 channels): https://github.com/guest271314/WebCodecsOpusRecorder.

onthegit commented 2 years ago

@guest271314 thanks. So you are saying that I have to first resample the 44100 to 48000 before creating AudioData?

guest271314 commented 2 years ago

That would make the process simpler. Read the Web Audio API linked issues in the WebCodecs issue.

guest271314 commented 2 years ago

@onthegit What are you trying to do with the AudioData objects?

onthegit commented 2 years ago

@guest271314 thanks for commenting. I will be encoding the AudioData to the Opus codec with the WebCodecs AudioEncoder. I do not use AudioContext because I will be encoding in a worker.

After doing more research on encoding, I managed to get this working, so I am closing the issue; if someone wants to add to the discussion or ask a question, they can reply to this thread.

There is no need for resampling: the Opus AudioEncoder buffers the audio data until a duration of 60000 is reached and resamples the data to 48000, because that is the output sample rate of the AudioEncoder for the Opus codec.
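For reference, AudioData reports duration in microseconds (numberOfFrames / sampleRate × 1e6), so the 60000 figure corresponds to 2646 frames at 44100 Hz. A quick check of that arithmetic:

```javascript
// AudioData.duration is expressed in microseconds.
const durationUs = (frames, rate) => Math.round((frames / rate) * 1e6);

console.log(durationUs(2646, 44100)); // 2646 frames at 44100 Hz -> 60000 µs
console.log(durationUs(1024, 44100)); // a 1024-frame chunk -> 23220 µs
```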

guest271314 commented 2 years ago

> There is no need of resampling, as the opus AudioEncoder buffers the audio data until 60000 duration is reached

I don't think that is the case.

I was able to get 44100 sample rate, 2-channel AudioData to play as EncodedAudioChunks only when the original AudioData duration is 60000. When the duration is 10000, the playback is clipped.

guest271314 commented 2 years ago
{
  codec: 'opus',
  numberOfChannels: 2,
  sampleRate: 44100
}

Note, when AudioEncoder configuration is set to 44100, with AudioData input duration set to 60000 the first second of playback using Media Source Extensions is clipped; and 1 second is trimmmed from end of playback; reduced quality. From this Opus does not support 44100 sample rate https://github.com/xiph/opus/issues/43.