ai / audio-recorder-polyfill

MediaRecorder polyfill to record audio in Edge and Safari
https://ai.github.io/audio-recorder-polyfill/
MIT License
582 stars 79 forks

How to properly concat WAV blobs? #7

Open dontfireme opened 6 years ago

dontfireme commented 6 years ago

I know, that there is a problem with WAV, due to:

WAV format contains the duration in the file header. As a result, with timeslice or requestData() calls, dataavailable will receive a separate file with a header on every call. In contrast, the native MediaRecorder sends a header only to the first dataavailable; subsequent events receive additional bytes of the same file.

But do you have any idea how to properly concat WAV blobs? I'm following the docs:

const chunks = [];
const mediaRecorder = new MediaRecorder(stream, { mimeType: "audio/wav" });

mediaRecorder.addEventListener("dataavailable", (e) => {
    chunks.push(e.data);
});

then somewhere:

mediaRecorder.stop();
const blob = new Blob(chunks);
const url = URL.createObjectURL(blob);
audio.src = url;

but the duration of the recorded audio is always 1 second. Is it possible to solve this problem somehow?

ai commented 6 years ago

@dontfireme oops, sorry. I missed your issue (had a really hard time). Don’t hesitate to ping the maintainer in future issues. Issues without an answer are a shame for the open source community (shame on me 😅).

There is no easy way, since the WAV header contains the file length (which is unknown when we generate a non-last blob).

But, on the other hand, it is not so complicated. In Audio Recorder Polyfill every “blob” is a full WAV file. So there are two ways:

ai commented 6 years ago

Here is a solution for Node.js (backend), using ffmpeg’s concat filter (plain `-i 1.wav -i 2.wav …` alone would only map the first input):

ffmpeg -i 1.wav -i 2.wav -i 3.wav -i 4.wav -filter_complex "[0:a][1:a][2:a][3:a]concat=n=4:v=0:a=1" output.wav

I think the simplest way in browser JS is to create multiple audio tags, or to not use blobs at all. Sorry, it is a limitation of the WAV format (as I mentioned above, a WAV file must have the file length in its header; this is why I can’t generate WAV blobs, only separate WAV files).

The best solution would be to add an OGG encoder. An OGG file doesn’t need a file length in the header, so we could generate blobs and simply do file = blob1 + blob2 + blob3 + ….
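With a streamable container like OGG, the concatenation described above would reduce to a single Blob constructor call. A minimal sketch, under the assumption (not yet true for this polyfill) that each dataavailable event carried a headerless continuation fragment:

```javascript
// Hypothetical: if each chunk were a continuation of one OGG stream,
// the chunks could be joined byte-for-byte into one playable file.
const chunks = [
  new Blob([new Uint8Array([1, 2, 3])]),
  new Blob([new Uint8Array([4, 5, 6])]),
];
const merged = new Blob(chunks, { type: 'audio/ogg' });
console.log(merged.size); // 6
```

This is exactly why the same trick fails for WAV: the first chunk's header would claim a length covering only itself.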

floydback commented 4 years ago

If you have to concat WAV blobs in the browser, you can decode each blob to an AudioBuffer with the decodeAudioData method, then concat the buffers; see audio-buffer-utils.
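The buffer-merging step of this approach can be sketched as plain typed-array concatenation. The decodeAudioData calls themselves need a browser AudioContext, so this sketch assumes `decoded` already holds the decoded AudioBuffers; `mergeChannel` is a hypothetical helper name:

```javascript
// Merge one channel of several decoded AudioBuffers into a single
// Float32Array. `decoded` would come from something like:
// Promise.all(chunks.map(b => b.arrayBuffer().then(ab => ctx.decodeAudioData(ab))))
function mergeChannel(decoded, channel) {
  // AudioBuffer.length is measured in sample frames.
  const total = decoded.reduce((sum, buf) => sum + buf.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const buf of decoded) {
    out.set(buf.getChannelData(channel), offset);
    offset += buf.length;
  }
  return out;
}
```

The merged Float32Array can then be copied into a new AudioBuffer of the combined length and played through an AudioBufferSourceNode.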

anrikun commented 4 years ago

The best solution would be to add an OGG encoder. An OGG file doesn’t need a file length in the header, so we could generate blobs and simply do file = blob1 + blob2 + blob3 + ….

Where to find this OGG encoder?

ai commented 4 years ago

@anrikun we do not have a built-in OGG encoder yet.

You can try the MP3 encoder (but I am not sure that you can simply concat the files). https://github.com/ai/audio-recorder-polyfill#mp3

guest271314 commented 3 years ago

You can slice off the first 44 bytes (the header) from each discrete WAV file, then concatenate the raw PCM into a single file. In brief, see https://github.com/guest271314/audioInputToWav and https://github.com/guest271314/AudioWorkletStream/blob/master/audioWorklet.js#L19.
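That approach can be sketched in a few lines: keep the first chunk's 44-byte header, append only the PCM payloads, and patch the two length fields so the header describes the merged data. A sketch assuming all chunks share the same sample rate, channel count, and bit depth; `concatWavBlobs` is a hypothetical helper name:

```javascript
// Concatenate complete WAV blobs into one WAV blob by reusing the first
// chunk's 44-byte RIFF header and stripping it from every chunk.
async function concatWavBlobs(chunks) {
  const header = new Uint8Array(await chunks[0].slice(0, 44).arrayBuffer());
  const bodies = await Promise.all(
    chunks.map((blob) => blob.slice(44).arrayBuffer())
  );
  const dataLength = bodies.reduce((sum, ab) => sum + ab.byteLength, 0);
  // Patch ChunkSize (offset 4) and Subchunk2Size (offset 40), both
  // little-endian uint32, so they cover the merged PCM payload.
  const view = new DataView(header.buffer);
  view.setUint32(4, 36 + dataLength, true);
  view.setUint32(40, dataLength, true);
  return new Blob([header, ...bodies], { type: 'audio/wav' });
}
```

Without the two setUint32 patches, the merged file would still claim the length of the first chunk only, which is exactly the "1 second duration" symptom reported above.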

guest271314 commented 3 years ago

To play the audio through an HTMLMediaElement without creating a discrete WAV file, you can use

      function int16ToFloat32(inputArray) {
        const output = new Float32Array(inputArray.length);
        for (let i = 0; i < output.length; i++) {
          const int = inputArray[i];
          // If the high bit is on, then it is a negative number, and actually counts backwards.
          const float =
            int >= 0x8000 ? -(0x10000 - int) / 0x8000 : int / 0x7fff;
          output[i] = float;
        }
        return output;
      }

      const mediaElement = document.querySelector('audio');

      ;(async (chunks) => {
        const ac = new AudioContext({
          sampleRate: 22050,
          latencyHint: 1,
        });

        const uint16 = new Uint16Array(
          await new Blob(
            await Promise.all(
              chunks.map((file) => file.slice(44).arrayBuffer())
            )
          ).arrayBuffer()
        );

        const floats = int16ToFloat32(uint16);

        const buffer = new AudioBuffer({
          numberOfChannels: 1,
          // length is measured in sample frames, not bytes
          length: floats.length,
          sampleRate: ac.sampleRate,
        });

        buffer.getChannelData(0).set(floats);

        const absn = new AudioBufferSourceNode(ac, { buffer });
        const msd = new MediaStreamAudioDestinationNode(ac);
        const { stream: mediaStream } = msd;
        const [track] = mediaStream.getAudioTracks();
        absn.connect(msd);
        absn.onended = e => {
          mediaElement.pause();
        }
        mediaElement.oncanplay = async e => {
          mediaElement.oncanplay = null;
          console.log(e);
          await mediaElement.play();
          await new Promise(resolve => setTimeout(resolve, 250));
          absn.start();
        }
        mediaElement.srcObject = mediaStream;

    })(chunks);

To create a discrete WAV file

      function writeString(s, a, offset) {
        for (let i = 0; i < s.length; ++i) {
          a[offset + i] = s.charCodeAt(i);
        }
      }

      function writeInt16(n, a, offset) {
        n = Math.floor(n);

        let b1 = n & 255;
        let b2 = (n >> 8) & 255;

        a[offset + 0] = b1;
        a[offset + 1] = b2;
      }

      function writeInt32(n, a, offset) {
        n = Math.floor(n);
        let b1 = n & 255;
        let b2 = (n >> 8) & 255;
        let b3 = (n >> 16) & 255;
        let b4 = (n >> 24) & 255;

        a[offset + 0] = b1;
        a[offset + 1] = b2;
        a[offset + 2] = b3;
        a[offset + 3] = b4;
      }

      // Return the bits of the float as a 32-bit integer value.  This
      // produces the raw bits; no intepretation of the value is done.
      function floatBits(f) {
        let buf = new ArrayBuffer(4);
        new Float32Array(buf)[0] = f;
        let bits = new Uint32Array(buf)[0];
        // Return as a signed integer.
        return bits | 0;
      }

      function writeAudioBuffer(audioBuffer, a, offset, asFloat) {
        let n = audioBuffer.length;
        // let n = audioBuffer.reduce((a, b) => a + b.length, 0);
        let channels = audioBuffer.numberOfChannels;
        // let channels = audioBuffer.length;

        for (let i = 0; i < n; ++i) {
          for (let k = 0; k < channels; ++k) {
            let buffer = audioBuffer.getChannelData(k);
            // let buffer = audioBuffer[k];
            if (asFloat) {
              let sample = floatBits(buffer[i]);
              writeInt32(sample, a, offset);
              offset += 4;
            } else {
              let sample = buffer[i] * 32768.0;

              // Clip samples to the limitations of 16-bit.
              // If we don't do this then we'll get nasty wrap-around distortion.
              if (sample < -32768) sample = -32768;
              if (sample > 32767) sample = 32767;

              writeInt16(sample, a, offset);
              offset += 2;
            }
          }
        }
      }

      // See http://soundfile.sapp.org/doc/WaveFormat/ and
      // http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
      // for a quick introduction to the WAVE PCM format.
      function createWaveFileData(audioBuffer, asFloat) {
        let bytesPerSample = asFloat ? 4 : 2;
        let frameLength = audioBuffer.length; // audioBuffer[0].length
        let numberOfChannels = audioBuffer.numberOfChannels; // audioBuffer.length
        let sampleRate = audioBuffer.sampleRate; // ac.sampleRate; sampleRate
        let bitsPerSample = 8 * bytesPerSample;
        let byteRate = (sampleRate * numberOfChannels * bitsPerSample) / 8;
        let blockAlign = (numberOfChannels * bitsPerSample) / 8;
        let wavDataByteLength = frameLength * numberOfChannels * bytesPerSample;
        let headerByteLength = 44;
        let totalLength = headerByteLength + wavDataByteLength;

        let waveFileData = new Uint8Array(totalLength);

        let subChunk1Size = 16; // for linear PCM
        let subChunk2Size = wavDataByteLength;
        let chunkSize = 4 + (8 + subChunk1Size) + (8 + subChunk2Size);

        writeString('RIFF', waveFileData, 0);
        writeInt32(chunkSize, waveFileData, 4);
        writeString('WAVE', waveFileData, 8);
        writeString('fmt ', waveFileData, 12);

        writeInt32(subChunk1Size, waveFileData, 16); // SubChunk1Size (4)
        // The format tag value is 1 for integer PCM data and 3 for IEEE
        // float data.
        writeInt16(asFloat ? 3 : 1, waveFileData, 20); // AudioFormat (2)
        writeInt16(numberOfChannels, waveFileData, 22); // NumChannels (2)
        writeInt32(sampleRate, waveFileData, 24); // SampleRate (4)
        writeInt32(byteRate, waveFileData, 28); // ByteRate (4)
        writeInt16(blockAlign, waveFileData, 32); // BlockAlign (2)
        writeInt16(bitsPerSample, waveFileData, 34); // BitsPerSample (2)

        writeString('data', waveFileData, 36);
        writeInt32(subChunk2Size, waveFileData, 40); // SubChunk2Size (4)

        // Write actual audio data starting at offset 44.
        writeAudioBuffer(audioBuffer, waveFileData, 44, asFloat);

        return waveFileData;
      }

      function int16ToFloat32(inputArray) {
        const output = new Float32Array(inputArray.length);
        for (let i = 0; i < output.length; i++) {
          const int = inputArray[i];
          // If the high bit is on, then it is a negative number, and actually counts backwards.
          const float =
            int >= 0x8000 ? -(0x10000 - int) / 0x8000 : int / 0x7fff;
          output[i] = float;
        }
        return output;
      }

      const mediaElement = document.querySelector('audio');

      ;(async (chunks) => {
        const ac = new AudioContext({
          sampleRate: 22050,
          latencyHint: 1,
        });

        const uint16 = new Uint16Array(
          await new Blob(
            await Promise.all(
              chunks.map((file) => file.slice(44).arrayBuffer())
            )
          ).arrayBuffer()
        );
        const floats = int16ToFloat32(uint16);

        const buffer = new AudioBuffer({
          numberOfChannels: 1,
          // length is measured in sample frames, not bytes
          length: floats.length,
          sampleRate: ac.sampleRate,
        });

        buffer.getChannelData(0).set(floats);

        let wavData = createWaveFileData(buffer, false);

        let blob = new Blob([wavData], { type: 'audio/wav' });

        mediaElement.src = URL.createObjectURL(blob);

        await ac.close();
      })(chunks);

guest271314 commented 3 years ago

See also merging / layering multiple ArrayBuffers into one AudioBuffer using Web Audio API

NoBodyButMe commented 1 year ago

I often receive this error "Uncaught (in promise) RangeError: byte length of Uint16Array should be a multiple of 2 "

NoBodyButMe commented 1 year ago

Safari throws this error: Unhandled Promise Rejection: RangeError: ArrayBuffer length minus the byteOffset is not a multiple of the element size

guest271314 commented 1 year ago

I often receive this error "Uncaught (in promise) RangeError: byte length of Uint16Array should be a multiple of 2 "

Using what code?

NoBodyButMe commented 1 year ago

The code provided to convert a blob into a WAV.

guest271314 commented 1 year ago

Try the code in this comment https://github.com/ai/audio-recorder-polyfill/issues/7#issuecomment-744066897. This is what I use to encode to WAV https://github.com/guest271314/WebCodecsOpusRecorder/blob/6b661559806751f21f5ea57bffa0e77076a37286/WebCodecsOpusRecorder.js#L256-L341.

NoBodyButMe commented 1 year ago

Thanks... but I don't think it's the easiest thing to replace the code with other code :-)

It seems that this line of code is the problem:

const uint16 = new Uint16Array(
  await new Blob(
    await Promise.all(audiochunks.map((file) => file.slice(44).arrayBuffer()))
  ).arrayBuffer()
);

Sometimes (not always) I get this error: "byte length of Uint16Array should be a multiple of 2 "

guest271314 commented 1 year ago

Thanks... but I don't think it's the easiest thing to replace the code with other code :-)

Sure it is. However, if you decide to modify the code in this repository, you'll need to make sure the byte length backing the resulting Uint16Array is a multiple of 2.

There are a few ways to do that.

It looks like from the code that multiple WAV files are created, then the header is removed from each before merging the channel data into a single WAV.

I don't think you need to create multiple WAV files up front.

One way to achieve the requirement is to fill an Array until its length is 441 * 4, set that array into an Int8Array, view the Int8Array's buffer as an Int16Array, then for each channel create a Float32Array of length 441 and parse the raw PCM into floats. Then prepend the WAV header.
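A simpler way to avoid the "multiple of 2" RangeError is to pad the concatenated buffer with one zero byte when its length is odd, before viewing it as 16-bit samples. A sketch; `toUint16Samples` is a hypothetical helper name:

```javascript
// Pad an ArrayBuffer to an even byte length so it can safely be viewed
// as a Uint16Array (16-bit PCM samples occupy 2 bytes each).
function toUint16Samples(arrayBuffer) {
  if (arrayBuffer.byteLength % 2 !== 0) {
    const padded = new Uint8Array(arrayBuffer.byteLength + 1); // zero-filled
    padded.set(new Uint8Array(arrayBuffer));
    arrayBuffer = padded.buffer;
  }
  return new Uint16Array(arrayBuffer);
}
```

The trailing zero byte becomes half of one extra (near-silent) sample, which is inaudible in practice.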