Closed rikschennink closed 1 year ago
Thank you so much! I have a Ko-fi! https://ko-fi.com/vanilagy
I'll look into your issue later, I'm quite busy right now.
Perfectly fine, take your time 😊 Just sent a coffee shipment 📦
Thank you so much for your generosity!
Alright so, I understand your use case - sounds like you want to build some sort of video cropper / editor. I think a good solution for the audio is using AudioContext.prototype.decodeAudioData on the raw video bytes to extract the audio (that works!). I wrote up some code to test it locally:
let audioContext = new AudioContext();
let response = await fetch('test.mp4'); // Get your video from wherever
let videoArrayBuffer = await response.arrayBuffer();
let audioBuffer = await audioContext.decodeAudioData(videoArrayBuffer);
// This was so I could hear the sound working
let node = audioContext.createBufferSource();
node.buffer = audioBuffer;
node.connect(audioContext.destination);
node.start();
// Define your "trimming extents"
let startTime = 3;
let endTime = 5.5;
let duration = endTime - startTime;
// You should ensure that startTime * sampleRate and endTime * sampleRate are integers, like:
// startTime = Math.round(startTime * sampleRate) / sampleRate
// Now we want to create an array packed with all of the data
let data = new Float32Array(duration * audioBuffer.sampleRate * audioBuffer.numberOfChannels);
for (let i = 0; i < audioBuffer.numberOfChannels; i++) {
data.set(
audioBuffer.getChannelData(i).subarray(startTime * audioBuffer.sampleRate, endTime * audioBuffer.sampleRate),
i * duration * audioBuffer.sampleRate
);
}
let audioData = new AudioData({
format: 'f32-planar',
sampleRate: audioBuffer.sampleRate,
numberOfChannels: audioBuffer.numberOfChannels,
numberOfFrames: data.length,
timestamp: 0,
data
});
// Pass audioData into audioEncoder.encode
See if this gets you somewhere! Gotta see if you get a "clicking" sound when the video ends or not (that usually happens when you randomly cut off an audio buffer). Although there's a good chance the compression codec will smooth that out. If it doesn't, you can try fading in the audio in the first 0.01 seconds or so (and fading out again) by multiplying the numbers in the Float32Array.
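If the click does show up, the fade could be a small helper applied to each channel's slice before packing the planar array. A sketch (the function name and fade length are placeholders, not from this thread):

```javascript
// Linear fade-in/out on one channel's samples to soften hard cuts.
// fadeSeconds ≈ 0.01 is short enough to be inaudible but removes clicks.
function applyFades(samples, sampleRate, fadeSeconds = 0.01) {
  let fadeFrames = Math.min(
    Math.floor(fadeSeconds * sampleRate),
    Math.floor(samples.length / 2)
  );
  for (let i = 0; i < fadeFrames; i++) {
    let gain = i / fadeFrames;
    samples[i] *= gain;                       // ramp up at the start
    samples[samples.length - 1 - i] *= gain; // ramp down at the end
  }
  return samples;
}
```

You'd call this on the result of getChannelData(i).subarray(...) for each channel before the data.set call. Note that subarray returns a view, so this mutates the decoded buffer in place, which is fine for one-shot encoding.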
Thanks! The audio is playing, so the first part is working. My test mp4 has 6 channels, and a sampleRate of 48000.
I copy-pasted your code and, for testing, set the start and end to 0 and the total video duration.
TypeError: Failed to construct 'AudioData': data is too small: needs 870580224 bytes, received 145096704.
at http://localhost:3000/pintura-video/pinturavideo.js:9567:27
at async process (http://localhost:3000/pintura/module/pintura.js:7278:21)
at async http://localhost:3000/pintura/module/pintura.js:7489:33
As 870580224 is 6 times 145096704, I suspect the data.set is somehow not adding enough data?
Aah no, I think I set numberOfFrames wrong. I think it's frames per channel. So it should be duration * sampleRate.
Alright, yeah that works! Alright, back to it, will let you know if it works :)
I'm very much out of my comfort zone with this 😅
const audioData = new AudioData({
format: 'f32-planar',
sampleRate: sampleRate,
numberOfChannels: numberOfChannels,
numberOfFrames: totalDuration * sampleRate,
timestamp: 0,
data,
});
audioEncoder = new AudioEncoder({
output: (chunk, meta) => muxer.addAudioChunk(chunk, meta),
error: (err) => {
throw err;
},
});
audioEncoder.configure({
codec: 'opus',
numberOfChannels: 2,
sampleRate: sampleRate
});
audioEncoder.encode(audioData);
It throws: "Input audio buffer is incompatible with codec parameters"
I had to set number of channels to 2 on the encoder, it didn't accept 6.
This is the muxer:
const muxer = new Muxer({
target: new ArrayBufferTarget(),
video: {
codec: 'V_VP9',
width,
height,
frameRate,
},
audio: {
codec: 'A_OPUS',
sampleRate: sampleRate,
numberOfChannels: 2
},
firstTimestampBehavior: 'offset',
});
Well, your audio data has 6 channels tho! So you'll probably have to reduce it to stereo, picking two channels that you wanna keep. Is keeping the first 2 channels sufficient? (Should be front left and front right)
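Picking the first two channels could look roughly like this (a sketch; takeStereo and the frame arguments are hypothetical names, not an API):

```javascript
// Keep only the first two channels (usually front left/right) of a
// decoded AudioBuffer, packed planar for an 'f32-planar' AudioData.
function takeStereo(audioBuffer, startFrame, endFrame) {
  let frames = endFrame - startFrame;
  let data = new Float32Array(frames * 2);
  for (let ch = 0; ch < 2; ch++) {
    // Copy this channel's slice into its planar position
    data.set(
      audioBuffer.getChannelData(ch).subarray(startFrame, endFrame),
      ch * frames
    );
  }
  return data;
}
```

You'd then construct the AudioData with numberOfChannels: 2 and numberOfFrames equal to frames. (A proper downmix would average the channels with weights instead, but front left/right is often good enough.)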
@Vanilagy Ooh I was expecting it to know how to limit that? Like merge all the channels.
Trying with 2 channels now :)
Truly appreciate your help.
It works 🎉 This is awesome, thank you so much.
Awesome!! Thanks for the coffee shipment :)
Love this! I'm still new to video processing so I'm not sure if this is possible.
My goal is to apply filters, trim, and draw on top of a video.
I have a <video> element as source (that has an audio track). By updating the currentTime and listening to "seeked", I've successfully managed to record video frames for a section of a given video (for example timestamp 2000 to 3500). This works perfectly and is a lot faster than using the MediaRecorder. Now I also want to add the correct section of the AudioTrack, and that's where I'm kind of lost.
I've tried to use the method in this issue and in the canvas drawing demo, but it doesn't seem to work. The WritableStream write function gets called, but the chunks in the AudioEncoder output have a byteLength of only 3, which seems incorrect. If you could give me a pointer in the right direction, that would be amazing.
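In case it helps, this is roughly how I'm mapping the millisecond range (2000 to 3500) onto sample frames; a sketch with my own helper name:

```javascript
// Convert a millisecond trim range into whole sample-frame indices,
// so the per-channel subarray copies line up exactly.
function msRangeToFrames(startMs, endMs, sampleRate) {
  let startFrame = Math.round((startMs / 1000) * sampleRate);
  let endFrame = Math.round((endMs / 1000) * sampleRate);
  return { startFrame, endFrame, frames: endFrame - startFrame };
}
```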
Also, happy to support this project, so if you have a donation link, please let me know. 🙏