Volcomix / virtual-background

Demo on adding virtual background to a live video stream in the browser
https://volcomix.github.io/virtual-background
Apache License 2.0
474 stars, 122 forks

Reduce jitter #63

Open benbro opened 10 months ago

benbro commented 10 months ago

There is jitter when using segmentation with a video or camera. Select one of the three videos in the demo and notice that the mask isn't stable.

Is there a way to reduce the jitter? Maybe by averaging the mask over consecutive frames? I've found two issues discussing this in mediapipe.
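The averaging idea above could be sketched as an exponential moving average over per-pixel mask values. This is only an illustration and not code from the demo: `smoothMask` and its `alpha` weight are hypothetical names, and the mask is assumed to be a flat array of foreground probabilities in [0, 1].

```typescript
// Hypothetical helper (not part of the demo): blends the current mask into
// the previous smoothed mask. `alpha` is an assumed weight in (0, 1];
// alpha = 1 disables smoothing, smaller values smooth more but lag more.
function smoothMask(
  previous: Float32Array | null,
  current: Float32Array,
  alpha: number = 0.5,
): Float32Array {
  // First frame (or a resolution change): nothing to average with yet.
  if (previous === null || previous.length !== current.length) {
    return new Float32Array(current);
  }
  const out = new Float32Array(current.length);
  for (let i = 0; i < current.length; i++) {
    out[i] = alpha * current[i] + (1 - alpha) * previous[i];
  }
  return out;
}
```

Feeding each result back in as `previous` for the next frame gives the running average. The trade-off is that a lower `alpha` reduces flicker but makes the mask trail behind fast motion.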

When I upload the Dance - 32938.mp4 video from the demo to the selfie_segmentation demo it looks much better.

Is there a way to improve the segmentation or should we replace the segmentation step with the mediapipe solution?

edamlmmv commented 7 months ago

The multi-segmenter model seems to be more stable than MLKit or MEET. (See: https://codepen.io/edamlmmv/pen/PoLYmGG)

You can see an example in this codepen demo, where I try to use a 15x15 kernel to apply Gaussian blur: https://codepen.io/edamlmmv/details/KKEdJod (I broke it, oops, gonna check that later). It works, but the effect is not as good as Volcomix's blur.

I've tried implementing backgroundBlurStage in my demo; here is my attempt: https://codepen.io/edamlmmv/pen/KKEdbbY There are some issues. Firstly, I don't apply a vertical pass and a horizontal pass like Volcomix does. I've tried, but I wasn't successful. I've also tried to implement multiple blur passes, but I can't seem to make it work. Finally, unlike Volcomix's, my mask doesn't seem to have an opacity value (maskTex.a). Instead, there's one RGB color for the segmented person and another color for the background.
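For reference, the horizontal-then-vertical idea mentioned above can be sketched on the CPU. This is a minimal illustration of a separable Gaussian blur on a single-channel image, not the shader code from the demo. The function names, the clamp-to-edge behavior, and the `sigma` default are assumptions; radius 7 corresponds to the 15x15 kernel mentioned earlier.

```typescript
// Builds a normalized 1D Gaussian kernel with 2 * radius + 1 taps.
function gaussianKernel(radius: number, sigma: number): number[] {
  const weights: number[] = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const w = Math.exp(-(i * i) / (2 * sigma * sigma));
    weights.push(w);
    sum += w;
  }
  return weights.map((w) => w / sum);
}

// One 1D pass along the direction (dx, dy); samples are clamped to the edge.
function blurPass(
  src: Float32Array,
  width: number,
  height: number,
  kernel: number[],
  dx: number,
  dy: number,
): Float32Array {
  const radius = (kernel.length - 1) / 2;
  const out = new Float32Array(src.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let acc = 0;
      for (let k = -radius; k <= radius; k++) {
        const sx = Math.min(width - 1, Math.max(0, x + k * dx));
        const sy = Math.min(height - 1, Math.max(0, y + k * dy));
        acc += kernel[k + radius] * src[sy * width + sx];
      }
      out[y * width + x] = acc;
    }
  }
  return out;
}

// Horizontal pass first, then a vertical pass over its result.
function separableBlur(
  src: Float32Array,
  width: number,
  height: number,
  radius: number = 7,
  sigma: number = 3,
): Float32Array {
  const kernel = gaussianKernel(radius, sigma);
  const horizontal = blurPass(src, width, height, kernel, 1, 0);
  return blurPass(horizontal, width, height, kernel, 0, 1);
}
```

In a WebGL version, each pass would typically be its own draw call, with the horizontal result rendered into an intermediate texture that the vertical pass then samples. Two 15-tap passes cost 30 samples per pixel instead of 225 for a full 15x15 kernel.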

My understanding of WebGL is limited, and I was wondering if you could give me some pointers on how to advance my implementation.

I've tried using the model in Volcomix's demo, but the buffer overflows and I cannot build the TFLite tool. Also, Google's ImageSegmenter now uses WebGPU, and it seems on par with or faster than using our own WASM functions.

edamlmmv commented 7 months ago

Correction: it's almost working. Only the segmentation mask is missing; everything is blurred now. However, the segmentation mask seems to include all the pixels, both foreground and background, with the background colored red and the foreground black. This is an issue because I would expect there to be no pixels where the red pixels are.

https://codepen.io/edamlmmv/pen/xxBKqma
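One possible way to work with a mask like the one described (red background, black foreground, no usable alpha) is to derive an opacity value from the red channel. The sketch below assumes exactly that color layout and RGBA byte data; `foregroundAlphaFromRedMask` is a hypothetical helper, not something from either codepen or the demo.

```typescript
// Assumption: background pixels are pure red (r = 255) and foreground pixels
// are black (r = 0), as described above. Returns one alpha byte per pixel,
// where 255 means fully foreground and 0 means fully background.
function foregroundAlphaFromRedMask(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const alpha = new Uint8ClampedArray(rgba.length / 4);
  for (let p = 0; p < alpha.length; p++) {
    alpha[p] = 255 - rgba[p * 4]; // invert the red channel
  }
  return alpha;
}
```

In a fragment shader, the equivalent would presumably be to read something like `1.0 - maskTex.r` wherever the original code reads `maskTex.a`.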

To bind the mask use: gl.activeTexture(gl.TEXTURE1); gl.bindTexture(gl.TEXTURE_2D, mask_to_bind);

I think it's because I'm missing the loadSegmentationStage.
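The binding calls above are usually paired with pointing the shader's sampler uniform at the same texture unit. A minimal sketch, assuming the program is already active via `gl.useProgram` and that the sampler is called `u_mask` (a hypothetical name); the small `GlLike` interface only lists the calls used here so the snippet is self-contained, and a real `WebGLRenderingContext` provides the same methods.

```typescript
// Minimal subset of the WebGL API used below (declared locally for the sketch).
interface GlLike {
  readonly TEXTURE0: number;
  readonly TEXTURE_2D: number;
  activeTexture(unit: number): void;
  bindTexture(target: number, texture: unknown): void;
  getUniformLocation(program: unknown, name: string): unknown;
  uniform1i(location: unknown, value: number): void;
}

// Binds `mask` to texture unit `unit` and tells the sampler uniform to read
// from that unit. "u_mask" is an assumed uniform name, not one from the demo.
function bindMaskTexture(
  gl: GlLike,
  program: unknown,
  mask: unknown,
  unit: number = 1,
  uniformName: string = "u_mask",
): void {
  gl.activeTexture(gl.TEXTURE0 + unit); // TEXTURE0 + 1 is TEXTURE1
  gl.bindTexture(gl.TEXTURE_2D, mask);
  gl.uniform1i(gl.getUniformLocation(program, uniformName), unit);
}
```

If the sampler uniform is never set, it defaults to unit 0, so the shader would read whatever is bound there (typically the video frame) instead of the mask.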