Thank you for the excellent repo! Your post-processing pipeline is spectacular!
I've also been interested in using the MediaPipe selfie segmentation model. I forked this project and tried the following:

- Set SegmentationConfig.inputResolution as well as InputResolution (rough sketch below) -- I tried both 144x256 (as noted in the docs) and 256x144 just to be sure.

In all cases, the output video feed is blurry.
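Roughly, the configuration change looked like this (a simplified sketch; the exact type names and the model name in my fork differ slightly):

```ts
// Simplified sketch of the config change (type and model names approximate).
type InputResolution = '144x256' | '256x144' | '256x256' | '96x160'

interface SegmentationConfig {
  model: string
  inputResolution: InputResolution
}

const segmentationConfig: SegmentationConfig = {
  model: 'mediapipe-landscape', // placeholder name for the new model
  inputResolution: '256x144',   // tried both '144x256' (per the docs) and '256x144'
}
```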
Since the output image was completely blurred, I wanted to check that the generated segmentation mask contained non-zero values. After inspecting tflite.HEAPF32 just before it gets passed to gl.texSubImage2D in loadSegmentationStage::render, I was able to confirm that the provided segmentation mask does include non-zero values.
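The check itself was simple, something like this (a sketch; it assumes the output offset helper exposed by this repo's TFLite bindings and a 256x144 single-channel output):

```ts
// Sketch: count non-zero values in the segmentation output before the
// gl.texSubImage2D upload. Assumes tflite._getOutputMemoryOffset() as exposed
// in this repo's TFLite typings, and a 256x144 single-channel float mask.
const outputOffset = tflite._getOutputMemoryOffset() / 4
const maskLength = 256 * 144
let nonZeroCount = 0
for (let i = 0; i < maskLength; i++) {
  if (tflite.HEAPF32[outputOffset + i] !== 0) {
    nonZeroCount++
  }
}
console.log(`non-zero mask values: ${nonZeroCount} of ${maskLength}`)
```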
With this in mind, I'm uncertain what else needs to be done to get this new model working correctly within this repo. Any input is appreciated.
Thank you!
Maybe we shouldn't resize the input when using this model? The selfie_segmentation docs say:
Segmentation automatically resizes the input image to the desired tensor dimension before feeding it into the ML models.
Can you please share your code?
Unfortunately, using the camera's native resolution as inputResolution
(in my case 640x480) is not working.
Here's the code (note it's in the media-pipe branch of my fork). My fork strips out React and all non-background-blur code. The files you'll be interested in are:

- TFLite.ts: Imports the MediaPipe landscape model (see the sketch below)
- blur.ts: Sets inputResolution
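The TFLite.ts change is essentially just pointing the existing loader at the landscape model file, roughly like this (a sketch; the path and helper names follow what the repo already uses, simplified):

```ts
// Sketch: fetch the MediaPipe landscape model and hand it to the TFLite WASM
// module, reusing the same helpers the repo already uses for the Meet models.
const modelResponse = await fetch(
  'models/selfie_segmentation_landscape.tflite' // path used in my fork, may differ
)
const model = await modelResponse.arrayBuffer()
const modelBufferOffset = tflite._getModelBufferMemoryOffset()
tflite.HEAPU8.set(new Uint8Array(model), modelBufferOffset)
tflite._loadModel(model.byteLength)
```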
I also tried replacing part of the pipeline with part of MediaPipe's sample segmentation app, just so I could get the segmentation mask directly from their code. I then fed the generated mask into the pipeline in this repo, but I didn't have any luck there either.
Hi @jpodwys, thank you for experimenting with this model. Just an intuition, without having investigated the model file: have you tried replacing this line so that it calls buildLoadSegmentationStage instead of buildSoftmaxStage? I'm wondering if the softmax is already part of the model.
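In other words, something along these lines (just a sketch, assuming both stage builders take the same arguments and using a placeholder model name):

```ts
// Sketch: pick the stage builder based on the model. If the MediaPipe model
// already applies its activation and outputs a single-channel mask, the extra
// softmax stage should be unnecessary; the Meet model still needs it.
const buildSegmentationStage =
  segmentationConfig.model === 'mediapipe-landscape' // placeholder model name
    ? buildLoadSegmentationStage
    : buildSoftmaxStage
```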
Thank you for the suggestion, you were right!

So the following works:

- Set inputResolution to 256x144 (not 144x256 as noted in the docs)
- Replace webgl2Pipeline's call to buildSoftmaxStage with buildLoadSegmentationStage
The face of a happy dev :)
Hi @jpodwys, did you check how the landscape model's performance differs from ML Kit's?
My fork uses the landscape MediaPipe model so you can compare this repo's live demo to my fork's live demo.
I also made a live demo that skips the post-processing pipeline and blurs directly via canvas. It has better performance (fps) but less impressive post-processing (there's a halo effect around humans). That said, the canvas-only blur implementation looks surprisingly good in this demo because, as Volcomix pointed out, the MediaPipe team built softmax directly into their model.
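For reference, the canvas-only approach is roughly the following (a sketch of the idea, not the demo's exact code; it assumes the segmentation mask is available as a CanvasImageSource with the person opaque):

```ts
// Sketch: composite a blurred background behind the person cut out via the mask.
function renderFrame(
  ctx: CanvasRenderingContext2D,
  video: HTMLVideoElement,
  mask: CanvasImageSource
) {
  const { width, height } = ctx.canvas

  ctx.save()

  // Keep only the person: draw the mask, then draw the video where the mask is opaque.
  ctx.filter = 'none'
  ctx.globalCompositeOperation = 'copy'
  ctx.drawImage(mask, 0, 0, width, height)
  ctx.globalCompositeOperation = 'source-in'
  ctx.drawImage(video, 0, 0, width, height)

  // Draw the blurred frame behind the person. The hard mask edge is what
  // produces the halo effect mentioned above.
  ctx.globalCompositeOperation = 'destination-over'
  ctx.filter = 'blur(8px)'
  ctx.drawImage(video, 0, 0, width, height)

  ctx.restore()
}
```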
Google released a new landscape model for selfie segmentation: https://google.github.io/mediapipe/solutions/selfie_segmentation https://github.com/google/mediapipe/tree/master/mediapipe/modules/selfie_segmentation
How does this compare to the ML Kit model in this project? Does selfie_segmentation.js use WebAssembly+SIMD or WebGL2? Should we use @mediapipe/selfie_segmentation/selfie_segmentation.js or is the model in this project more performant?
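For context, using the packaged JS solution would look roughly like this (a sketch based on the @mediapipe/selfie_segmentation README; modelSelection: 1 picks the landscape model):

```ts
import { SelfieSegmentation } from '@mediapipe/selfie_segmentation'

// Sketch based on the published @mediapipe/selfie_segmentation API.
const selfieSegmentation = new SelfieSegmentation({
  locateFile: (file) =>
    `https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation/${file}`,
})
selfieSegmentation.setOptions({ modelSelection: 1 }) // 1 = landscape model
selfieSegmentation.onResults((results) => {
  // results.segmentationMask and results.image can be composited on a canvas
})

const video = document.querySelector('video')!
await selfieSegmentation.send({ image: video })
```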