floe / backscrub

Virtual Video Device for Background Replacement with Deep Semantic Segmentation
Apache License 2.0
734 stars, 85 forks

Process parts of image individually / pad image to model aspect #33

Open BenBE opened 3 years ago

BenBE commented 3 years ago

To avoid clipping the image partway, it would be nice if the image could be split into two (or more) overlapping areas that are fed to the NN and recombined after detection (e.g. by ORing the results together). This is a bit slower overall, but would allow arbitrary aspect ratios to be handled. It might also allow feeding a scaled image into the NN first and then refining the result area by area.
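A rough sketch of the split-and-OR idea, in plain Python on nested lists, might look like the following. This is not backscrub code: `tile_offsets`, `segment_tiled` and the `segment` callback are hypothetical names, and the callback stands in for whatever runs the NN on one tile and returns a 0/1 mask of the same size.

```python
def tile_offsets(length, tile, min_overlap=0):
    """Start offsets of tiles of size `tile` that cover the range [0, length)."""
    if min_overlap >= tile:
        raise ValueError("min_overlap must be smaller than the tile size")
    if tile >= length:
        return [0]  # a single tile already covers this axis
    max_step = tile - min_overlap
    # Ceiling division: how many steps are needed to reach the far edge.
    steps = -(-(length - tile) // max_step)
    # Spread the tiles evenly; the first starts at 0, the last ends at `length`.
    return [(i * (length - tile)) // steps for i in range(steps + 1)]


def segment_tiled(frame, tile_h, tile_w, segment, min_overlap=0):
    """Run `segment` on overlapping tiles and OR the per-tile masks together.

    `frame` is a list of rows. `segment` (hypothetical) takes a tile and
    returns a 0/1 mask with the same shape as that tile.
    """
    height, width = len(frame), len(frame[0])
    mask = [[0] * width for _ in range(height)]
    for top in tile_offsets(height, tile_h, min_overlap):
        for left in tile_offsets(width, tile_w, min_overlap):
            # Near-square crop of the frame; a real model would also need the
            # tile resized to its fixed input size, which is omitted here.
            tile = [row[left:left + tile_w] for row in frame[top:top + tile_h]]
            tile_mask = segment(tile)
            for dy, mask_row in enumerate(tile_mask):
                for dx, value in enumerate(mask_row):
                    # A pixel is foreground if any tile marked it as foreground.
                    mask[top + dy][left + dx] |= value
    return mask
```

For a 16:9 frame and a square model, `tile_offsets` gives two tiles that overlap in the middle of the frame. ORing is the simplest way to recombine; averaging confidences in the overlap would be an alternative if the model outputs probabilities.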

floe commented 3 years ago

Hmmm, interesting idea. That might work with deeplab, worth a try. For the Google Meet model, I think it's mostly trained on images showing the full portrait in the center of the frame, so splitting the person across "tile" boundaries might not work at all...

BenBE commented 3 years ago

For the Google Meet model, padding the image at the top and bottom might be worth a try (resizing the image if necessary, even though that would be slower).
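The padding variant could be sketched roughly like this, again in plain Python on nested lists and not taken from backscrub. `letterbox_padding`, `pad_frame` and `crop_mask` are hypothetical helpers, and the model input size is passed in as a parameter instead of assuming any particular model's dimensions.

```python
def letterbox_padding(width, height, model_w, model_h):
    """Return (top, bottom, left, right) padding that gives the frame the model's aspect."""
    # Compare aspect ratios by cross-multiplying to stay in integer arithmetic.
    if width * model_h > height * model_w:
        # Frame is wider than the model input: pad at the top and bottom.
        target_h = -((-width * model_h) // model_w)  # ceiling division
        extra = target_h - height
        return (extra // 2, extra - extra // 2, 0, 0)
    # Frame is narrower than (or equal to) the model input: pad left and right.
    target_w = -((-height * model_w) // model_h)  # ceiling division
    extra = target_w - width
    return (0, 0, extra // 2, extra - extra // 2)


def pad_frame(frame, padding, fill=0):
    """Surround `frame` (a list of rows) with `fill` pixels."""
    top, bottom, left, right = padding
    width = len(frame[0]) + left + right
    rows = [[fill] * width for _ in range(top)]
    rows += [[fill] * left + list(row) + [fill] * right for row in frame]
    rows += [[fill] * width for _ in range(bottom)]
    return rows


def crop_mask(mask, padding):
    """Undo `pad_frame` on the mask that was computed for the padded frame."""
    top, bottom, left, right = padding
    height = len(mask) - top - bottom
    width = len(mask[0]) - left - right
    return [row[left:left + width] for row in mask[top:top + height]]
```

The padded frame would still have to be resized to the model's input size before inference, and the mask resized back before cropping; that step is left out here. The fill value is also a guess: black borders are the simplest choice, but the model might cope better with a blurred or mirrored border.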

floe commented 3 years ago

Good point, this should be easy to try.