[Google Research Blog] Mobile Real-time Video Segmentation

https://research.googleblog.com/2018/03/mobile-real-time-video-segmentation.html

The current GAN model has a known issue: it sometimes produces video with noticeable flickering. We have tried to reduce it by applying a moving average to the bounding box coordinates and by using a better tracking model, but there is still room for improvement. The method described in this blog post for achieving frame-to-frame continuity sheds light on how we could tackle this problem in future work.
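
For context, the moving-average smoothing mentioned above could look roughly like the sketch below. This is a minimal illustration, not the code used in this repo: the `BoxSmoother` name, the `(x, y, w, h)` box layout, and the window size are all assumptions.

```python
from collections import deque


class BoxSmoother:
    """Simple moving average over the last `window` bounding boxes."""

    def __init__(self, window=5):
        # deque with maxlen drops the oldest box automatically
        self.history = deque(maxlen=window)

    def update(self, box):
        # box is assumed to be (x, y, w, h); any fixed 4-tuple layout works
        self.history.append(tuple(float(v) for v in box))
        n = len(self.history)
        # average each coordinate over the boxes currently in the window
        return tuple(sum(b[i] for b in self.history) / n for i in range(4))


smoother = BoxSmoother(window=3)
for box in [(10, 10, 50, 50), (14, 10, 50, 50), (9, 12, 50, 50)]:
    smoothed = smoother.update(box)
```

A larger window suppresses more jitter but makes the box lag behind fast face movement, which is part of why smoothing alone does not remove the flicker.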

This blog post introduces a segmentation network for mobile applications. One of its requirements is:

A video model should leverage temporal redundancy (neighboring frames look similar) and exhibit temporal consistency (neighboring results should be similar).

They achieve frame-to-frame temporal continuity by concatenating the previous frame's mask to the input as an extra channel.

we first pass the computed mask from the previous frame as a prior by concatenating it as a fourth channel to the current RGB input frame to achieve temporal consistency,
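
A minimal NumPy sketch of that input construction, to make the idea concrete. The function name `add_mask_prior` and the channels-last `(H, W, 3)` layout are assumptions, and feeding an all-zero mask for the first frame is our guess at a sensible default rather than something quoted above.

```python
import numpy as np


def add_mask_prior(frame_rgb, prev_mask=None):
    """Stack the previous frame's mask onto an RGB frame as a 4th channel.

    frame_rgb: array of shape (H, W, 3)
    prev_mask: array of shape (H, W), or None when no previous mask exists
    """
    h, w, _ = frame_rgb.shape
    if prev_mask is None:
        # Assumption: use an empty (all-zero) prior when there is no history
        prev_mask = np.zeros((h, w), dtype=frame_rgb.dtype)
    # Match dtype and add a trailing channel axis: (H, W) -> (H, W, 1)
    prior = prev_mask.astype(frame_rgb.dtype)[..., np.newaxis]
    # Result has shape (H, W, 4): R, G, B, previous mask
    return np.concatenate([frame_rgb, prior], axis=-1)
```

At inference time the mask predicted for frame t would be passed back in as `prev_mask` for frame t+1, so the network's first layer has to accept 4 input channels instead of 3.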

The Training Procedure section also describes data augmentation techniques for the ground truth masks.
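
Since training data usually has no real "previous mask", one way to apply this idea would be to derive a fake one by perturbing the ground truth mask. The sketch below is only an illustration under our own assumptions: `perturb_prev_mask`, `shift_mask`, the probabilities, and the choice of perturbations (an empty mask or a small translation) are stand-ins, not the exact procedure from the blog.

```python
import numpy as np


def shift_mask(mask, dy, dx):
    """Translate a 2D mask by (dy, dx) pixels, filling exposed areas with 0."""
    h, w = mask.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(mask, p)  # zero padding on all sides
    # Cropping at an offset of (p - dy, p - dx) moves content by (dy, dx)
    return padded[p - dy:p - dy + h, p - dx:p - dx + w]


def perturb_prev_mask(gt_mask, rng, empty_prob=0.2, max_shift=4):
    """Build a simulated previous-frame mask from a ground truth mask."""
    if rng.random() < empty_prob:
        # Simulates the first frame, where no previous mask is available
        return np.zeros_like(gt_mask)
    # Small random translation simulates motion between neighboring frames
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return shift_mask(gt_mask, int(dy), int(dx))


rng = np.random.default_rng(0)
gt = np.zeros((8, 8), dtype=np.uint8)
gt[2:6, 2:6] = 1
fake_prev = perturb_prev_mask(gt, rng)
```

The perturbed mask would then be concatenated to the RGB frame as the fourth input channel during training, while the loss is still computed against the unperturbed ground truth.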
