Background Concealment (Blur/Replacement).

Why ?

Google Meet, Microsoft Teams, Zoom, and most other video-conferencing applications now offer background concealment (blur or replacement) so that users can minimize distractions and keep the focus on the subject. Most web-based apps use some form of AI inference to implement this feature; Jitsi, for example, uses Meet’s model with TensorFlow Lite’s WASM backend [commit].

The popularity of this feature in native applications, together with the use of NN frameworks to implement it on the web, warrants a discussion of whether it makes sense to bring it to the Web Platform (WebRTC) in a shape that benefits everyone. Instead of each application shipping its own framework, applications could leverage the underlying platform support, which in many cases might be accelerated by VPUs or other ASICs.

How ?

Starting with Windows 11, MediaFoundation supports background segmentation through properties such as KSCAMERA_EXTENDEDPROP_BACKGROUNDSEGMENTATION_BLUR, provided the underlying driver supports it. By encapsulating the inference work in the driver, preferably hardware-accelerated on an ASIC, and by leveraging standard platform APIs, we do not have to reinvent the wheel for every web application.
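To make the "use the platform when the driver supports it" idea concrete, here is a minimal TypeScript sketch of how a web application might choose between platform-provided blur and its own inference. The `backgroundBlur` capability field, the `ConcealmentCapabilities` interface, and the `chooseBlurStrategy` helper are assumptions made for illustration; nothing in this issue defines them.

```typescript
// Assumed capability shape: the platform reports which blur states it can
// produce (e.g. [false, true] when blur can be switched on and off).
// The field name `backgroundBlur` is an assumption for this sketch.
interface ConcealmentCapabilities {
  backgroundBlur?: boolean[];
}

type BlurStrategy = "platform" | "app-inference";

// Prefer the platform (driver) implementation when it reports that blur can
// be enabled; otherwise fall back to the application's own NN framework.
function chooseBlurStrategy(caps: ConcealmentCapabilities): BlurStrategy {
  const supported = caps.backgroundBlur || [];
  return supported.indexOf(true) !== -1 ? "platform" : "app-inference";
}
```

In a browser, the capabilities object would plausibly come from `MediaStreamTrack.getCapabilities()`, with the fallback path being the WASM-based inference that applications ship today.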

Apple's Segmentation Matte, available for Portrait Mode captures, is essentially a general framework on which many new features can be built, one of which could be Background Replacement.

Open questions

Do we try to compose a single API for Background Blur (BB) and Background Replacement (BR)? For example:

    Blur level;       // [0, 1]
    enum media_type;  // [image, video, 3d_animation]
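One way to picture a single API covering both BB and BR is a tagged union, sketched below in TypeScript. This is only an illustration of the open question, not a proposed design: the type names, the `mode` discriminator, and the `normalizeConcealment` helper are hypothetical, and only the `[0, 1]` blur range and the three media types come from the text above.

```typescript
// Hypothetical media types for replacement, taken from the list above.
type ReplacementMediaType = "image" | "video" | "3d_animation";

// Hypothetical single setting that covers "off", blur, and replacement.
type BackgroundConcealment =
  | { mode: "none" }
  | { mode: "blur"; blurLevel: number } // blurLevel is expected in [0, 1]
  | { mode: "replace"; mediaType: ReplacementMediaType };

// Clamp the blur level into [0, 1]; other modes pass through unchanged.
// Non-finite levels (NaN, Infinity) are rejected rather than guessed at.
function normalizeConcealment(
  request: BackgroundConcealment
): BackgroundConcealment {
  if (request.mode !== "blur") {
    return request;
  }
  if (!isFinite(request.blurLevel)) {
    throw new RangeError("blurLevel must be a finite number");
  }
  const clamped = Math.min(1, Math.max(0, request.blurLevel));
  return { mode: "blur", blurLevel: clamped };
}
```

A union like this keeps blur-only and replacement-only parameters from being set at the same time, which is one of the trade-offs to weigh against two separate APIs.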

w3c / mediacapture-extensions

Background Concealment (Blur/Replacement). #45
