By Soumyadip Sengupta, Vivek Jayaram, Brian Curless, Steve Seitz, and Ira Kemelmacher-Shlizerman
This paper will be presented at IEEE CVPR 2020.
See the project page for additional details and results.
We recently released a brand new background matting project with better quality and real-time performance (30 fps at 4K and 60 fps at FHD). You can now use it with Zoom. We tried it on a Linux machine with a GPU.
Acknowledgement: Andrey Ryabtsev, University of Washington
This work is licensed under the Creative Commons Attribution NonCommercial ShareAlike 4.0 License.
Clone repository:
git clone https://github.com/senguptaumd/Background-Matting.git
Please use Python 3. Create an Anaconda environment and install the dependencies. Our code is tested with PyTorch 1.1.0 and TensorFlow 1.14 with CUDA 10.0.
conda create --name back-matting python=3.6
conda activate back-matting
Make sure CUDA 10.0 is your default CUDA. If your CUDA 10.0 is installed in /usr/local/cuda-10.0, apply:
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64
export PATH=$PATH:/usr/local/cuda-10.0/bin
Install PyTorch, TensorFlow (needed for segmentation), and the dependencies:
conda install pytorch=1.1.0 torchvision cudatoolkit=10.0 -c pytorch
pip install tensorflow-gpu==1.14.0
pip install -r requirements.txt
Note: The code is likely to work with other PyTorch and TensorFlow versions compatible with your system CUDA. If you already have a working environment with PyTorch and TensorFlow, install only the dependencies with pip install -r requirements.txt. If our code fails because of a version mismatch, install the specific CUDA, PyTorch, and TensorFlow versions listed above.
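Before debugging a failure, it can help to compare the installed versions with the tested ones. The following is a minimal sketch, not part of the repository: the function name check_versions and the TESTED_VERSIONS table are illustrative assumptions, and the check simply reads each module's __version__ attribute.

```python
import importlib

# Versions this README reports as tested; other versions may also work.
TESTED_VERSIONS = {"torch": "1.1.0", "tensorflow": "1.14"}


def check_versions(tested=None):
    """Return {module: (installed_or_None, tested, matches)} for each module."""
    tested = TESTED_VERSIONS if tested is None else tested
    report = {}
    for module_name, wanted in tested.items():
        try:
            module = importlib.import_module(module_name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            # Module is not installed in this environment.
            installed = None
        matches = installed is not None and installed.startswith(wanted)
        report[module_name] = (installed, wanted, matches)
    return report
```

Calling check_versions() inside the back-matting environment returns one tuple per module. A False in the last position only means the version differs from the tested one, not that the code will necessarily fail.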
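To perform Background Matting based green-screening, you need to capture:
(a) An image with the subject (use the _img.png extension)
(b) An image of the background without the subject (use the _back.png extension)
(c) A target background to insert the subject into (place it in data/background)
Use the sample_data/ folder for testing and prepare your own data based on it. This data was collected with a hand-held camera.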
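When preparing your own data, a quick check that every input image has a captured background can save a failed run. This is a minimal sketch that assumes the two files of a pair share the same prefix (for example 0001_img.png and 0001_back.png); the helper name find_unpaired_images is hypothetical and not part of the repository.

```python
from pathlib import Path

IMG_SUFFIX = "_img.png"
BACK_SUFFIX = "_back.png"


def find_unpaired_images(input_dir):
    """Return names of *_img.png files that have no matching *_back.png."""
    missing = []
    for img in sorted(Path(input_dir).glob("*" + IMG_SUFFIX)):
        # Swap only the trailing suffix, leaving the shared prefix intact.
        prefix = img.name[: -len(IMG_SUFFIX)]
        if not img.with_name(prefix + BACK_SUFFIX).exists():
            missing.append(img.name)
    return missing
```

For example, find_unpaired_images("sample_data/input") should return an empty list if every subject image has its background.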
Please download the pre-trained models from Google Drive and place the Models/ folder inside Background-Matting/.
Note: The syn-comp-adobe-trainset model was trained on the training set of the Adobe dataset. This was the model used for numerical evaluation on the Adobe dataset.
Background Matting needs a segmentation mask for the subject. We use the TensorFlow version of DeepLabv3+.
cd Background-Matting/
git clone https://github.com/tensorflow/models.git
cd models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
python test_segmentation_deeplab.py -i sample_data/input
You can replace DeepLabv3+ with any segmentation network of your choice. Save the segmentation results with the extension _masksDL.png.
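If you plug in your own segmentation network, the mask file name has to follow the convention above. The sketch below derives the mask path from an input image path. It assumes the mask is stored next to the input image with the same prefix, and mask_path_for is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path

IMG_SUFFIX = "_img.png"
MASK_SUFFIX = "_masksDL.png"


def mask_path_for(image_path):
    """Map .../NAME_img.png to .../NAME_masksDL.png in the same folder."""
    image_path = Path(image_path)
    if not image_path.name.endswith(IMG_SUFFIX):
        raise ValueError("expected a file ending in " + IMG_SUFFIX)
    prefix = image_path.name[: -len(IMG_SUFFIX)]
    return image_path.with_name(prefix + MASK_SUFFIX)
```

Your segmentation script can then write its binary mask to mask_path_for(path) for each input frame.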
Skip this step if your data was captured with a fixed camera.
Run python test_pre_process.py -i sample_data/input for pre-processing. It aligns the background image _back.png and changes its bias-gain to match the input image _img.png.
We used Python code with AKAZE features for alignment (since SURF and SIFT are unavailable in OpenCV 3). We also provide alternative MATLAB code (test_pre_process.m), which uses SURF features. The MATLAB code also provides a way to visualize feature matching and alignment. Bad alignment will produce bad matting output.
Bias-gain adjustment is turned off in the Python code due to a bug, but it is present in the MATLAB code. If there are significant exposure changes between the captured image and the captured background, use bias-gain adjustment to account for them.
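To illustrate the idea, the sketch below shows one common way to do a bias-gain adjustment: rescale the background intensities linearly so that their mean and standard deviation match those of the input image. This is an assumption for illustration only. It is not necessarily what test_pre_process.py or the MATLAB code does (for instance, they may restrict the statistics to background regions), and match_bias_gain is a hypothetical name.

```python
from statistics import mean, pstdev


def match_bias_gain(back_pixels, image_pixels):
    """Linearly rescale back_pixels so mean and spread match image_pixels.

    Both arguments are flat sequences of intensities in [0, 255].
    """
    back_std = pstdev(back_pixels)
    # Gain matches the spread; fall back to 1.0 for a constant background.
    gain = pstdev(image_pixels) / back_std if back_std > 0 else 1.0
    # Bias then shifts the rescaled background to the image mean.
    bias = mean(image_pixels) - gain * mean(back_pixels)
    # Clamp the result to the valid 8-bit range.
    return [min(255.0, max(0.0, gain * p + bias)) for p in back_pixels]
```

In practice you would apply this per color channel and compute the statistics only over pixels that show the background in both images.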
Feel free to write your own alignment code with your favorite feature detector, feature matching, and alignment method.
python test_background-matting_image.py -m real-hand-held -i sample_data/input/ -o sample_data/output/ -tb sample_data/background/0001.png
For images taken with a fixed camera (on a tripod), choose -m real-fixed-cam for best results. -m syn-comp-adobe lets you use the model trained on the synthetic-composite Adobe dataset, without real data (worse performance).
Video inference is almost identical to image inference, with a few small changes.
To perform Background Matting based green-screening, you need to capture:
(a) A video with the subject (teaser.mov)
(b) An image of the background without the subject (teaser_back.png)
(c) A target background to insert the subject into (target_back.mov)
We provide sample_video/, captured with a hand-held camera, and sample_video_fixed/, captured with a fixed camera, for testing. Please download the data and place both folders under Background-Matting. Prepare your own data based on that.
cd Background-Matting/sample_video
mkdir input background
ffmpeg -i teaser.mov input/%04d_img.png -hide_banner
ffmpeg -i target_back.mov background/%04d.png -hide_banner
Repeat the same for sample_video_fixed
cd Background-Matting/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
python test_segmentation_deeplab.py -i sample_video/input
Repeat the same for sample_video_fixed
No need to run alignment for sample_video_fixed or videos captured with a fixed camera.
Run python test_pre_process_video.py -i sample_video/input -v_name sample_video/teaser_back.png for pre-processing. Alternatively, you can use test_pre_process_video.m in MATLAB.
For hand-held videos, like sample_video:
python test_background-matting_image.py -m real-hand-held -i sample_video/input/ -o sample_video/output/ -tb sample_video/background/
For fixed-camera videos, like sample_video_fixed:
python test_background-matting_image.py -m real-fixed-cam -i sample_video_fixed/input/ -o sample_video_fixed/output/ -tb sample_video_fixed/background/ -b sample_video_fixed/teaser_back.png
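If you script inference over several clips, the two commands above differ only in the model name and the optional -b argument. The following minimal sketch builds the argument list. The flags -m, -i, -o, -tb, and -b are the ones shown above, while the function build_inference_command and its parameter names are illustrative assumptions.

```python
ALLOWED_MODELS = ("real-hand-held", "real-fixed-cam", "syn-comp-adobe")


def build_inference_command(model, input_dir, output_dir, target_back,
                            captured_back=None):
    """Return the argument list for test_background-matting_image.py."""
    if model not in ALLOWED_MODELS:
        raise ValueError("unknown model: %r" % (model,))
    cmd = [
        "python", "test_background-matting_image.py",
        "-m", model,
        "-i", input_dir,
        "-o", output_dir,
        "-tb", target_back,
    ]
    # A single captured background image is passed with -b, as in the
    # fixed-camera video command above.
    if captured_back is not None:
        cmd += ["-b", captured_back]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Background-Matting/ folder.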
To obtain the video from the output frames, run:
cd Background-Matting/sample_video
ffmpeg -r 60 -f image2 -i output/%04d_matte.png -vcodec libx264 -crf 15 -s 1280x720 -pix_fmt yuv420p teaser_matte.mp4
ffmpeg -r 60 -f image2 -i output/%04d_compose.png -vcodec libx264 -crf 15 -s 1280x720 -pix_fmt yuv420p teaser_compose.mp4
Repeat the same for sample_video_fixed
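The ffmpeg commands above read a numbered image sequence, so a missing frame in the output folder can truncate the encoded video. The sketch below lists gaps in the numbering before you encode. It assumes four-digit frame numbers, as in the %04d patterns above, and missing_frame_numbers is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path


def missing_frame_numbers(output_dir, suffix="_matte.png"):
    """Return frame numbers missing between the first and last NNNN<suffix>."""
    pattern = re.compile(r"^(\d{4})" + re.escape(suffix) + r"$")
    present = set()
    for path in Path(output_dir).iterdir():
        match = pattern.match(path.name)
        if match:
            present.add(int(match.group(1)))
    if not present:
        return []
    # Every number between the smallest and largest should exist.
    return [n for n in range(min(present), max(present) + 1)
            if n not in present]
```

An empty list from missing_frame_numbers("output") and missing_frame_numbers("output", "_compose.png") suggests both sequences are contiguous.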
For best results capture images following these guidelines:
Use test_data_list.txt and train_data_list.txt in Data_adobe to copy only human subjects from the Adobe dataset. Create folders fg_train, fg_test, mask_train, and mask_test to copy the foregrounds and alpha mattes for the test and train data separately. (The train-test split is the same as in the original dataset.) You can run the following to accomplish this:
cd Data_adobe
./prepare.sh /path/to/adobe/Combined_Dataset
Place background images in bg_train and in bg_test.
Compose the foregrounds onto the backgrounds with compose.py (commands below). It saves each composite as _comp and the background as _back under merged_train and merged_test. It will also create a CSV to be used by the training dataloader. You can pass --workers 8 to use, e.g., 8 threads; by default it uses only one.
python compose.py --fg_path fg_train --mask_path mask_train --bg_path bg_train --out_path merged_train --out_csv Adobe_train_data.csv
python compose.py --fg_path fg_test --mask_path mask_test --bg_path bg_test --out_path merged_test
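For reference, synthetic composites are conventionally formed with the standard compositing equation I = alpha * F + (1 - alpha) * B, where F is the foreground, B the background, and alpha the matte. The sketch below applies it to a single RGB pixel. It illustrates the equation only and is not taken from compose.py, whose details may differ; composite_pixel is a hypothetical name.

```python
def composite_pixel(fg, bg, alpha):
    """Blend one RGB foreground pixel over a background pixel.

    fg and bg are (R, G, B) tuples in [0, 255]; alpha is in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # I = alpha * F + (1 - alpha) * B, applied per channel.
    return tuple(round(alpha * f + (1.0 - alpha) * b)
                 for f, b in zip(fg, bg))
```

With alpha = 1 the result is the foreground pixel, and with alpha = 0 it is the background pixel.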
Change the number of GPUs and the batch size depending on your platform. We trained the model with 512x512 input (-res flag).
CUDA_VISIBLE_DEVICES=0,1 python train_adobe.py -n Adobe_train -bs 4 -res 512
Notes:
For a lower resolution, use -res 256, but we also recommend using fewer residual blocks. Use: -n_blocks1 5 -n_blocks2 2.
Cheers to the unofficial Deep Image Matting repo.
Please download our captured videos. We will show next how to finetune your model on fixed-camera captured videos. It will be similar for hand-held cameras, except you will need to align the captured background image to each frame of the video separately. (Take a hint from test_pre_process.py and use alignImages().)
Data Pre-processing:
Extract frames for each video: ffmpeg -i $NAME.mp4 $NAME/%04d_img.png -hide_banner
Run segmentation on the frames: python test_segmentation_deeplab.py -i $NAME
Place the target backgrounds in the background folder.
Create a CSV file Video_data_train.csv with each row as: $image;$captured_back;$segmentation;$image+20frames;$image+2*20frames;$image+3*20frames;$image+4*20frames;$target_back.
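The row format can be sketched as follows. This is an illustrative helper, not the code in prepare_real.py: it assumes frames are numbered from 0001 as extracted by the ffmpeg command, that masks use the _masksDL.png extension, and that frames whose later neighbours would fall past the end of the video are skipped. The name make_csv_row and its parameters are hypothetical.

```python
def make_csv_row(video_dir, frame_idx, num_frames, captured_back,
                 target_back, step=20):
    """Build one ';'-separated training row, or None near the video end."""
    # The four later frames at +20, +40, +60, and +80 by default.
    later = [frame_idx + k * step for k in range(1, 5)]
    if frame_idx < 1 or later[-1] > num_frames:
        return None

    def frame(i):
        return "%s/%04d_img.png" % (video_dir, i)

    mask = "%s/%04d_masksDL.png" % (video_dir, frame_idx)
    fields = [frame(frame_idx), captured_back, mask]
    fields += [frame(i) for i in later]
    fields.append(target_back)
    return ";".join(fields)
```

Writing one such row per usable frame, for every video, yields a file in the Video_data_train.csv layout described above.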
The process is automated by prepare_real.py; take a look inside and change background_path and path before running.
Change the number of GPUs and the batch size depending on your platform. We trained the model with 512x512 input (-res flag).
CUDA_VISIBLE_DEVICES=0,1 python train_real_fixed.py -n Real_fixed -bs 4 -res 512 -init_model Models/syn-comp-adobe-trainset/net_epoch_64.pth
We captured videos with both fixed and hand-held cameras in indoor and outdoor settings. We release this data to encourage future research on improving background matting. The data is released for research purposes only.
Thanks to Andrey Ryabtsev for creating a Google Colab version for easy inference on images and videos of your choice.
We are eager to hear how our algorithm works on your images/videos. If the algorithm fails on your data, please feel free to share it with us at soumya91@cs.washington.edu. This will help us in improving our algorithm for future research. Also, feel free to share any cool results.
If you use this code for your research, please consider citing:
@InProceedings{BMSengupta20,
title={Background Matting: The World is Your Green Screen},
author = {Soumyadip Sengupta and Vivek Jayaram and Brian Curless and Steve Seitz and Ira Kemelmacher-Shlizerman},
booktitle={Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}
Microsoft Virtual Stage: Using our background matting technology along with depth sensing with Kinect, Microsoft open-sourced this amazing code for virtual staging. Follow this link for details of their technique.
Weights & Biases: Great presentation and detailed discussions and insights on pre-processing and training our model. Check out Two Minute Papers' take on our work.