This repository contains the source code for the papers "Ultra-Low Bitrate Video Conferencing Using Deep Image Animation", "A Hybrid Deep Animation Codec for Low-Bitrate Video Conferencing" and "Predictive Coding for Animation-Based Video Compression".
We support `python3`. To install the dependencies, run:
`pip install -r requirements.txt`
The YAML files in `config/` describe the configuration settings for training and testing the models.
See `config/dac.yaml`, `config/hdac.yaml` and `config/rdac.yaml`.
Use `--mode test` at inference time with the same config file after changing the `eval_params` appropriately.
**VoxCeleb**. Please follow the instructions at https://github.com/AliaksandrSiarohin/video-preprocessing.
**Creating your own videos**.
The input videos should be cropped to target the speaker's face at a resolution of 256x256 (support for higher resolutions is underway).
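As a rough illustration of the cropping step, the sketch below computes a square crop window around a face bounding box and keeps it inside the frame. The function name `square_crop_box`, the `margin` parameter and the assumption that a face detector supplies the bounding box are all illustrative; none of them is part of this repository.

```python
def square_crop_box(face_box, frame_w, frame_h, margin=0.3):
    """Return (left, top, right, bottom) of a square crop around a face.

    face_box is (left, top, right, bottom) in pixels, e.g. from a face detector.
    margin enlarges the crop by that fraction of the larger face side.
    """
    left, top, right, bottom = face_box
    cx = (left + right) / 2.0
    cy = (top + bottom) / 2.0
    side = max(right - left, bottom - top) * (1.0 + margin)
    # The square cannot be larger than the frame itself.
    side = int(round(min(side, frame_w, frame_h)))
    # Centre the window on the face, then shift it back inside the frame.
    x0 = int(round(cx - side / 2.0))
    y0 = int(round(cy - side / 2.0))
    x0 = max(0, min(x0, frame_w - side))
    y0 = max(0, min(y0, frame_h - side))
    return x0, y0, x0 + side, y0 + side
```

The resulting window can then be cropped and resized to 256x256 with any image or video tool of your choice.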
**Pre-processed videos (256x256 px)**
We provide preprocessed videos at the following link: [google-drive](https://drive.google.com/drive/folders/1g0U1ZCTszm3yrmIewg7FahXsxyMBfxKj?usp=sharing)
Download the videos and put them in the ```datasets/train``` and ```datasets/inference``` folders.
Checkpoints can be found under the following link: google-drive. Download them and place them in the `checkpoints/` directory.
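Before training or testing, it can help to confirm that the downloaded files ended up where this README says they should. The following is a minimal sketch that only checks for the three directories named above; the helper name `missing_dirs` is hypothetical, and it does not verify the contents of the directories.

```python
from pathlib import Path

# Directories named in this README; adjust if your layout differs.
EXPECTED_DIRS = ("datasets/train", "datasets/inference", "checkpoints")


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```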
We include a metrics module combining the suggestions from JPEG-AI with popular quantitative metrics used in computer vision and beyond. Supported metrics: 'psnr', 'psnr-hvs', 'fsim', 'iw_ssim', 'ms_ssim', 'vif', 'nlpd', 'vmaf', 'lpips'.
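For orientation, the simplest metric in the list, PSNR, is defined as 10 * log10(MAX^2 / MSE), where MAX is the peak pixel value (255 for 8-bit frames). The sketch below is a plain-Python illustration of that formula on flat sequences of pixel values; it is not the implementation used by the metrics module, and the function name and signature are assumptions.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the decoded frame is closer to the reference; identical frames give infinity.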
Set the `config/[MODEL_NAME].yaml` parameters appropriately, or use the defaults (to reproduce our results), and run `bash script_training.sh [MODEL_NAME]`.
The default setup uses a single GPU (NVIDIA A100). However, DAC, HDAC and RDAC can be trained on multiple GPUs by using distributed data parallel and setting the `--device_ids` parameter as desired.
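To make the role of `--device_ids` concrete, the sketch below shows one way such a flag could be parsed into a list of GPU indices. The comma-separated format, the default of `[0]` and the parser itself are assumptions made for illustration; check the training entry point of this repository for the format it actually expects.

```python
import argparse


def parse_device_ids(text):
    """Turn a string such as "0,1,2" into [0, 1, 2]."""
    return [int(part) for part in text.split(",") if part.strip()]


def build_parser():
    # Hypothetical parser, not the repository's actual command-line interface.
    parser = argparse.ArgumentParser(description="Illustrative launcher")
    parser.add_argument("--device_ids", type=parse_device_ids, default=[0],
                        help="comma-separated GPU indices, e.g. 0,1,2,3")
    return parser


args = build_parser().parse_args(["--device_ids", "0,1"])
print(args.device_ids)  # [0, 1]
```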
NOTE: `baselines.yaml` is used for HEVC, VVC and VVENC.
Download HEVC and VVC (VTM-12) from google-drive and put them in the `conventional_codecs/` folder.
Set the `eval_params` in the `config/[MODEL_NAME].yaml` file and run `bash script_test.sh [MODEL_NAME]`.
This code base contains source code from the following works: