sibozhang / vid2vid

A modified version of vid2vid for Speech2Video, Text2Video Paper
https://github.com/sibozhang/Speech2Video

vid2vid

A modified version of vid2vid for Speech2Video and Text2Video.

Setup

  1. Clone the repository.

    git clone git@github.com:sibozhang/vid2vid.git
  2. Set up the environment. torchvision needs to be 0.2.2 to be compatible with torch 0.4.1.

    python3 -m venv ../venv/vid2vid
    source ../venv/vid2vid/bin/activate
    pip install --upgrade pip
    pip3 install https://download.pytorch.org/whl/cu92/torch-0.4.1-cp36-cp36m-linux_x86_64.whl 
    pip install torchvision==0.2.2 
    pip install numpy
    pip install dominate requests
    pip install pillow
    pip install opencv-python 
    pip install scipy 
    pip install pytz
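Because the setup depends on a specific torch/torchvision pairing, it can help to confirm what actually landed in the venv before going further. The following is a minimal sketch, not part of the repository: the function names `version_matches`, `installed_version`, and `check_env` are illustrative, and it assumes `pip show` reports the versions installed by the commands above.

```shell
#!/usr/bin/env bash
# Sketch: confirm the active environment holds the versions pinned in the setup step.
# All function names here are illustrative, not part of the repository.

# Succeeds when the installed version equals the pin or extends it with a
# suffix (for example 0.4.1.post2 still counts as 0.4.1).
version_matches() {
    local installed="$1" pinned="$2"
    case "$installed" in
        "$pinned"|"$pinned".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the version pip reports for a package, or nothing if it is missing.
installed_version() {
    python3 -m pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

# Compares torch and torchvision against the pins named in the setup step.
check_env() {
    local status=0 pair pkg pin found
    for pair in torch=0.4.1 torchvision=0.2.2; do
        pkg="${pair%%=*}"
        pin="${pair#*=}"
        found="$(installed_version "$pkg")"
        if version_matches "$found" "$pin"; then
            echo "ok: $pkg $found"
        else
            echo "mismatch: $pkg is '${found:-not installed}', expected $pin"
            status=1
        fi
    done
    return "$status"
}
```

After sourcing this file with the venv activated, running `check_env` prints one line per package and returns a non-zero status if either pin is not met.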

Trained model

Please create a 'checkpoints' folder in the current folder and put the trained models in it.

VidTIMIT fadg0 (English, Female) Dropbox

Baidu Cloud link: https://pan.baidu.com/s/1L1cvqwLu_uqN2cbW-bDgdA Password: hygt

Xuesong (Chinese, Male) Dropbox

Baidu Cloud link: https://pan.baidu.com/s/1lhYRakZLnkQ8nqMuLJt_dA Password: 40ob
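Once a model has been downloaded and unpacked, placing it under 'checkpoints' can be scripted. This is a rough sketch under stated assumptions: `install_checkpoint` is a hypothetical helper, and it assumes each download unpacks to a single folder that should sit directly inside 'checkpoints'. The folder names inside the archives are not given here, so check them against what you actually downloaded.

```shell
#!/usr/bin/env bash
# Sketch: create the checkpoints folder and copy one unpacked model folder into it.
# install_checkpoint is an illustrative helper, not part of the repository.
install_checkpoint() {
    local src="$1" dest_root="${2:-checkpoints}"
    if [ ! -d "$src" ]; then
        echo "no such model folder: $src" >&2
        return 1
    fi
    mkdir -p "$dest_root"
    # Copy rather than move so the original download stays intact.
    cp -r "$src" "$dest_root/"
    # Report where the model now lives.
    echo "$dest_root/$(basename "$src")"
}
```

Run from the repository root, `install_checkpoint /path/to/unpacked_model` copies that folder into `./checkpoints` and prints the resulting path; the path shown is a placeholder.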

Q&A

  1. Get vid2vid working
    * cd vid2vid/models/flownet2_pytorch
    * export CUDA_HOME=/tools/cuda-9.2.88/
    * Comment out "--user" in flownet2_pytorch/install.sh so that it installs into the Python under the venv; otherwise it installs to .local.
    * bash install.sh
  2. Q: Running the code fails with the following error:

    File "/mnt/scratch/sibo/vid2vid/util/util.py", line 62, in tensor2im
      image_numpy = image_tensor.cpu().float().numpy()
    RuntimeError: PyTorch was compiled without NumPy support

A: pip install torch==0.4.1.post2
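For item 1 above, the "--user" edit can be done with a small script instead of by hand. This is a minimal sketch, assuming the flag appears in install.sh as a plain `--user` argument (for example on `setup.py install` lines); `strip_user_flag` is an illustrative name, not something shipped with flownet2_pytorch. It removes the flag rather than commenting it out, which has the same effect on where packages are installed.

```shell
#!/usr/bin/env bash
# Sketch: drop the "--user" flag from an install script so that the install
# targets the active venv instead of .local.
# strip_user_flag is an illustrative helper, not part of flownet2_pytorch.
strip_user_flag() {
    local script="$1"
    # Keep a backup next to the original file.
    cp "$script" "$script.bak"
    # Delete every whitespace-prefixed "--user" token; other text is untouched.
    sed 's/[[:space:]]\{1,\}--user//g' "$script.bak" > "$script"
}
```

With the function sourced, `strip_user_flag install.sh` run inside `vid2vid/models/flownet2_pytorch` rewrites the script in place and leaves the original as `install.sh.bak`, after which `bash install.sh` proceeds as in the steps above.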

Citation

Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses

Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang.

PDF | Result Video | 1 min Spotlight | 10 min Presentation

@inproceedings{liao2020speech2video,
  title={Speech2video synthesis with 3D skeleton regularization and expressive body poses},
  author={Liao, Miao and Zhang, Sibo and Wang, Peng and Zhu, Hao and Zuo, Xinxin and Yang, Ruigang},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  year={2020}
}

Acknowledgements

This code is based on the vid2vid framework.