The source code for our paper "Deep Image Spatial Transformation for Person Image Generation" (CVPR2020)
We propose a Global-Flow Local-Attention Model for deep image spatial transformation. Our model can be flexibly applied to tasks such as:
Left: generated results of our model; Right: Input source images.
Leftmost: Skeleton Sequences. The others: Animation Results.
Left: Input image; Right: Output results.
From left to right: Input image, results of Appearance Flow, our results, ground-truth images.
2020.4.30 Several demos are provided for quick exploration.
2020.4.29 Code for Pose-Guided Person Image Animation is available now!
2020.3.15 We upload the code and trained models of the Face Animation and View Synthesis!
2020.3.3 Project Website and Paper are available!
2020.2.29 Code for PyTorch is available now!
For a quick exploration of our model, find the online colab demo.
Requirements
Conda installation
# 1. Create a conda virtual environment.
conda create -n gfla python=3.6 -y
source activate gfla
# 2. Install dependency
pip install -r requirement.txt
# 3. Build pytorch Custom CUDA Extensions
./setup.sh
Note: The current code is tested with a Tesla V100 GPU. If you use a different GPU, you may need to select the correct nvcc_args for your GPU when you build the custom CUDA extensions: comment or uncomment the --gencode lines
in block_extractor/setup.py, local_attn_reshape/setup.py, and resample2d_package/setup.py. Please check here for details.
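As a rough sketch of the edit described above, the nvcc_args list in each of the three setup.py files typically looks like the following; the exact architecture entries shown here are illustrative, so match the compute capability to your own GPU before building:

```python
# Illustrative fragment of e.g. block_extractor/setup.py.
# Keep only the --gencode pairs that match your GPU's compute
# capability; the architectures listed below are assumptions.
nvcc_args = [
    '-gencode', 'arch=compute_60,code=sm_60',  # e.g. Tesla P100
    '-gencode', 'arch=compute_70,code=sm_70',  # e.g. Tesla V100 (the tested GPU)
    # '-gencode', 'arch=compute_75,code=sm_75',  # uncomment for e.g. RTX 20xx cards
]
```

A mismatched --gencode entry usually surfaces as a "no kernel image is available for execution on the device" error at runtime rather than at build time, which is why selecting the right architecture up front matters.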
We provide the pre-trained weights of our model. The resources are listed as follows:
Pose-Guided Person Image Generation
Pose Guided Person Image Animation
Google Drive: FashionVideo | iPER
OneDrive: FashionVideo | iPER
Face Image Animation
Google Drive: Face Animation
OneDrive: Face_Animation
Novel View Synthesis
Google Drive: ShapeNet Car | ShapeNet Chair
OneDrive: ShapeNet_Car | ShapeNet_Chair
Download the Pre-Trained Models and the Demo Images by running the following code:
./download.sh
The Pose-Guided Person Image Generation task is to transfer a source person image to a target pose.
Run the demo of this task:
python demo.py \
--name=pose_fashion_checkpoints \
--model=pose \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=fashion \
--dataroot=./dataset/fashion \
--results_dir=./demo_results/fashion
For more training and testing details, please find the PERSON_IMAGE_GENERATION.md
The Pose-Guided Person Image Animation task generates a video clip from a still source image according to a driving target sequence. We further model the temporal consistency for this task.
Run the demo of this task:
python demo.py \
--name=dance_fashion_checkpoints \
--model=dance \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=dance \
--sub_dataset=fashion \
--dataroot=./dataset/danceFashion \
--results_dir=./demo_results/dance_fashion \
--test_list=val_list.csv
For more training and testing details, please find the PERSON_IMAGE_ANIMATION.md.
Given an input source image and a guidance video sequence depicting the structure movements, our model generates a video containing the specified movements.
Run the demo of this task:
python demo.py \
--name=face_checkpoints \
--model=face \
--attn_layer=2,3 \
--kernel_size=2=5,3=3 \
--gpu_id=0 \
--dataset_mode=face \
--dataroot=./dataset/FaceForensics \
--results_dir=./demo_results/face
We use the real video of the FaceForensics dataset. See FACE_IMAGE_ANIMATION.md for more details.
View synthesis requires generating novel views of objects or scenes based on arbitrary input views.
In this task, we use the car and chair categories of the ShapeNet dataset. See VIEW_SYNTHESIS.md for more details.
@article{ren2020deep,
title={Deep Image Spatial Transformation for Person Image Generation},
author={Ren, Yurui and Yu, Xiaoming and Chen, Junming and Li, Thomas H and Li, Ge},
journal={arXiv preprint arXiv:2003.00696},
year={2020}
}
@article{ren2020deep,
title={Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation},
author={Ren, Yurui and Li, Ge and Liu, Shan and Li, Thomas H},
journal={IEEE Transactions on Image Processing},
year={2020},
publisher={IEEE}
}
We build our project based on Vid2Vid. Some dataset preprocessing methods are derived from Pose-Transfer.