choyingw / SynergyNet

3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
MIT License

SynergyNet

Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann, CGIT Lab at University of Southern California

[paper] [video] [project page]

News [Jul 10, 2022]: Added a simplified API for getting 3D landmarks, the face mesh, and the face pose in a single line. See "Simplified API". It is convenient if you simply want to plug this method into your work.

News: Added a Colab demo [Open In Colab].

News: Our new work [Cross-Modal Perceptionist], which is based on this SynergyNet project, has been accepted to CVPR 2022.

Advantages

:+1: SOTA on all three tasks: 3D facial alignment, face orientation estimation, and 3D face modeling.

:+1: Fast inference at 3000 fps on a laptop RTX 2080.

:+1: Simple implementation with only widely used operations.

(This project is built/tested on Python 3.8 and PyTorch 1.9 on a compatible GPU)

Single Image Inference Demo

  1. Clone

    git clone https://github.com/choyingw/SynergyNet

    cd SynergyNet

  2. Use conda

    conda create --name SynergyNet

    conda activate SynergyNet

  3. Install pre-requisite common packages

    PyTorch 1.9 (should also be compatible with 1.0+ versions), Torchvision, OpenCV, SciPy, Matplotlib, Cython

  4. Download data [here] and [here]. Extract these data under the repo root.

These data are processed from [3DDFA] and [FSA-Net].

Download pretrained weights [here]. Put the model under 'pretrained/'

  5. Compile Sim3DR and FaceBoxes:

    cd Sim3DR

    ./build_sim3dr.sh

    cd ../FaceBoxes

    ./build_cpu_nms.sh

    cd ..

  6. Inference

    python singleImage.py -f img
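
Before compiling the extensions, it can help to confirm that the prerequisite packages are importable. The following is a minimal sketch using only the standard library. The helper names missing_modules and report are hypothetical, and the import names (cv2 for OpenCV, Cython for Cython) are the usual ones for these packages rather than something this repo defines.

```python
from importlib.util import find_spec

# Import names for the prerequisite packages (assumed, not defined by this repo).
PREREQUISITES = ["torch", "torchvision", "cv2", "scipy", "matplotlib", "Cython"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def report(names=PREREQUISITES):
    """Print one line per missing module, or a confirmation if none are missing."""
    missing = missing_modules(names)
    if missing:
        for name in missing:
            print(f"missing: {name}")
    else:
        print("all prerequisites found")
    return missing
```

Calling report() lists any package that still needs to be installed in the active environment.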

The default inference requires a compatible GPU. To run on a CPU, comment out the .cuda() calls and load the pretrained weights onto the CPU.
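
One way to avoid editing every .cuda() call is to choose the device once and pass it through. The sketch below is an assumption about how this could look, not the code in singleImage.py: pick_device and load_checkpoint are hypothetical helpers, while torch.load with map_location and Module.to are standard PyTorch calls. The checkpoint path is the one used by the benchmark command in this README.

```python
def pick_device(cuda_available):
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path, device):
    """Load a checkpoint file with its tensors remapped onto `device`."""
    # torch is imported here so that pick_device can be used without it.
    import torch

    # map_location moves tensors that were saved from a GPU to `device`.
    return torch.load(path, map_location=torch.device(device))


# Intended use (left as comments so that nothing runs on import):
#   import torch
#   device = pick_device(torch.cuda.is_available())
#   checkpoint = load_checkpoint("pretrained/best.pth.tar", device)
#   model = model.to(device)  # replaces model.cuda()
```

With this pattern the same script can run on machines with or without a GPU, since the device string is decided in one place.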

Simplified API

We provide a simple API for convenient usage if you want to plug this method into your work.

import cv2
from synergy3DMM import SynergyNet
model = SynergyNet()
I = cv2.imread(<your image path>)
# get landmark [[y, x, z], 68 (points)], mesh [[y, x, z], 53215 (points)], and face pose (Euler angles [yaw, pitch, roll] and translation [y, x, z])
lmk3d, mesh, pose = model.get_all_outputs(I)
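
To draw the landmarks on the input image, the layout described in the comment above has to be turned into (x, y) pixel pairs. The helper below is a hypothetical sketch, not part of synergy3DMM. It assumes the landmarks of one face arrive as three rows ordered [y, x, z], each with one value per point; if get_all_outputs returns one such array per detected face, apply the helper to each of them.

```python
def landmarks_to_pixels(lmk3d):
    """Convert one face's 3 x N landmarks into N integer (x, y) pixel tuples.

    Assumes the rows are ordered [y, x, z], as stated in the README comment.
    The z row (depth) is ignored.
    """
    ys, xs, _zs = lmk3d
    return [(int(round(x)), int(round(y))) for x, y in zip(xs, ys)]
```

The (x, y) order matches the center argument of cv2.circle, so each point can be drawn with cv2.circle(I, point, 2, (0, 255, 0), -1).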

A simple example script is provided in singleImage_simple.py.

We also provide a setup.py file. Run pip install -e . and you can then do from synergy3DMM import SynergyNet from other directories. Note that [3dmm_data] and the [pretrained weight] (put the model under 'pretrained/') need to be present.

Benchmark Evaluation

  1. Follow Single Image Inference Demo: Steps 1-4.

  2. Benchmarking

    python benchmark.py -w pretrained/best.pth.tar

Printed results and visualizations of the first 50 examples are stored under 'results/' (see 'demo/' for some pre-generated samples for reference).

Update: The best head pose estimation [pretrained model] reaches a mean MAE of 3.31, which is better than the number reported in the paper (3.35). Use -w to load different pretrained models.
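
For reference, a mean MAE like the one quoted above is commonly computed as the absolute error of each Euler angle averaged over the test set, then averaged over yaw, pitch, and roll. The sketch below assumes that definition and is not taken from benchmark.py. The function name and the sample values are made up for illustration, and angle wrap-around is not handled.

```python
def euler_mae(predicted, ground_truth):
    """Return (per_angle_mae, mean_mae) for Euler angles given in degrees.

    Both arguments are equal-length sequences of (yaw, pitch, roll) triples.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    count = len(predicted)
    per_angle = [
        sum(abs(p[i] - g[i]) for p, g in zip(predicted, ground_truth)) / count
        for i in range(3)
    ]
    return per_angle, sum(per_angle) / 3


# Illustrative values only, not results from the paper.
pred = [(10.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
gt = [(8.0, 0.0, 0.0), (0.0, 2.0, 3.0)]
per_angle, mean_mae = euler_mae(pred, gt)
print(per_angle, mean_mae)
```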

Training

  1. Follow Single Image Inference Demo: Steps 1-4.

  2. Download the training data train_aug_120x120.zip from [3DDFA] and extract it under the root folder (it contains about 680K images).

  3. bash train_script.sh

  4. Please refer to train_script.sh for hyperparameters such as learning rate, epochs, or GPU device. The default settings take ~19G of GPU memory on a 3090 and about 6 hours of training. If your GPU has less memory than this, decrease the batch size and learning rate proportionally.
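
The proportional adjustment in step 4 can be sketched as follows: the learning rate is scaled by the same factor as the batch size. The function name is hypothetical and the base values are placeholders, since the actual defaults live in train_script.sh and are not listed here.

```python
def scale_for_batch_size(base_batch_size, base_lr, new_batch_size):
    """Scale the learning rate by the same factor as the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    factor = new_batch_size / base_batch_size
    return new_batch_size, base_lr * factor


# Placeholder defaults; read the real values from train_script.sh.
BASE_BATCH_SIZE = 512
BASE_LR = 0.04

batch_size, lr = scale_for_batch_size(BASE_BATCH_SIZE, BASE_LR, 128)
print(f"batch size {batch_size}, learning rate {lr:.4f}")
```

Here the batch size is cut to a quarter, so the learning rate is cut to a quarter as well.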

Textured Artistic Face Meshes

  1. Follow Single Image Inference Demo: Steps 1-5.

  2. Download the artistic face data [here], which come from [AF-Dataset]. Download our UV maps predicted by UV-texture GAN [here]. Extract them under the root folder.

  3. python artistic.py -f art-all --png (whole folder)

    python artistic.py -f art-all/122.png (single image)

Note that this artistic face dataset contains faces with many different levels and styles of abstraction. If a test image is close to a real face, the result is much better than for highly abstract samples.

Textured Real Face Renderings

  1. Follow Single Image Inference Demo: Steps 1-5.

  2. Download our UV maps predicted by UV-texture GAN and the real face images for AFLW2000-3D [here]. Extract them under the root folder.

  3. python uv_texture_realFaces.py -f texture_data/real --png (whole folder)

    python uv_texture_realFaces.py -f texture_data/real/image00002_real_A.png (single image)

The results (3D meshes and renderings) are stored under 'inference_output'

More Results

We show a comparison with [DECA] on the three samples with the largest roll angles in AFLW2000-3D.

Facial alignment on AFLW2000-3D (NME of facial landmarks):

Face orientation estimation on AFLW2000-3D (MAE of Euler angles):

Results on artistic faces:

Related Project

[Cross-Modal Perceptionist] (analysis on relation for voice and 3D face)

Bibtex

If you find our work useful, please consider citing it.

@INPROCEEDINGS{wu2021synergy,
  author={Wu, Cho-Ying and Xu, Qiangeng and Neumann, Ulrich},
  booktitle={2021 International Conference on 3D Vision (3DV)}, 
  title={Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry}, 
  year={2021}
  }

Acknowledgement

This project builds on [3DDFA] and [FSA-Net]; we thank the authors for their wonderful work. We also thank [3DDFA-V2] for the face detector and rendering code.