sw-gong / coma

PyTorch reproduction of the paper "Generating 3D faces using Convolutional Mesh Autoencoders (CoMA)" (ECCV 2018)
MIT License

Generating 3D faces using Convolutional Mesh Autoencoders

This repository reproduces the experiments described in the paper "Generating 3D faces using Convolutional Mesh Autoencoders (CoMA)".

A. Ranjan, T. Bolkart, S. Sanyal, and M. J. Black. Generating 3d faces using convolutional mesh autoencoders (ECCV 2018)

The paper proposes to learn non-linear representations of a face using spectral convolutions (ChebyNet) on a mesh surface. More importantly, it introduces up- and down-sampling operations as core components of mesh autoencoders, enabling the model to learn hierarchical mesh representations that capture expressions at multiple scales.

The results reported in the paper show high reproducibility. Following the same network architecture, with only slight differences in the choice of activation function and optimization hyperparameters, our implementation already gives better results than those shown in the paper. For instance, the results of the interpolation experiment (with 30,059 parameters) are , , compared to those reported in the paper (with 33,856 parameters) with , .

Chebyshev Convolution

Recall the Chebyshev graph convolution operator from the paper "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering":

$$X' = \sum_{k=0}^{K-1} T_k(\tilde{L}) \, X \, \Theta_k$$

where $\Theta_k$ is a learnable parameter. The Chebyshev polynomial is computed recursively as $T_k(\tilde{L}) = 2\tilde{L}\,T_{k-1}(\tilde{L}) - T_{k-2}(\tilde{L})$ with $T_0 = I$ and $T_1 = \tilde{L}$, and $\tilde{L}$ is a scaled and normalized Laplacian defined as $\tilde{L} = \frac{2L}{\lambda_{\max}} - I$. In our implementation, we tacitly assume $\lambda_{\max} = 2$, which is the choice made in the official ChebConv repository.
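As a minimal sketch of the operator above in plain PyTorch (using a dense Laplacian for clarity; the function name `cheb_conv` and its tensor shapes are illustrative, not the repository's actual API):

```python
import torch

def cheb_conv(x, L_tilde, theta):
    """Chebyshev graph convolution: X' = sum_k T_k(L~) X Theta_k.

    x:       (n, F_in) node feature matrix
    L_tilde: (n, n) scaled Laplacian, L~ = 2L / lambda_max - I
    theta:   (K, F_in, F_out) learnable filter coefficients
    """
    K = theta.shape[0]
    Tx_prev = x                         # T_0(L~) x = x
    out = Tx_prev @ theta[0]
    if K > 1:
        Tx_curr = L_tilde @ x           # T_1(L~) x = L~ x
        out = out + Tx_curr @ theta[1]
        for k in range(2, K):
            # Chebyshev recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
            Tx_next = 2 * (L_tilde @ Tx_curr) - Tx_prev
            out = out + Tx_next @ theta[k]
            Tx_prev, Tx_curr = Tx_curr, Tx_next
    return out
```

Note that only sparse matrix-vector products with $\tilde{L}$ are needed, which is what makes the filter $K$-localized and cheap to evaluate.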

Sampling Operation

The paper performs in-network down- and up-sampling operations on the mesh with precomputed sampling matrices. The down-sampling matrix D is obtained by iteratively contracting vertex pairs while maintaining surface error approximations using quadric matrices, and the up-sampling matrix U is obtained by storing the barycentric coordinates of the vertices discarded during down-sampling. The sampling operation can be simply defined as:

$$X' = Q X$$

where $Q \in \mathbb{R}^{n \times m}$ is the sparse sampling matrix ($D$ or $U$) and $X \in \mathbb{R}^{m \times F}$ is the node feature matrix.

The real magic of our implementation happens in the body of models.networks.Pool. Here, we need to perform batched matrix multiplication on the GPU for the sampling operation described above. Because dense matrix multiplication is slow, we implement sparse batched matrix multiplication by scatter-adding the node feature vectors that correspond to cluster nodes across a batch of input node feature matrices.
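A minimal sketch of this idea (the function name `pool`, the COO layout via `row`/`col`/`value`, and the tensor shapes are illustrative assumptions, not the actual models.networks.Pool signature): the sparse matrix Q is kept in COO form, source node features are gathered and weighted once for every non-zero of Q, and the results are scatter-added into the output rows for the whole batch at once.

```python
import torch

def pool(x, row, col, value, num_out):
    """Batched sparse pooling sketch: out[b] = Q @ x[b] for every batch b.

    x:        (B, m, F) batch of node feature matrices
    row, col: (nnz,) COO indices of the sparse sampling matrix Q (n x m)
    value:    (nnz,) non-zero entries of Q (e.g. barycentric weights)
    num_out:  n, the number of output nodes
    """
    B, m, F = x.shape
    # Gather the source features for every non-zero of Q and weight them.
    src = x[:, col] * value.view(1, -1, 1)            # (B, nnz, F)
    out = x.new_zeros(B, num_out, F)
    index = row.view(1, -1, 1).expand(B, -1, F)       # (B, nnz, F)
    # Sum the weighted contributions into their target rows.
    out.scatter_add_(1, index, src)
    return out
```

This touches only the non-zeros of Q, so its cost scales with nnz(Q) rather than with the dense n x m product.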

Installation

The code is developed using Python 3.6 on Ubuntu 16.04. The models were trained and tested on an NVIDIA 2080 Ti.

Interpolation Experiment

Following the same split as described in the paper, the dataset is divided into training and test samples with a ratio of 9:1. Run the script below to train and evaluate the model. The checkpoints of each epoch are saved in the corresponding output folder (specified by the variable exp_name). After training, it outputs the "Mean Error with the Standard Deviation" as well as the "Median Error", which are saved in the file euc_error.txt.

bash train_interpolation.sh

Extrapolation Experiment

To reproduce the extrapolation experiment, you should specify the test expression as described in the paper. We provide the variable test_exp to explicitly specify the test expression. Run the script below to get a glance at the results.

bash train_extrapolation.sh

Data

To create your own dataset, you have to provide at least the following data attributes:

where data inherits from torch_geometric.data.Data. Have a look at the datasets.FAUST and datasets.CoMA classes for an example.

Alternatively, you can simply create a regular Python list holding torch_geometric.data.Data objects.

Citation

Please cite this paper if you use this code in your own work:

@inproceedings{ranjan2018generating,
  title={Generating 3D faces using convolutional mesh autoencoders},
  author={Ranjan, Anurag and Bolkart, Timo and Sanyal, Soubhik and Black, Michael J},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={704--720},
  year={2018}
}