schardong / ifmorph

Implementation of "Neural Implicit Morphing of Face Images", published at CVPR 2024
https://schardong.github.io/ifmorph/
MIT License

Neural Implicit Morphing of Face Images

Guilherme Schardong [1], Tiago Novello [2], Hallison Paz [2], Iurii Medvedev [1], Vinícius da Silva [3], Luiz Velho [2], Nuno Gonçalves [1,4]
[1] Institute of Systems and Robotics, University of Coimbra (UC)
[2] Institute for Pure and Applied Mathematics (IMPA)
[3] Pontifical Catholic University of Rio de Janeiro (PUC-Rio)
[4] Portuguese Mint and Official Printing Office (INCM)

This is the official implementation of "Neural Implicit Morphing of Face Images", published in the Proceedings of CVPR 2024 and also available on arXiv. More results and examples can be found on the project page.

Overview of our method

Getting started

TL;DR: if you just want to run the code, follow the steps below (assuming a UNIX system with Make installed). For more details, jump to the Setup and sample run section.

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
make data/frll_neutral_front
python mark-warp-points.py --landmark_detector dlib --output experiments/001_002.yaml data/frll_neutral_front/001_03.jpg data/frll_neutral_front/002_03.jpg
python warp-train.py --no-ui experiments/001_002.yaml

Prerequisites

  1. Python venv or Anaconda; alternatively, you can use PyEnv and PyEnv-VirtualEnv, or another environment management mechanism
  2. Git
  3. Integrate Git Bash with conda (if on Windows)

Code organization

Most of the functions are available through the ifmorph module. It contains the following files:

In the repository root, we store most of the scripts needed to reproduce our work for general face images. We list them below for completeness:

Inside the standalone folder, we've stored scripts used for the experiments in our paper, mainly the metrics (FID and LPIPS), alignment, and landmark detection. These are:

Setup and sample run

For this setup, we assume Python >= 3.10.0 and CUDA Toolkit 11.6. We also tested with Python 3.9.0 and CUDA 11.7, and everything worked as well. Note that we assume all commands are typed at the root of the repository, unless stated otherwise, and that we tested these steps only on Ubuntu 22.04. For different Python/CUDA versions, you may need to tweak the package versions, especially MediaPipe and the nvidia-related ones.
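To check your local versions before starting, you can run the commands below (the second one works only if the CUDA Toolkit's nvcc is installed and on your PATH):

python --version
nvcc --version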

(Optional) After cloning the repository, issue a git submodule init followed by a git submodule update command in a terminal to download the mrimg submodule, as shown below.
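That is, from the repository root:

git submodule init
git submodule update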

venv

Using Python's venv is the simplest option, as it involves no addons other than a functioning Python installation. Simply open a terminal, navigate to the repository root and type the following commands:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .

PyEnv

First, you need to have both PyEnv and PyEnv-virtualenv installed. Afterwards, just type the following commands in a terminal.

pyenv virtualenv 3.10.0 ifmorph
pyenv local ifmorph
pip install -r requirements.txt
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cu116
pip install -e .

Conda

For convenience, we provide an environment_${SYSTEM}.yml file with our environment configuration, where ${SYSTEM} is either ubuntu or windows. To create an environment from this file, just type the following commands in a terminal (replace ${SYSTEM} with either ubuntu or windows).

conda env create -f environment_${SYSTEM}.yml
conda activate ifmorph
pip install -e .
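For example, on Ubuntu this becomes:

conda env create -f environment_ubuntu.yml
conda activate ifmorph
pip install -e .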

If you prefer to create an environment from scratch, the following commands may be used instead. Note that this is a suggestion; you may customize names and versions to suit your needs.

conda create --name ifmorph
conda activate ifmorph
conda install pytorch torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
pip install -r requirements.txt
pip install -e .

Dataset

Download the Face Research Lab London dataset from their website. If you use makefiles, we provide a rule to download and extract the dataset to the correct location (see the Makefile, rule data/frll_neutral_front). Any image may be used, as long as it contains a face. Optionally, you may create an implicit representation by running the create-initial-states.py script (an example is provided below).
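For example, to download and extract the dataset using the provided Makefile rule:

make data/frll_neutral_front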

An optional pre-processing step is implemented in a modified version of the align.py script, provided by the DiffAE authors (which they extracted from the FFHQ pre-processing script), to crop and resize the images. We modified it to allow for a "non-alignment": the images are cropped and resized, but not necessarily aligned. For the quantitative comparisons, the images need to be pre-processed by the align.py script, since the other models assume the face occupies a central (and large) part of the image.

How to run

  1. (Optional) Crop/resize the images
  2. (Optional) Create the initial neural implicit representation of your target images (note that wildcards are accepted)
  3. Create the face landmarks manually or automatically
  4. Run the warp training script

On an Ubuntu 22.04 system, the commands below should do it. Note that the optional parameters have default values, so you don't need to specify them; we specify some of them here to demonstrate possible values:

(OPTIONAL) python align.py --just-crop --output-size 1024 --n-tasks 4 data/frll/ data/frll_neutral_front_cropped
(OPTIONAL) python create-initial-states.py --nsteps 1000 --device cuda:0 experiments/initial_state_rgb_large_im.yaml data/frll_neutral_front/001_03.jpg data/frll_neutral_front/002_03.jpg
python mark-warp-points.py data/frll_neutral_front/001_03.jpg data/frll_neutral_front/002_03.jpg
python warp-train.py experiments/001_002-baseline.yaml

Additionally, we provide a Makefile with the source code that trains the initial states (or downloads them from here, or here if you want the cropped images) and runs the example warping experiment.
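Assuming Make is installed and you are at the repository root, running the default target should be enough (check the Makefile for the exact rule names if it is not):

make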

Reproducing the paper's experiments

To avoid cluttering the repository, we've opted to store the experiment configuration files externally. You can generate them after following the Setup procedure above. For convenience, you can download the experiment files from here and the pretrained image networks from here. Also, see the Makefile for rules that automate this process. Nonetheless, we describe the steps to recreate those files below. We always assume that the commands are typed from the repository root and that the Python environment is activated. First, you must crop and resize the FRLL face images; afterwards, you must create the neural initial states. You can do so by typing:

python standalone/align.py --just-crop --output-size 1350 --n-tasks 4 data/frll_neutral_front/ data/frll_neutral_front_cropped
python standalone/create-initial-states.py --nsteps 5000 --device cuda:0 --output_path pretrained/frll_neutral_front_cropped experiments/initial_state_rgb_large_im.yaml data/frll_neutral_front_cropped/*.png

Note that data/frll_neutral_front_cropped is not in the repository either. You can download the original images from the FRLL repository (see our Makefile) and crop them using the first command above, which stores the cropped images in data/frll_neutral_front_cropped; the second command then stores the initial states for all images in data/frll_neutral_front_cropped in the pretrained/frll_neutral_front_cropped folder. Afterwards, run the script to detect the landmarks, followed by the script to create the experiment configuration files:

python standalone/detect-landmarks.py -o pretrained/frll_neutral_front_cropped/ pretrained/frll_neutral_front_cropped/*.pth
python standalone/create-experiment-files.py data/pairs_for_morphing_full.txt pretrained/frll_neutral_front_cropped/ experiments/pairwise_dlib

Finally, you can simply train the warpings by issuing the following command:

python warp-train.py experiments/pairwise_dlib/*.yaml --no-ui --logging none --device cuda:0 --output-dir results/pairwise_dlib --skip-finished --no-reconstruction

The above command will run the trainings sequentially. However, you may notice that your GPU is underutilized. In this case, you may run the warp-train.py script with the --n-tasks parameter set to any value larger than 1, as shown below, which runs multiple trainings in parallel on the same GPU. Tweak --n-tasks accordingly.

python warp-train.py experiments/pairwise_dlib/*.yaml --no-ui --logging none --device cuda:0 --output-path results/pairwise_dlib --n-tasks 6 --skip-finished --no-reconstruction

Contributing

Any contribution is welcome. If you spot an error or have questions, open an issue and we will answer as soon as possible.

Citation

If you find our work useful in your research, consider citing it in your tech report or paper.

@InProceedings{Schardong_2024_CVPR,
    author    = {Schardong, Guilherme and Novello, Tiago and Paz, Hallison and Medvedev, Iurii and da Silva, Vin{\'\i}cius and Velho, Luiz and Gon\c{c}alves, Nuno},
    title     = {Neural Implicit Morphing of Face Images},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {7321-7330}
}