Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars.
EMOPortraits introduces a novel approach for generating realistic and expressive one-shot head avatars driven by multimodal inputs, including extreme and asymmetric emotions.
For more details, please refer to:
You can set up the environment using the provided conda-pack archive:
conda-pack Archive
Download the Environment Archive
Download sav.tar.gz from Google Drive or Yandex Disk.
Unpack the Environment
# Create a directory for the environment
mkdir -p sav_env
# Unpack the tar.gz archive into the directory
tar -xzf sav.tar.gz -C sav_env
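If the download may have been interrupted, it can help to confirm that the archive is readable before unpacking it. The following is a minimal sketch, not part of the repository: the function name `unpack_env` is illustrative, and listing the archive first adds one extra pass over a large file.

```shell
# Sketch: check that the archive is readable, then unpack it.
unpack_env() {
    # $1: path to the archive, $2: target directory for the environment
    # "tar -tzf" lists the contents and fails on a truncated or corrupt file.
    tar -tzf "$1" > /dev/null || { echo "cannot read $1" >&2; return 1; }
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Usage: unpack_env sav.tar.gz sav_env
```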
Using Python Without Activating
# Run Python directly from the unpacked environment
./sav_env/bin/python
Activating the Environment
# Activate the environment
source sav_env/bin/activate
Once activated, you can run Python as usual:
(sav_env) $ python
Cleanup Prefixes
After activating the environment, you may need to run the following command to fix any issues with environment paths:
(sav_env) $ conda-unpack
This command can also be run without activating the environment, as long as Python is installed on the machine.
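As a sketch of the non-activated route, the snippet below calls the `conda-unpack` script by path. It assumes the environment was unpacked to `sav_env` as above and that the script sits at `sav_env/bin/conda-unpack`, which is where conda-pack usually places it on Linux and macOS. The `ENV_DIR` variable and the `fix_prefixes` function are illustrative names.

```shell
# Sketch: fix environment prefixes without activating the environment.
ENV_DIR="${ENV_DIR:-sav_env}"   # assumed unpack location from the steps above

fix_prefixes() {
    # $1: path to the unpacked environment
    unpack_script="$1/bin/conda-unpack"   # assumed location of the script
    if [ -x "$unpack_script" ]; then
        "$unpack_script"
    else
        echo "conda-unpack not found in $1/bin" >&2
        return 1
    fi
}

# Usage: fix_prefixes "$ENV_DIR"
```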
environment.yml
Note: This option may not work as it has not been thoroughly tested.
Due to limitations with conda-pack, the following repositories need to be installed manually:
Face Detection: Install from GitHub
git clone https://github.com/hhj1897/face_detection.git
cd face_detection
pip install -e .
ROI Tanh Warping: Install from GitHub
git clone https://github.com/ibug-group/roi_tanh_warping.git
cd roi_tanh_warping
pip install -e .
Face Parsing: Install from GitHub
git clone https://github.com/hhj1897/face_parsing.git
cd face_parsing
pip install -e .
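The three installs above follow the same pattern, so they can be wrapped in a small loop. This is a sketch rather than a script shipped with the repository: `install_repo` and `install_all` are illustrative names, and the loop assumes `git` and the environment's `pip` are on the `PATH`. Each install runs in a subshell, so the working directory stays at the starting location between repositories.

```shell
# Sketch: clone and install the three manually installed dependencies.
REPOS="https://github.com/hhj1897/face_detection.git
https://github.com/ibug-group/roi_tanh_warping.git
https://github.com/hhj1897/face_parsing.git"

install_repo() {
    # $1: Git URL; the checkout directory is the URL's basename without ".git"
    dir="$(basename "$1" .git)"
    # Clone only if the directory does not exist yet.
    [ -d "$dir" ] || git clone "$1" "$dir" || return 1
    # Editable install, run in a subshell so the caller's directory is unchanged.
    (cd "$dir" && pip install -e .)
}

install_all() {
    for url in $REPOS; do
        install_repo "$url" || return 1
    done
}

# Usage (with the environment activated): install_all
```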
Download Required Files
Please download the following files from Google Drive or Yandex Disk:
logs.zip (contains the main model weights; not yet available)
logs_s2.zip (contains the stage 2 model weights)
repos.zip (contains the dependency repositories and their weights)
Extract Files
Extract all the downloaded zip files into the root directory of the project:
unzip logs.zip -d ./
unzip logs_s2.zip -d ./
unzip repos.zip -d ./
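Since logs.zip is listed as not yet available, running the commands above as-is will fail on that file. One way to handle this, sketched below under the assumption that the archives sit in the project root, is to extract only the archives that are present. The `extract_archives` function is an illustrative helper, not part of the repository.

```shell
# Sketch: extract whichever of the given archives exist in the current directory.
extract_archives() {
    for archive in "$@"; do
        if [ -f "$archive" ]; then
            # -q keeps unzip quiet; stop at the first archive that fails.
            unzip -q "$archive" -d ./ || return 1
        else
            echo "skipping $archive (not found)"
        fi
    done
}

# Usage, from the project root:
# extract_archives logs.zip logs_s2.zip repos.zip
```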
Download and Extract Loss Models
Navigate to the losses directory and download the following files:
cd losses
loss_model_weights.zip
gaze_models.zip
Extract them within the same losses directory:
unzip loss_model_weights.zip -d ./
unzip gaze_models.zip -d ./
Instructions on how to run the code, train models, and perform inference will be added here.
This repository is primarily intended for demonstration purposes, allowing enthusiasts to explore the network architecture and training procedures in detail. The primary author is not currently affiliated with academia and may not have the capacity to actively maintain this repository. Community contributions and support are highly encouraged.
A significant factor contributing to the success and quality of the results is the dataset used for training. The original model was trained on a high-quality (HQ) version of the VoxCeleb2 dataset, which is no longer publicly available. However, there are now newer datasets of higher quality and larger scale. Utilizing these can potentially yield even better results, as seen in recent methods that build upon ideas presented in the MegaPortraits paper.
Our FEED dataset (link), introduced in our paper, was instrumental in incorporating asymmetric and extreme emotions into the latent emotion space. We encourage the community to actively use and expand upon this dataset. Given that the final version is slightly smaller (due to some participants withdrawing consent), supplementing it with other datasets containing extreme emotions (e.g., NeRSemble) can enhance model performance, especially when attempting to replicate or improve upon the techniques presented in EMOPortraits.
We are providing a version of the pre-trained model weights (located in logs.zip):
This model will be retrained using the same parameters as described in our paper but with 17 IDs in the FEED dataset instead of the original 23. Since the FEED dataset samples were used 25% of the time during training, this change might slightly affect performance in intensive tests.
Please refer to notebooks/E_emo_infer_video.ipynb
We extend our gratitude to all contributors and participants who made this project possible. Special thanks to the developers of the datasets and tools that were instrumental in our research.
This project is licensed under the Creative Commons BY-NC-SA 4.0 license. You are free to use, modify, and distribute this work non-commercially, as long as appropriate credit is given and any derivative works are licensed under identical terms.