
[WACV 2024] LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis
https://boese0601.github.io/libreface/

LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis

Di Chang, Yufeng Yin, Zongjian Li, Minh Tran, Mohammad Soleymani
Institute for Creative Technologies, University of Southern California WACV 2024
arXiv | Project page

Introduction

This is the official implementation of our WACV 2024 Application Track paper: LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis. LibreFace is an open-source, comprehensive toolkit for accurate and real-time facial expression analysis, with both CPU-only and GPU-accelerated versions. LibreFace closes the gap between cutting-edge research and an easy, free-to-use non-commercial toolbox. We propose to adaptively pre-train the vision encoders on various face datasets and then distill them into a lightweight ResNet-18 model via feature-wise matching. We conduct extensive pre-training and distillation experiments to demonstrate that our proposed pipeline achieves results comparable to state-of-the-art works while maintaining real-time efficiency. The LibreFace system runs cross-platform, and the code is open-sourced in C# (model inference and checkpoints) and Python (model training, inference, and checkpoints).

Getting started with Python installation

Dependencies

Installation

You can first create a new Python 3.8 environment using conda and then install this package using pip from PyPI:

conda create -n libreface_env python=3.8
conda activate libreface_env
pip install --upgrade libreface

Usage

Using the command line

You can use this package through the command line using the following command.

libreface --input_path="path/to/your_image_or_video"

Note that the above command saves the results in a CSV file at the default location, sample_results.csv. If you want to specify your own path, use the --output_path command line argument,

libreface --input_path="path/to/your_image_or_video" --output_path="path/to/save_results.csv"

To change the device used for inference, use the --device command line argument,

libreface --input_path="path/to/your_image_or_video" --device="cuda:0"

libreface saves intermediate files to a temporary directory, which defaults to ./tmp. To change the temporary directory path, use the --temp command line argument,

libreface --input_path="path/to/your_image_or_video" --temp="your/temp/path"

For video inference, our code processes the frames of your video in batches. You can specify the batch size and the number of workers for data loading as follows,

libreface --input_path="path/to/your_video" --batch_size=256 --num_workers=2 --device="cuda:0"

Note that by default, the --batch_size argument is 256 and the --num_workers argument is 2. You can increase or decrease these values according to your machine's capacity.

Examples

Download a sample image from our GitHub repository. To get the facial attributes for this image and save them to a CSV file, simply run,

libreface --input_path="sample_disfa.png"

Download a sample video from our GitHub repository. To run inference on this video using a GPU and save the results to my_custom_file.csv, run the following command,

libreface --input_path="sample_disfa.avi" --output_path="my_custom_file.csv" --device="cuda:0"

Note that for videos, each row in the saved CSV file corresponds to an individual frame of the given video.
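If you want to post-process the per-frame results, the saved CSV can be loaded with pandas. The snippet below is only a sketch; the exact column names depend on your libreface version, so check the header of your own output file.

import pandas as pd

# load the per-frame results saved by libreface; the column names
# (AU intensities, detected AUs, expression, etc.) depend on the version,
# so inspect them before relying on specific names
results = pd.read_csv("my_custom_file.csv")
print(results.shape)          # one row per processed video frame
print(list(results.columns))
print(results.head())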

Using Python script

Here’s how to use this package in your Python scripts.

To assign the results to a Python variable,

import libreface 
detected_attributes = libreface.get_facial_attributes(image_or_video_path)
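The structure of the returned object is not pinned down here, so treat the following as a sketch for inspecting it rather than documented behavior:

import libreface

detected_attributes = libreface.get_facial_attributes("sample_disfa.png")
# the concrete return type (e.g. dict vs. DataFrame) may differ across
# versions; print it first and inspect the fields before using them
print(type(detected_attributes))
print(detected_attributes)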

To save the results to a csv file, use the output_save_path parameter,

import libreface 
libreface.get_facial_attributes(image_or_video_path,
                                output_save_path = "your_save_path.csv")

To change the device used for inference, use the device parameter,


import libreface 
libreface.get_facial_attributes(image_or_video_path,
                                device = "cuda:0") # can be "cpu" or "cuda:0", "cuda:1", ...

libreface saves intermediate files to a temporary directory, which defaults to ./tmp. To change the temporary directory path, use the temp_dir parameter,

import libreface 
libreface.get_facial_attributes(image_or_video_path,
                                temp_dir = "your/temp/path")

For video inference, our code processes the frames of your video in batches. You can specify the batch size and the number of workers for data loading as follows,

import libreface 
libreface.get_facial_attributes(video_path,
                                batch_size = 256,
                                num_workers = 2)

Note that by default, the batch_size is 256, and num_workers is 2. You can increase or decrease these values according to your machine's capacity.

Model weights are automatically downloaded to the ./libreface_weights/ directory. If you want to download and save the weights to a different directory, specify the parent folder for the weights using the weights_download_dir parameter as follows,

import libreface 
libreface.get_facial_attributes(image_or_video_path,
                                weights_download_dir = "your/directory/path")
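These parameters can also be combined in a single call. The example below is a sketch that puts together the options documented above; all paths are placeholders.

import libreface

# combined sketch using the parameters documented above; replace the
# placeholder paths with your own
libreface.get_facial_attributes("path/to/your_video.avi",
                                output_save_path = "results.csv",
                                device = "cuda:0",          # or "cpu"
                                temp_dir = "tmp_libreface",
                                batch_size = 256,
                                num_workers = 2,
                                weights_download_dir = "libreface_weights")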

Getting Started with Derivative Tools (New 2.0 Models Available! Recommended)

In addition to the PyTorch code, we offer several derivative tools on the .NET platform to make it easier to integrate LibreFace into various systems. These tools are based on ONNX weights exported from the model weights of this project.

A screenshot of the LibreFace console application, showing its built-in documentation.
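If you prefer to consume the exported ONNX weights directly from Python instead of the .NET tools, a minimal onnxruntime sketch is shown below. The model file name, input name, and input shape are assumptions and depend on the specific exported model you download.

import numpy as np
import onnxruntime as ort

# hypothetical file name -- replace with the actual exported LibreFace ONNX model
session = ort.InferenceSession("libreface_model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
# dummy aligned face crop; the expected input shape depends on the exported model
face = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: face})
print([o.shape for o in outputs])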

Training models using Python

Clone repo:

git clone https://github.com/ihp-lab/LibreFace.git
cd LibreFace

The code is tested with Python 3.7, PyTorch 1.10.1, and CUDA 11.3 on an NVIDIA GeForce RTX 3090. We recommend using Anaconda to manage dependencies. You may need to change the torch and CUDA versions in requirements.txt according to your machine.

conda create -n libreface python=3.7
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
conda activate libreface
pip install -r requirements.txt
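As an optional sanity check of the environment (not part of the repo), you can confirm that the installed PyTorch build can see your GPU:

import torch

print(torch.__version__)          # should match the version installed above
print(torch.cuda.is_available())  # True if the CUDA build can access the GPU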

Facial Landmark/Mesh Detection and Alignment

As described in our paper, we first pre-process the input image with MediaPipe to obtain facial landmarks and a face mesh. The detected landmarks are used to calculate the positions of the eyes and mouth in the image. We then use these positions to align the images and center the face area.

To process the images, simply run the following command:

python detect_mediapipe.py
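For orientation, the core of this step looks roughly like the sketch below; the actual detect_mediapipe.py in the repo additionally handles batching over the dataset, alignment, and saving the landmarks/ and aligned_images/ folders.

import cv2
import mediapipe as mp
import numpy as np

# illustrative sketch of the MediaPipe Face Mesh call only --
# use detect_mediapipe.py to actually preprocess the datasets
def get_face_landmarks(image_path):
    image = cv2.imread(image_path)
    h, w = image.shape[:2]
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1) as face_mesh:
        result = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return None  # no face detected
    # convert normalized mesh coordinates to pixel coordinates
    return np.array([(lm.x * w, lm.y * h) for lm in result.multi_face_landmarks[0].landmark])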

AU Intensity Estimation

DISFA

Download the DISFA dataset from the official website here. Note that the dataset is originally provided as video sequences; you need to manually extract them into image frames.

Download the original videos provided by DISFA. Extract them and put them under the folder data/DISFA.

Preprocess the images with the MediaPipe script above, and you should get a dataset folder like the one below:

data
├── DISFA
│   ├── images
│   ├── landmarks
│   └── aligned_images
├── BP4D
├── AffectNet
└── RAF-DB

Training/Inference

cd AU_Recognition
bash train.sh
bash inference.sh

AU Detection

BP4D

Download the BP4D dataset from the official website. Extract it and put it under the folder data/BP4D.

Preprocess the images with the MediaPipe script above, and you should get a dataset folder like the one below:

data
├── DISFA
│   ├── images
│   ├── landmarks
│   └── aligned_images
├── BP4D
│   ├── images
│   ├── landmarks
│   └── aligned_images
├── AffectNet
└── RAF-DB

Training/Inference

cd AU_Detection
bash train.sh
bash inference.sh

Facial Expression Recognition

AffectNet

Download the AffectNet dataset from the official website. Extract it and put it under the folder data/AffectNet.

Preprocess the images with the MediaPipe script above, and you should get a dataset folder like the one below:

data
├── DISFA
│   ├── images
│   ├── landmarks
│   └── aligned_images
├── BP4D
│   ├── images
│   ├── landmarks
│   └── aligned_images
├── AffectNet
│   ├── images
│   ├── landmarks
│   └── aligned_images
└── RAF-DB

Training/Inference

cd Facial_Expression_Recognition
bash train.sh
bash inference.sh

Configure

There are several flag options at the beginning of each training/inference file. Key options are explained below; the rest are self-explanatory in the code. Before running our code, you may need to change device, data_root, ckpt_path, data, and fold.

Results and Accuracy

We performed a variety of evaluations across several demographic axes to observe how our model performs on different groups of people.

FACES Accuracy

This is the accuracy of the model on the FACES dataset, filtered to include only images that were deemed correctly labeled by a majority of human raters. This dataset features a variety of ages, and the evaluation particularly highlights the model's performance on elderly people.

Faces Accuracy Results

CAFE Accuracy

This is the accuracy of the model on the CAFE dataset. This dataset features faces of children across a variety of racial groups. The children in the CAFE dataset are notably younger than those in the FACES dataset, so the evaluation reflects the model's performance on ages 32.5 months to 8.7 years and across different racial groups. AA = African American, AS = Asian, EA = European American, LA = Latino, PI = Pacific Islander, SA = South Asian.

The two columns at the bottom compare human raters on this entire dataset against the performance of our model across each emotion.

Cafe Accuracy Results

TODO:

Package

Validation

License

Our code is distributed under the USC research license. See LICENSE.txt file for more information.

Citation

@inproceedings{chang2023libreface,
      title     = {LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis},
      author    = {Di Chang and Yufeng Yin and Zongjian Li and Minh Tran and Mohammad Soleymani},
      booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
      month     = {January},
      year      = {2024},
      note      = {To appear}}

Contact

If you have any questions, please raise an issue or email Di Chang (dchang@ict.usc.edu or dichang@usc.edu). For issues related to the Python package, write to Ashutosh Chaubey (achaubey@usc.edu or achaubey@ict.usc.edu).

Acknowledgments

Our code builds on several awesome repositories. We appreciate them for making their code available to the public.