This repository provides a neat package to efficiently train and test state-of-the-art face recognition models.
Partial-FC training is supported for all methods.
If you find this repo helpful, please cite our paper:
```
@article{wu2024identity,
  title={Identity Overlap Between Face Recognition Train/Test Data: Causing Optimistic Bias in Accuracy Measurement},
  author={Wu, Haiyu and Tian, Sicong and Gutierrez, Jacob and Bhatta, Aman and {\"O}zt{\"u}rk, Ka{\u{g}}an and Bowyer, Kevin W},
  journal={arXiv preprint arXiv:2405.09403},
  year={2024}
}
```
I suggest using Anaconda to better manage your environments:
```
conda create -n fr_training python=3.8
conda install -n fr_training pytorch==1.12.0 torchvision==0.13.0 cudatoolkit=11.3 -c pytorch
conda activate fr_training
```
Then clone the repository and use pip to install the dependencies:
```
git clone https://github.com/HaiyuWu/SOTA-FR-train-and-test.git
cd ./SOTA-FR-train-and-test
pip install -r requirements.txt
```
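To confirm the environment is set up correctly before preparing any data, a minimal sanity check (not part of the repository) is:

```python
# Minimal environment sanity check (not part of the repository).
import torch
import torchvision

print(torch.__version__)           # expected: 1.12.0
print(torchvision.__version__)     # expected: 0.13.0
print(torch.cuda.is_available())   # should print True if CUDA 11.3 is visible
```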
Hadrian and Eclipse are face recognition test sets focused on facial hairstyles and face exposure levels, respectively. You can download both datasets via GDrive. To get the password, you need to fill out this form; then follow Test sets to prepare the datasets. If Hadrian and Eclipse help any of your projects, please cite the following reference:
```
@article{GoldilocksFRTestSet2024,
  title={What is a Goldilocks Face Verification Test Set?},
  author={Wu, Haiyu and Tian, Sicong and Bhatta, Aman and Gutierrez, Jacob and Bezold, Grace and Argueta, Genesis and Ricanek Jr., Karl and King, Michael C. and Bowyer, Kevin W.},
  year={2024}
}
```
The data used in the Hadrian and Eclipse datasets is based entirely on the commercial version of MORPH5. We sincerely appreciate the invaluable support from Prof. Karl Ricanek Jr. and the University of North Carolina Wilmington (UNCW). UNCW has granted permission to use the images in Hadrian and Eclipse FREE for research purposes. You can get the full academic and commercial MORPH datasets at the official webpage.
If you have any problems using Hadrian and Eclipse, please contact: cvrl@nd.edu
If you just want to use the .bin version, use xz2bin.py to convert.
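Assuming xz2bin.py produces standard InsightFace-style .bin files (a pickled tuple of encoded images and same/different labels), the result can be inspected with a short sketch like the one below; the exact layout is an assumption, so adjust as needed:

```python
# Sketch: inspect an InsightFace-style .bin verification file.
# Assumes a pickled (encoded_images, issame_list) tuple; the exact layout may differ.
import pickle

import cv2
import numpy as np

with open("./test_sets/hadrian.bin", "rb") as f:   # hypothetical path
    bins, issame_list = pickle.load(f, encoding="bytes")

print(len(bins), "images,", len(issame_list), "pairs")

# Decode the first encoded image into an HxWx3 array.
img = cv2.imdecode(np.frombuffer(bins[0], dtype=np.uint8), cv2.IMREAD_COLOR)
print(img.shape)
```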
You can directly download the compressed MS1MV2, WebFace4M, and Glint360K. Extract them in the datasets folder and they are ready to use.
All the other training sets can be found at insightface. After extracting a training set with rec2image.py, use file_path_extractor.py to gather its image paths. Then run imagelist2lmdb.py to finish the training set preparation:
```
python3 utils/imagelist2lmdb.py \
  --image_list file/of/extracted/image/paths \
  --destination ./datasets \
  --file_name dataset/name
```
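Conceptually, the conversion walks the image list and stores each encoded image in an LMDB database. The sketch below is only an illustration of that idea, with assumed key names and an identity label taken from the parent folder; utils/imagelist2lmdb.py is the authoritative implementation:

```python
# Illustrative sketch of an image-list -> LMDB conversion (assumed key layout,
# not the repository's actual format); use utils/imagelist2lmdb.py for real runs.
import lmdb

image_list = "train_image_paths.txt"                 # one image path per line
env = lmdb.open("./datasets/my_dataset", map_size=1 << 40)

with open(image_list) as f:
    paths = [line.strip() for line in f if line.strip()]

with env.begin(write=True) as txn:
    for i, path in enumerate(paths):
        with open(path, "rb") as img_f:
            txn.put(f"image-{i}".encode(), img_f.read())              # encoded image bytes
        txn.put(f"label-{i}".encode(), path.split("/")[-2].encode())  # identity = parent folder
    txn.put(b"num-samples", str(len(paths)).encode())

env.close()
```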
We also support training the model from a .txt file. Use file_path_extractor.py to get all the image paths, then replace the path in the config file:
```
config.train_source = "path/to/.txt/file"
```
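If you prefer not to use file_path_extractor.py, an equivalent list (one image path per line) can be built with a few lines of Python; the dataset root and folder layout below are only examples:

```python
# Build a plain-text list of training image paths (one per line).
# Assumes a <root>/<identity>/<image>.jpg layout; adjust the pattern to your data.
from pathlib import Path

root = Path("/data/webface4m")   # hypothetical dataset root
paths = sorted(str(p) for p in root.rglob("*.jpg"))

with open("train_image_paths.txt", "w") as f:
    f.write("\n".join(paths))

print(f"wrote {len(paths)} image paths")
```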
LFW, CFP-FP, CALFW, CPLFW, and AgeDB-30 can be downloaded here. Extract the compressed file, then simply run prepare_test_images.py to get the datasets ready for testing:
```
python3 utils/prepare_test_images.py \
  --xz_folder folder/contains/xz/files \
  --destination ./test_set_package_5 \
  --datasets lfw cfp_fp agedb_30 calfw cplfw
```
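To verify the preparation step, you can simply list what ended up in the destination folder (the folder name matches the --destination argument above):

```python
# List the prepared test sets in the destination folder.
from pathlib import Path

for entry in sorted(Path("./test_set_package_5").iterdir()):
    print(entry.name)
```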
After finishing the training and test set preparation, you can train your own model by:
```
torchrun --nproc_per_node=4 train.py --config_file ./configs/arcface_r100.py
```
You can change the settings in configs.
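As an illustration of the kind of settings a config file carries, the sketch below sets a few plausible fields on a plain namespace object. Apart from train_source (shown earlier), every field name here is an assumption; consult configs/arcface_r100.py for the real options:

```python
# Illustrative only -- not a config shipped with this repository.
# Every field name except train_source is an assumption.
from types import SimpleNamespace

config = SimpleNamespace()
config.train_source = "./datasets/ms1mv2"   # LMDB dataset or a .txt image list
config.batch_size = 128                     # assumed field name
config.lr = 0.1                             # assumed field name
config.val_list = ["lfw", "cfp_fp", "agedb_30", "calfw", "cplfw"]  # assumed field name
```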
Note that AdaFace and MagFace use the BGR channel order to train the model, but this framework consistently uses RGB. Also, MagFace normalizes the images with `mean=0.` and `std=1.`, but this framework uses `mean=0.5` and `std=0.5` to train and test all methods.
If you want training to align with the original GitHub repositories, you can change the data_loader_train_lmdb.py file.
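For reference, the difference between the two preprocessing conventions boils down to the channel order and the normalization constants. A hedged torchvision sketch (not the repository's exact transform, which lives in data_loader_train_lmdb.py):

```python
# Sketch of the two preprocessing variants discussed above (illustrative only).
from torchvision import transforms

# This framework: RGB input normalized with mean=0.5, std=0.5 for every method.
framework_transform = transforms.Compose([
    transforms.ToTensor(),                                # HWC uint8 RGB -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # -> values in [-1, 1]
])

# Original MagFace: BGR channel order with mean=0, std=1 (i.e. raw [0, 1] values).
magface_original_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x[[2, 1, 0], ...]),       # flip RGB -> BGR
])
```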
For CosFace, SphereFace, ArcFace, CurricularFace, and UniFace, add the `--add_flip` option when testing. For AdaFace, add the `--add_norm` option when testing:
```
python3 test.py \
  --model_path path/of/the/weights \
  --model iresnet \
  --depth 100 \
  --val_list lfw cfp_fp agedb_30 calfw cplfw \
  --val_source ./test_sets
```
The trained weights can be downloaded from the model zoo.
Use file_path_extractor.py to collect the paths of the target images, then run the following command to extract the features:
```
python3 feature_extractor.py \
  --model_path path/of/the/weights \
  --model iresnet \
  --depth 100 \
  --image_paths image/path/file \
  --destination feature/destination
```
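Once features are extracted, comparing two faces is a cosine similarity between their vectors. The snippet below assumes the features were written as .npy files, which may differ from feature_extractor.py's actual output format:

```python
# Compare two extracted face features with cosine similarity.
# Assumes .npy files; check feature_extractor.py for the actual output format.
import numpy as np

feat_a = np.load("features/img_a.npy").ravel()   # hypothetical file names
feat_b = np.load("features/img_b.npy").ravel()

cos_sim = float(np.dot(feat_a, feat_b) / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b)))
print(f"cosine similarity: {cos_sim:.4f}")       # higher means more likely the same identity
```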
We currently support testing for ArcFace (CVPR19), CurricularFace (CVPR20), MagFace (CVPR21), AdaFace (CVPR22), and TransFace (ICCV23).
Test a pre-trained model (e.g., ArcFace-R100) on LFW, CFP-FP, CALFW, CPLFW, and AgeDB-30 by:
```
cd ./sota_test
python3 arcface_test.py \
  --model_path path/of/pre-trained/model \
  --net_mode ir \
  --depth 100 \
  --batch_size 512 \
  --val_list lfw cfp_fp agedb_30 calfw cplfw \
  --val_source ../test_set_package_5
```
### Acknowledgement
Thanks to [InsightFace](https://github.com/deepinsight/insightface/tree/master) for its valuable contributions to the face recognition field!
## Publications
[1] Identity Overlap Between Face Recognition Train/Test Data: Causing Optimistic Bias in Accuracy Measurement
[2] What is a Goldilocks Face Verification Test Set?
[3] Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
## TODO list
Functions:
- [ ] resume training
Methods:
- [ ] Circle loss
## License
[MIT license](./license.md)