Copyright (C) 2021 - SecurifAI
This package contains free data and software: you can use, redistribute and/or modify it under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
This data set and software are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
The complete license agreement can be consulted at: CC BY-NC-SA 4.0.
Please cite the corresponding work (see the citation.bib file for the citation in BibTeX format) if you use this data set and software (or a modified version of it) in any scientific work:
[1] Tudor Mare, Georgian Duță, Mariana-Iuliana Georgescu, Adrian Șandru, Bogdan Alexe, Marius Popescu, Radu Tudor Ionescu. A realistic approach to generate masked faces applied on two novel masked face recognition data sets. In Proceedings of NeurIPS, 2021 (link to ArXiv version).
The COVID-19 pandemic raises the problem of adapting face recognition systems to the new reality, where people may wear surgical masks to cover their noses and mouths. Traditional data sets (e.g., CelebA, CASIA-WebFace) used for training these systems were released before the pandemic, so they now seem unsuited because they lack examples of people wearing masks. We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images. Our method relies on Spark AR Studio, a developer program made by Facebook that is used to create Instagram face filters. In our approach, we use 9 masks of different colors, shapes and fabrics. We use our method to generate 445,446 masks for the CASIA-WebFace data set (90% of its samples) and 196,254 masks for the CelebA data set (96.8% of its samples).
Our repository contains:
The masks for the CelebA data set are available for download at:
The masks for the CASIA-WebFace data set are available for download at:
For convenience, we provide Python scripts to apply the masks on the original CelebA and CASIA-WebFace images.
To run the script on the CelebA / CASIA-WebFace data set, extract the respective archive in the same folder as the CelebA / CASIA-WebFace main data set folder. Inside each script there is a celeba_folder / casia_folder parameter and a masks_folder parameter which have to be set accordingly. The output of the script will be located in the masked_celeba or masked_casia folder, respectively.
Make sure to install the packages listed in requirements.txt before running the scripts.
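Because masks were generated for only part of each data set, it can be useful to check how many original images have a matching mask before running the scripts. The following is a minimal sketch, not part of the provided scripts. It assumes that each mask file has the same relative path and file name (ignoring the extension) as its original image, which may not match the actual layout of the archives. The helper names relative_stems and mask_coverage are hypothetical.

```python
from pathlib import Path

# Assumption: images and masks use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def relative_stems(folder):
    """Collect image paths under `folder`, relative to it and without extension."""
    root = Path(folder)
    return {
        p.relative_to(root).with_suffix("").as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }


def mask_coverage(data_folder, masks_folder):
    """Return (images with a mask, total images, fraction covered)."""
    images = relative_stems(data_folder)
    masks = relative_stems(masks_folder)
    covered = images & masks
    fraction = len(covered) / len(images) if images else 0.0
    return len(covered), len(images), fraction


if __name__ == "__main__":
    # Same meaning as the celeba_folder / masks_folder parameters in the scripts.
    celeba_folder = "celeba"
    masks_folder = "celeba_masks"
    if Path(celeba_folder).is_dir() and Path(masks_folder).is_dir():
        covered, total, fraction = mask_coverage(celeba_folder, masks_folder)
        print(f"{covered} of {total} images have a mask ({fraction:.1%})")
```

If the assumption about file naming does not hold for the downloaded archives, the matching rule in relative_stems has to be adapted.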
For the CelebA data set, make sure you have the following folder structure on your machine:
main directory
│   apply_masks_celeba.py
└───celeba_masks - masks folder (from this repo)
│   │   ...
└───celeba - data set folder (original images)
│   │   ...
└───masked_celeba - output folder (images with overlaid masks)
    │   ...
Then use the following command:
>> python apply_masks_celeba.py
For the CASIA-WebFace data set, make sure you have the following folder structure on your machine:
main directory
│   apply_masks_casia.py
└───casia_masks - masks folder (from this repo)
│   │   ...
└───casia - data set folder (original images)
│   │   ...
└───masked_casia - output folder (images with overlaid masks)
    │   ...
Then use the following command:
>> python apply_masks_casia.py
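The folder structures above can be verified before launching either script. The sketch below is an optional check under stated assumptions, not part of the repository. It looks only for the script, the masks folder and the data set folder, using the names shown in the two trees. The output folder is not checked, since we make no assumption here about whether it must exist beforehand. The names LAYOUTS and missing_entries are hypothetical.

```python
from pathlib import Path

# Expected entries in the main directory, taken from the folder trees above:
# (script, masks folder, data set folder).
LAYOUTS = {
    "celeba": ("apply_masks_celeba.py", "celeba_masks", "celeba"),
    "casia": ("apply_masks_casia.py", "casia_masks", "casia"),
}


def missing_entries(main_dir, dataset):
    """Return the names of expected files or folders that are not present."""
    script, masks_folder, data_folder = LAYOUTS[dataset]
    root = Path(main_dir)
    missing = []
    if not (root / script).is_file():
        missing.append(script)
    for folder in (masks_folder, data_folder):
        if not (root / folder).is_dir():
            missing.append(folder)
    return missing


if __name__ == "__main__":
    # Run from the main directory to report the state of both layouts.
    for name in LAYOUTS:
        missing = missing_entries(".", name)
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list from missing_entries means the inputs for that data set are in place and the corresponding command can be run.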
We are happy to hear your feedback and suggestions at: tudor[dot]mare{at}securif(dot)ai