galliot-us / smart-social-distancing

A social distancing detector that uses deep learning and can run on edge AI devices such as NVIDIA Jetson, Google Coral, and more.
https://neuralet.com
Apache License 2.0

Face-Mask Detector/Classifier #56

Closed mrn-mln closed 3 years ago

mrn-mln commented 4 years ago

Train a face-mask classifier/detector model

mrn-mln commented 4 years ago

https://tryolabs.com/blog/2020/07/09/face-mask-detection-in-street-camera-video-streams-using-ai-behind-the-curtain/

mrn-mln commented 4 years ago

I tried several approaches to the face mask detection problem. Here is an overview of the strategies I tested. The process has two stages: 1- Detection: detect faces in CCTV video. 2- Classification: classify each detected face as either face or face mask.
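The two-stage flow can be sketched roughly as below. This is only an illustration: `detect_and_classify`, `crop`, and the two callables `detect_faces` and `classify_face` are hypothetical names, not functions from this repository, and a frame is modelled as a plain list of pixel rows so the snippet has no dependencies.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def crop(frame: Sequence[Sequence[object]], box: Box) -> List[Sequence[object]]:
    """Cut the region described by box out of a frame stored as rows of pixels."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def detect_and_classify(
    frame: Sequence[Sequence[object]],
    detect_faces: Callable[[Sequence[Sequence[object]]], List[Box]],
    classify_face: Callable[[List[Sequence[object]]], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds face boxes; stage 2 labels each cropped face."""
    results = []
    for box in detect_faces(frame):
        label = classify_face(crop(frame, box))
        results.append((box, label))
    return results
```

Keeping the detector and the classifier behind separate callables makes it easy to swap one stage (for example, a different face detector) without touching the other.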

Detection

To detect faces in CCTV video, I tried several solutions.

First, I used SSD-MobileNet V1 and V2, trained on the WIDER FACE dataset. The trained model achieved ~0.7 mAP on the WIDER FACE validation set, but when I applied it to a real-world scenario (CCTV videos), it failed to generalize and missed most of the faces in the majority of frames. Augmentation and fine-tuning did not help the SSD-MobileNet model in this case.

Second, I fine-tuned the Tiny Face detector model. It achieved good results at detecting tiny faces in video frames, but its performance on CCTV video was still not good. It was better than the previous model; however, most faces were still missed.

Finally, after trying different options, I used the Openpifpaf pose estimator. This model worked well on CCTV videos. I extracted faces from the Openpifpaf key points.
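One possible way to turn pose key points into a face crop is sketched below. It assumes the pose estimator returns key points in the COCO order (nose, left eye, right eye, left ear, right ear first), each as (x, y, confidence). The function name `face_box_from_keypoints`, the confidence threshold, and the margin are illustrative assumptions, not the values used in this work.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels

# Assumption: COCO keypoint order, where indices 0-4 are the head points
# (nose, left eye, right eye, left ear, right ear).
NUM_HEAD_KEYPOINTS = 5


def face_box_from_keypoints(
    keypoints: Sequence[Sequence[float]],
    image_size: Tuple[int, int],
    min_conf: float = 0.2,   # hypothetical threshold
    margin: float = 0.5,     # hypothetical padding, as a fraction of the span
) -> Optional[Box]:
    """Return a square box around the visible head key points, or None."""
    width, height = image_size
    points = [(x, y) for x, y, c in keypoints[:NUM_HEAD_KEYPOINTS] if c >= min_conf]
    if len(points) < 2:
        return None  # not enough evidence to place a face box

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    center_x = (min(xs) + max(xs)) / 2
    center_y = (min(ys) + max(ys)) / 2

    # Head points spread mostly horizontally, so use the larger span
    # and pad it on every side to cover forehead and chin.
    half = max(max(xs) - min(xs), max(ys) - min(ys)) * (1 + 2 * margin) / 2

    x0 = max(0, int(center_x - half))
    y0 = max(0, int(center_y - half))
    x1 = min(width, int(center_x + half))
    y1 = min(height, int(center_y + half))
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1, y1)
```

The returned box is clipped to the image, so it can be used directly to crop the frame before passing the face to the classifier.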

Classification

The first challenge in creating a classifier was the dataset. Currently, there is almost no dataset available for this task. Inspired by a post from Adrian Rosebrock, I started synthesizing masked faces from the CelebA dataset. Around 100k images (~50k per class) were used to train a MobileNet classifier. After training, the model achieved ~90% accuracy on the validation set, but it failed to generalize to CCTV data and was not robust in most cases.

To build a model that generalizes, I created a dataset from CCTV videos collected from EarthCam. Openpifpaf was used to detect faces in the videos. The cropped faces were saved and manually annotated with two labels, face and face mask. Again, a MobileNet model pre-trained on ImageNet was trained on the new dataset. The model achieved ~90% accuracy on the validation set, and when I deployed it on CCTV video the results were more robust.

However, the number of parameters in the MobileNet model seemed high for this task. I resized the cropped images in the dataset to 45×45 and implemented a new custom Convolutional Neural Network with ~300k parameters. The custom CNN worked as well as MobileNet and achieved ~89% accuracy on the validation set.
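To give a feel for what a ~300k-parameter network on 45×45 inputs might look like, here is a small parameter-count calculation. The layer layout (three 3×3 convolutions with 32, 64, and 128 filters, each followed by 2×2 pooling, then a 64-unit dense layer and a 2-class output) is a hypothetical example chosen to land near that size. It is not the architecture actually used here.

```python
def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a kernel x kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_features: int, units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return (in_features + 1) * units


def count_params(
    input_size: int = 45,
    channels: int = 3,
    conv_filters=(32, 64, 128),  # hypothetical layout
    dense_units: int = 64,       # hypothetical layout
    num_classes: int = 2,        # face / face mask
) -> int:
    total = 0
    size, in_channels = input_size, channels
    for filters in conv_filters:
        # 3x3 convolution with "same" padding keeps the spatial size ...
        total += conv2d_params(3, in_channels, filters)
        # ... and 2x2 max pooling halves it (rounding down): 45 -> 22 -> 11 -> 5
        size //= 2
        in_channels = filters
    flattened = size * size * in_channels
    total += dense_params(flattened, dense_units)
    total += dense_params(dense_units, num_classes)
    return total


print(count_params())  # 298242
```

Most of the parameters in this layout sit in the first dense layer (3200 × 64 weights), so the small input resolution is what keeps the model compact.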

It seems that using Openpifpaf as a detector and training a custom CNN as a classifier works well. The models are trained to work with videos from CCTV cameras and highly depend on the video environment, i.e., for some environments, the model must be trained with labeled images from the same environment to work well. I will send a pull request with my work on this task.

mrn-mln commented 3 years ago

The Tiny Face detector has been added to the face detectors in the face mask application (PRs: Model, Detector engine). Please leave your comments on my PRs.

renzodgc commented 3 years ago

Hey, I'm doing a bit of cleanup in these issues.

I'll be closing this one as Facemask detection has already been integrated. Great work btw! :smiley: