hollance / BlazeFace-PyTorch

The BlazeFace face detector model implemented in PyTorch

BlazeFace in Python

BlazeFace is a fast, lightweight face detector from Google Research. For details, see the paper on arXiv.

A pretrained model is available as part of Google's MediaPipe framework.

Besides a bounding box, BlazeFace also predicts 6 keypoints for face landmarks (2x eyes, 2x ears, nose, mouth).

Because BlazeFace is designed for use on mobile devices, the pretrained model is in TFLite format. However, I wanted to use it from PyTorch and so I converted it.

NOTE: The MediaPipe model is slightly different from the model described in the BlazeFace paper. It uses depthwise convolutions with a 3x3 kernel, not 5x5. And it only uses "single" BlazeBlocks, not "double" ones.
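To get a rough feel for what the smaller kernel changes, the sketch below counts the weights in one depthwise-separable convolution: a depthwise k x k convolution followed by a 1x1 pointwise convolution. The function name and the channel count of 24 are illustrative assumptions, not values read from the MediaPipe model, and biases are ignored.

```python
def separable_conv_weights(channels_in: int, channels_out: int, kernel_size: int) -> int:
    """Count weights in a depthwise k x k conv followed by a 1x1 pointwise conv (no biases)."""
    depthwise = channels_in * kernel_size * kernel_size  # one k x k filter per input channel
    pointwise = channels_in * channels_out                # 1x1 conv that mixes channels
    return depthwise + pointwise


# Illustrative channel count only; not taken from the actual model.
channels = 24
weights_3x3 = separable_conv_weights(channels, channels, kernel_size=3)
weights_5x5 = separable_conv_weights(channels, channels, kernel_size=5)
print(f"3x3: {weights_3x3} weights, 5x5: {weights_5x5} weights")
```

The depthwise part shrinks from 25 to 9 weights per channel, while the pointwise part is unaffected by the kernel size.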

The BlazeFace paper mentions that there are two versions of the model, one for the front-facing camera and one for the back-facing camera. This repo includes only the frontal camera model, as that is the only one I was able to find an official trained version for. The difference between the two models is the dataset they were trained on. As the paper says,

For the frontal camera model, only faces that occupy more than 20% of the image area were considered due to the intended use case (the threshold for the rear-facing camera model was 5%).

This means the included model will not be able to detect faces that are relatively small. It's really intended for selfies, not for general-purpose face detection.
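As a rough sanity check, you can compare the area of a face against the thresholds quoted from the paper. This is only a sketch: the thresholds describe the training data, not a hard cutoff at inference time, and `face_area_fraction` and the constant names are hypothetical helpers, not part of this repo.

```python
def face_area_fraction(box_width: float, box_height: float,
                       image_width: float, image_height: float) -> float:
    """Return the fraction of the image area covered by a face bounding box."""
    return (box_width * box_height) / (image_width * image_height)


FRONT_CAMERA_MIN_FRACTION = 0.20  # training threshold quoted from the paper
BACK_CAMERA_MIN_FRACTION = 0.05   # training threshold quoted from the paper

# A 40x40 face in a 128x128 image covers just under 10% of the area.
fraction = face_area_fraction(40, 40, 128, 128)
likely_in_range = fraction > FRONT_CAMERA_MIN_FRACTION
print(f"{fraction:.1%} of the image, within the training range: {likely_in_range}")
```

In a 128x128 image, 20% of the area corresponds to a square face of roughly 57 pixels per side, so faces much smaller than that fall outside what the frontal model was trained on.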

Inside this repo

Essential files:

Notebooks:

Detections

Each face detection is a PyTorch Tensor consisting of 17 numbers:
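A minimal sketch of unpacking one detection, assuming the 17 numbers are laid out as 4 bounding-box values, then 12 keypoint coordinates (6 keypoints with two coordinates each), then 1 confidence score. That ordering, the `Detection` class, and `unpack_detection` are assumptions made for illustration; check them against the repo before relying on them.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Detection:
    box: Tuple[float, ...]                 # assumed: 4 bounding-box values
    keypoints: List[Tuple[float, float]]   # assumed: 6 coordinate pairs
    score: float                           # assumed: confidence, stored last


def unpack_detection(values: Sequence[float]) -> Detection:
    """Split a flat sequence of 17 numbers into box, keypoints, and score."""
    values = [float(v) for v in values]  # also accepts a 1-D tensor or array
    if len(values) != 17:
        raise ValueError(f"expected 17 numbers, got {len(values)}")
    box = tuple(values[0:4])
    coords = values[4:16]
    keypoints = [(coords[i], coords[i + 1]) for i in range(0, 12, 2)]
    return Detection(box=box, keypoints=keypoints, score=values[16])


# Dummy values, used only to show the slicing.
example = unpack_detection([i / 100 for i in range(17)])
print(example.box, len(example.keypoints), example.score)
```

Because the function converts each element with `float()`, it should work on a plain list as well as on a single row of a detections tensor.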

Image credits

Included for testing are the following images:

These images were scaled down to 128x128 pixels as that is the expected input size of the model.
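Scaling a non-square photo straight to 128x128 distorts the faces in it. One common workaround, shown here as a sketch and not something this repo is stated to do, is to crop the largest centered square first and then resize that crop. `center_square_crop` is a hypothetical helper that only computes the crop box; the resize itself is left to whichever image library you use.

```python
from typing import Tuple

MODEL_INPUT_SIZE = 128  # expected input width and height of the model


def center_square_crop(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


crop_box = center_square_crop(640, 480)
# Resize factor to apply after cropping to reach the model input size.
scale = MODEL_INPUT_SIZE / (crop_box[2] - crop_box[0])
print(crop_box, scale)
```

Keep the crop box and scale around if you need to map detections back onto the original image.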