The idea is to build an application for real-time face detection and recognition using TensorFlow and a notebook's webcam. The model for face prediction should be easy to update online to add new targets.
Just type:
docker run -it --rm -p 5000:5000 btwardow/tf-face-recognition:1.0.0
Then go to https://localhost:5000/ or type it in your browser to get face detection (without recognition for now).
Note: many modern browsers require HTTPS to transfer video outside of localhost without resorting to unsafe browser settings.
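If you need HTTPS while hacking on the server outside of Docker, one option is an ad-hoc self-signed certificate. The snippet below is only a minimal sketch assuming a Flask-style app with pyOpenSSL installed; the actual server.py in this repository may be wired differently.
# Minimal sketch: serve over HTTPS so browsers allow webcam access
# beyond localhost. Assumes Flask and pyOpenSSL; server.py in this
# project may be set up differently.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "face recognition demo"

if __name__ == "__main__":
    # 'adhoc' generates a throwaway self-signed certificate on startup.
    app.run(host="0.0.0.0", port=5000, ssl_context="adhoc")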
In the project's root directory, use the main target from the Makefile:
make
docker run --rm -it -p 5000:5000 -v /$(pwd):/workspace btwardow/tf-face-recognition:dev
This volume mapping is very convenient for development and testing purposes.
To use GPU power, there is a dedicated Dockerfile.gpu.
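Once inside the GPU image, a quick way to confirm that TensorFlow actually sees the GPU (a sketch; the exact call depends on your TensorFlow 1.x version):
# Quick check that TensorFlow can see the GPU inside the container.
import tensorflow as tf
print("GPU available:", tf.test.is_gpu_available())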
Running the application without Docker is useful for development. Below is a quick how-to for *nix environments.
Create a virtual environment (with Conda) and install the requirements:
conda create -y -n face_recognition_36 python=3.6
source activate face_recognition_36
pip install -r requirements_dev.txt
Download the pre-trained models:
mkdir ~/pretrained_models
cp docker/download*.py ~/pretrained_models
cd ~/pretrained_models
python download.py
python download_vggace2.py
The ~/pretrained_models directory should look like this:
(face_recognition_36) b.twardowski@172-16-170-27:~/pretrained_models » tree
.
├── 20180402-114759
│   ├── 20180402-114759.pb
│   ├── model-20180402-114759.ckpt-275.data-00000-of-00001
│   ├── model-20180402-114759.ckpt-275.index
│   └── model-20180402-114759.meta
├── 20180402-114759.zip
├── det1.npy
├── det2.npy
├── det3.npy
├── download.py
└── download_vggace2.py
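As a quick sanity check, the frozen VGGFace2 graph can be loaded directly in TensorFlow 1.x. This is just a sketch based on the directory layout above; paths may differ on your machine.
# Sketch: load the frozen facenet/VGGFace2 model and count its operations.
# Assumes TensorFlow 1.x and the ~/pretrained_models layout shown above.
import os
import tensorflow as tf

MODEL_PB = os.path.expanduser(
    "~/pretrained_models/20180402-114759/20180402-114759.pb")

with tf.Graph().as_default() as graph:
    with tf.gfile.GFile(MODEL_PB, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

print("Loaded graph with", len(graph.get_operations()), "operations")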
Then, to start the server, go to the ./server directory and type:
PYTHONPATH=".." python server.py
Everything should be dockerized and easy to reproduce. This makes things interesting, even for a toy project from the computer vision area. Why?
Why is it hard to grab data from a camera device from Docker? You can read about it here.
The main reason: Docker is not built for such things, so it doesn't make life easier here. A few possibilities are mentioned, like streaming from the host MBP using ffmpeg, or preparing a custom VirtualBox boot2docker.iso image and passing the MBP webcam through. But none of them sound right. All of them require additional effort: installing something from brew or configuring VirtualBox (assuming you already have Docker installed on your OS X).
The good side of having this as a web app is that you can try it out on your mobile phone, which is very convenient for testing and demos!
Face detection finds faces in the video and marks their boundaries. These areas can then be used for the face recognition task. To detect faces, the pre-trained MTCNN network is used.
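For reference, detecting faces in a single frame with the pre-trained MTCNN can look roughly like the sketch below. It assumes the facenet project's align/detect_face.py module is importable and that the det1/det2/det3.npy weights live in ~/pretrained_models; frame.jpg is a placeholder image.
# Sketch: run MTCNN face detection on one frame.
import os
import cv2
import tensorflow as tf
import align.detect_face as detect_face

MODEL_DIR = os.path.expanduser("~/pretrained_models")

with tf.Session() as sess:
    # Build the three cascaded networks (P-Net, R-Net, O-Net).
    pnet, rnet, onet = detect_face.create_mtcnn(sess, MODEL_DIR)

    frame = cv2.imread("frame.jpg")[:, :, ::-1]  # BGR -> RGB
    boxes, _ = detect_face.detect_face(
        frame, 20,          # minsize: smallest face to look for, in pixels
        pnet, rnet, onet,
        [0.6, 0.7, 0.7],    # per-stage confidence thresholds
        0.709)              # image pyramid scaling factor
    for x1, y1, x2, y2, score in boxes:
        print("face at", int(x1), int(y1), int(x2), int(y2), "score", score)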
Face recognition uses embeddings from the VGGFace2 network plus a KNN model implemented in TensorFlow.
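Conceptually, the recognition step boils down to computing an embedding for each detected face and comparing it against the stored examples. The sketch below illustrates the idea with NumPy (the app itself implements the KNN in TensorFlow); the tensor names follow the facenet conventions, and the distance threshold is just an illustrative value.
# Sketch: embed aligned 160x160 face crops and classify them with a
# simple nearest-neighbour rule over previously uploaded examples.
import numpy as np
import tensorflow as tf

def embed(sess, faces):
    # faces: float32 array of shape (n, 160, 160, 3), prewhitened.
    return sess.run(
        "embeddings:0",
        feed_dict={"input:0": faces, "phase_train:0": False})

def knn_predict(query_emb, known_embs, known_labels, k=3, threshold=1.1):
    # Euclidean distance to every stored example embedding.
    dists = np.linalg.norm(known_embs - query_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    if dists[nearest[0]] > threshold:   # too far from everything -> unknown
        return "unknown"
    votes = [known_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)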
In order to get your face recognized, a few examples first have to be provided to the algorithm (currently, at least 10).
When you see the application working and correctly detecting faces, just click the Capture Examples button.
While capturing examples, there has to be a single face in the video!
After 10 examples are collected, we can type the name of the person and upload them to the server.
As a result, we see the current status of classification examples:
And from now on, the new person is recognized. For this example, it's CoverGirl.
If you are interested in the classification, please check out this notebook, which explains in detail how it works (e.g., the threshold for recognition).
You can run the Jupyter notebook from Docker; just type:
docker run --rm -it -p 8888:8888 btwardow/tf-face-recognition:1.0.0 /run_jupyter.sh --allow-root
Many thanks to the creators of the facenet project, which provides pre-trained models for VGGFace2. Great job!