An attempt at an end to end pipeline for facial clustering
Introduction
This is an end to end pipeline for facial clustering of photos (similar to what Apple and Android have). Please look at pipeline/Facial Clustering.ipynb to see the current progress.
Method
There are three main steps in creating an end to end pipeline for the facial clustering of images:
- Detecting and creating the bounding boxes for the faces
  - For detecting and creating the bounding boxes for the faces I'm using the Multi-Task Cascaded Convolutional Networks (MTCNN), which has 3 CNNs with the output of one network feeding into the next. Each step refines the bounding boxes for the faces, with the first network being the loosest on the conditions for the bounding boxes and the last network being the strictest.
- Creating the embeddings for the faces
  - For creating the embeddings I'm using a modified version of FaceNet, which is an Inception-ResNet model that creates a 128-dimensional embedding for a face.
- Clustering the faces
  - For clustering the faces I'm using the Rank-Order clustering algorithm. This is a kind of agglomerative clustering technique, which merges the embeddings based on the rank-order distance and a cluster-level normalized distance.
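To make the rank-order distance in the clustering step more concrete, here is a minimal sketch of the pairwise distance as I understand it from the rank-order paper linked in the Done list. It is not the code from the notebook: the function names (build_neighbor_ranks, asymmetric_distance, rank_order_distance) and the toy data are made up for illustration, the embeddings are assumed to be distinct, and the cluster-level normalized distance is left out.

```python
import numpy as np


def build_neighbor_ranks(embeddings):
    """Return (order, rank) for an (n, d) array of face embeddings.

    order[i] lists every face index sorted by distance from face i,
    nearest first (order[i][0] is i itself when embeddings are distinct).
    rank[i, j] is the position of face j in order[i].
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)  # pairwise Euclidean distances
    order = np.argsort(dist, axis=1, kind="stable")
    n = len(embeddings)
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)
    return order, rank


def asymmetric_distance(order, rank, a, b):
    """Sum the ranks, in b's list, of every face that a ranks at or above b."""
    top_of_a = order[a, : rank[a, b] + 1]
    return int(rank[b, top_of_a].sum())


def rank_order_distance(order, rank, a, b):
    """Symmetric rank-order distance between two distinct faces a and b."""
    total = asymmetric_distance(order, rank, a, b) + asymmetric_distance(order, rank, b, a)
    return total / min(rank[a, b], rank[b, a])


# Toy 1-D "embeddings" (hypothetical data): two nearby faces and one far away.
points = np.array([[0.0], [1.0], [10.0]])
order, rank = build_neighbor_ranks(points)
print(rank_order_distance(order, rank, 0, 1))  # nearby pair
print(rank_order_distance(order, rank, 0, 2))  # distant pair
```

Faces that share most of their nearest neighbours get a small rank-order distance, so the nearby pair above scores lower than the distant pair. An agglomerative loop would then merge the pairs whose distance falls below a chosen threshold.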
As I was unable to train the models myself, I used the MTCNN and Inception-ResNet models that David Sandberg has graciously trained, and modified his code for my use case. Here is a link to the GitHub page
To set up the environment and install required packages
These steps assume that you are using the Anaconda package manager.
- Download/Clone the repo and cd to it
- Create a new virtual environment: conda create -n facialClustering python=3
- Install TensorFlow: pip install tensorflow
- Download the weights of Inception-ResNet from here and put them in the pipeline directory. I have already included two folders of faces which are subsets of the Labeled Faces in the Wild dataset.
- Go into the pipeline folder and open Jupyter Notebook: jupyter notebook
- Open Facial Clustering.ipynb
- Hopefully you will be able to open it. If it doesn't work, you can download Facial+Clustering.html and read the notebook there.
Results
TODO:
- Use the alignment technique described in Face Search at Scale: 80 Million Gallery to speed up the clustering process
- Create an efficient data structure for obtaining the distances between clusters, instead of having to recalculate them after each clustering iteration.
- Use factory pattern to make usage of different methods for face detection/alignment cleaner
Done:
- Use Delaunay Triangulation to align faces (not useful)
- Understand and edit code from facenet and repurpose the MTCNN code to retrieve multiple faces instead of one
- Use facenet to find the deep features
- Use K-means to cluster (not useful)
- Use Affinity Propagation to cluster (not useful)
- Read and implement part of the rank order clustering from A Rank-Order Distance based Clustering Algorithm for Face Tagging (https://pdfs.semanticscholar.org/efd6/4b7641bea8ca536f4e179be6e2dd25d519d6.pdf)
- Use t-SNE using Tensorboard to visualize the embeddings to see if they actually cluster