davidsandberg / facenet

Face recognition using Tensorflow
MIT License

Classify new image using trained facenet_train_classifier.py #62

Closed N2ITN closed 7 years ago

N2ITN commented 8 years ago

I'm new to deep learning, so this question may be basic. I trained on a set of 37k labelled images using the 'Classifier training of inception resnet v1' guide. It looks like the training went well, but I can't figure out how to use the trained model in production. Any help would be appreciated! Thanks, Z

davidsandberg commented 8 years ago

Hi, Not sure what you mean by "production" in this case. Can you elaborate? And if you haven't checked out the example 'validate_on_lfw' that could be a starting point.

N2ITN commented 8 years ago

My goal is to adapt facenet to emotional classification, then deploy the trained model on IoT devices for live image classification. My main question is how to use the trained classifier.

As I understand it, LFW uses pre-selected matching / non-matching image pairs (or triplets in the paper) to augment the supervised learning process. Could I adapt that to my dataset by generating a new pairs.txt that includes file comparisons within the same class?
Additionally, the 128-dimensional feature vector embedding seems relevant for lightweight IoT applications.

davidsandberg commented 8 years ago

If it is a classification problem you want to solve you might as well base your code on the facenet_train_classifier.py code, but instead of running training you calculate the predictions using

predictions = tf.nn.softmax(logits)

on e.g. line 129 and run only the forward pass, i.e. something like

pred = sess.run(predictions, feed_dict=feed_dict)

Some coding is needed but it shouldn't be too difficult.
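To make the forward-pass step concrete, here is a minimal, framework-free sketch of what the softmax step computes and how a class is picked from its output. The helper names (softmax, predict_class) and the example scores are made up for illustration; in the real script the logits would come from sess.run on the restored network rather than from a hand-written list.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Return (index of the most probable class, its probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up scores for three classes; real logits come from the network.
example_logits = [1.0, 3.0, 0.5]
label, confidence = predict_class(example_logits)
print("predicted class:", label, "probability:", round(confidence, 3))
```

The probability of the winning class can serve as a rough per-image confidence, although it says nothing about accuracy over a whole dataset.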

N2ITN commented 8 years ago

Good to know it's possible - I will try this now. Thank you for your response.

xl94 commented 8 years ago

Hi, I am working on a similar application but could not get things running, and I am stuck. Could you share the updated code that only classifies and does not retrain? Thanks! Leia

zavalyshyn commented 8 years ago

@N2ITN did you have any luck with classification?

N2ITN commented 8 years ago

@heroinsoul I think so! It feels like I'm almost there. I will definitely post here, then clean it up and submit a pull request when it's ready.

Ripppah commented 8 years ago

@N2ITN How did you get all that training data? I cannot follow the instructions because I don't have those datasets. Do I have to use those two datasets? I want to train on LFW and add more faces. Could you give instructions on how to do that?

N2ITN commented 8 years ago

@Ripppah For training I used the 2013 Kaggle facial emotion recognition competition dataset.

Ripppah commented 8 years ago

Do I have to retrain the model every time I add a new face? How can I use the pre-trained model as the starting point for new training? I tried what the wiki page says, but it doesn't work. Any hints?

N2ITN commented 7 years ago

@davidsandberg

I spent some time reworking the code into a standalone predictor. It works in that it produces predictions, yet their accuracy converges on random chance. This happens even when I train extensively on 120 images in two categories and then test on the training images. I must have made a fundamental error somewhere. (Note: most of the original/important code is near the bottom, in def convert, def calc and def predict.) Any ideas?

# coding: utf-8
import tensorflow as tf 
import os
from tensorflow.python.framework import ops
import tensorflow.contrib.slim as slim 
import numpy as np
import importlib
import random
import Image
import glob

# facenet code for getting model files from dir
def get_model_filenames(model_dir):
    files = os.listdir(model_dir)
    meta_files = [s for s in files if s.endswith('.meta')]
    if len(meta_files)==0:
        raise ValueError('No meta file found in the model directory (%s)' % model_dir)
    elif len(meta_files)>1:
        raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir)
    meta_file = meta_files[0]
    ckpt_files = [s for s in files if 'ckpt' in s]
    if len(ckpt_files)==0:
        raise ValueError('No checkpoint file found in the model directory (%s)' % model_dir)
    elif len(ckpt_files)==1:
        ckpt_file = ckpt_files[0]
    else:
        ckpt_iter = [(s,int(s.split('-')[-1])) for s in ckpt_files if 'ckpt' in s]
        sorted_iter = sorted(ckpt_iter, key=lambda tup: tup[1])
        ckpt_file = sorted_iter[-1][0]
    return meta_file, ckpt_file

# import model checkpoint
def load_model(model_dir, meta_file, ckpt_file):
    model_dir_exp = os.path.expanduser(model_dir)
    print model_dir_exp
    saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file))
    saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file))
    return

# convert images to their tensor representation
def convert(f):
    current = Image.open(f)
    image_size = current.size[0]
    file_contents = tf.read_file(f)
    name = f.rsplit('/')[-2]
    image = tf.image.decode_png(file_contents)#, channels=3)
    image = tf.image.resize_image_with_crop_or_pad(image, image_size, image_size)
    image.set_shape((image_size, image_size, 3))
    image = tf.image.per_image_whitening(image)
    image = tf.expand_dims(image, 0, name = name)
    return image

# define logits
def calc(image):
    with graph.as_default():
        network = importlib.import_module('src.models.inception_resnet_v1', 'inference')    
        prelogits, _ = network.inference(images_placeholder, 1.0, 
            phase_train=False, weight_decay=0.0,reuse=False)
        logits = slim.fully_connected(prelogits, 7, activation_fn=None,  
            weights_initializer=tf.truncated_normal_initializer(stddev=0.1),  
            weights_regularizer=slim.l2_regularizer(0.), 
            scope='Logits', reuse=False)
        return logits

# make prediction
def predict(image, logits):
        predictions = tf.nn.softmax(logits)
        session.run(tf.initialize_all_variables())
        softmax = session.run(predictions,feed_dict={images_placeholder:image.eval()})
        softmax_out = softmax[0].argmax()

        # get target label from image tensor
        target = int(image.name[:1])

        # construct output dict for analysis
        outDict = {'target': target, 'prediction':softmax_out, 'truth':target==softmax_out}
        with open('test_results.txt', 'a') as myfile:
            myfile.write(str(outDict)+ ', ')
        print outDict

# get file list    
files = glob.glob('/home/zach/Documents/fer2013/10/*/*')
random.shuffle(files)
model_dir= '/home/zach/facenet/20161205-135834'

# load graph into session from checkpoint
meta_file, ckpt_file = get_model_filenames(os.path.expanduser(model_dir))
session = tf.InteractiveSession()            
load_model(model_dir,meta_file,ckpt_file)
graph = tf.get_default_graph()
graph_def = graph.as_graph_def()

images_placeholder = tf.placeholder(tf.float32, shape=(None,160,160,3), name='input')
Logits = calc(convert(files[0]))

for x,f in enumerate(files):
    image = convert(f)
    predict(image,Logits)
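One possible explanation for the chance-level output, offered as an assumption rather than a confirmed diagnosis: predict() runs tf.initialize_all_variables() after load_model() has restored the checkpoint, which resets every variable in the graph to its initial value, and calc() builds a second copy of the network whose variables were never in the checkpoint. The toy model below is plain Python, not TensorFlow. ToyVariableStore and the variable names are hypothetical and only illustrate why the order of initializing and restoring matters.

```python
import random


class ToyVariableStore:
    """Framework-free stand-in for a graph's variables (not a TensorFlow API)."""

    def __init__(self, names, seed=0):
        self._rng = random.Random(seed)
        self.values = {name: None for name in names}

    def initialize_all(self):
        # Like a global initializer: every variable gets a fresh random value.
        for name in self.values:
            self.values[name] = self._rng.uniform(-0.1, 0.1)

    def restore(self, checkpoint):
        # Like restoring a checkpoint: only variables named in it are overwritten.
        for name, value in checkpoint.items():
            if name in self.values:
                self.values[name] = value


# Pretend these are trained values saved during training.
checkpoint = {"Logits/weights": 0.75, "Logits/biases": -0.25}

# "Logits_1/weights" mimics a layer that is rebuilt after loading the model.
names = list(checkpoint) + ["Logits_1/weights"]

good = ToyVariableStore(names)
good.initialize_all()
good.restore(checkpoint)  # trained values win because restore runs last

bad = ToyVariableStore(names)
bad.restore(checkpoint)
bad.initialize_all()  # random values overwrite the restored ones

kept = all(good.values[n] == v for n, v in checkpoint.items())
lost = any(bad.values[n] != v for n, v in checkpoint.items())
duplicate_untrained = good.values["Logits_1/weights"] not in checkpoint.values()
print(kept, lost, duplicate_untrained)
```

In the toy, only the initialize-then-restore order keeps the checkpoint values, and a layer created after loading stays at its random starting value in either order.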

Ripppah commented 7 years ago

How do I add a new face using the pretrained model? Adding one changes the number of classes, but how do I handle that?

chillerxx commented 7 years ago

Hi @N2ITN, have you figured out a way to solve your problem?

nbatfai commented 7 years ago

I think a good solution may be to use another picture of the same person for classification. I did something similar in the post http://progpater.blog.hu/2016/12/18/hello_westworld_ford_bernard_es_en

...

    classes = ["Anthony_Hopkins", "Norbert_Batfai"]

    anotherimage = ['/home/nbatfai/datasets/TRAIN/Anthony_Hopkins/Anthony_Hopkins_0001.png']
    anotherlabel = [0]

    an_image_batch, a_label_batch = facenet.read_and_augument_data(anotherimage, anotherlabel, args.image_size,
        args.batch_size, args.max_nrof_epochs, args.random_crop, args.random_flip, args.nrof_preprocess_threads)

...

    predictions = tf.nn.softmax(logits)

...

        animage, alabel = sess.run([an_image_batch, a_label_batch])

        epoch = 0
        while epoch < args.max_nrof_epochs:
            step = sess.run(global_step, feed_dict=None)
            epoch = step // args.epoch_size

            # Train for one epoch

            train(args, sess, epoch, learning_rate_placeholder, global_step, 
                total_loss, train_op, summary_op, summary_writer, regularization_losses, args.learning_rate_schedule_file)

            # Save variables and the metagraph if it doesn't exist already
            save_variables_and_metagraph(sess, saver, summary_writer, model_dir, subdir, step)

            pred = sess.run(predictions, feed_dict={image_batch: animage, label_batch: alabel})
            print("classification: ", classes[pred[0].argmax()])

(in this snippet a trained image has been recognized)

N2ITN commented 7 years ago

@titaniumer I have not tried since I posted that snippet.

@nbatfai Thanks! That looks promising, I will try it out.

davidsandberg commented 7 years ago

Closing this since a lot of work on this has been done related to #140.

svitlanacs commented 7 years ago

I have taken the code by @N2ITN and adapted it a little to use the functionality already present in facenet module to get the following code for loading the model and then using it to make predictions:

import tensorflow as tf 
import os
import tensorflow.contrib.slim as slim 
import importlib
import random
from PIL import Image
import glob
import facenet

class FaceNetPredictor:
    # convert images to their tensor representation
    def convert(self, f):
        current = Image.open(f)
        image_size = current.size[0]
        file_contents = tf.read_file(f)
        name = f.rsplit('/')[-2]
        image = tf.image.decode_png(file_contents)#, channels=3)
        image = tf.image.resize_image_with_crop_or_pad(image, image_size, image_size)
        image.set_shape((image_size, image_size, 3))
        #image = tf.image.per_image_whitening(image)
        image = tf.expand_dims(image, 0, name = name)
        return image

    # define logits
    def calc(self, image):
        model_def = "models.inception_resnet_v1"
        network = importlib.import_module(model_def)
        prelogits, _ = network.inference(images_placeholder, 1.0, 
            phase_train=False, weight_decay=0.0,reuse=False)
        batch_norm_params = {
            # Decay for the moving averages
            'decay': 0.995,
            # epsilon to prevent 0s in variance
            'epsilon': 0.001,
            # force in-place updates of mean and variance estimates
            'updates_collections': None,
            # Moving averages ends up in the trainable variables collection
            'variables_collections': [ tf.GraphKeys.TRAINABLE_VARIABLES ],
            # Only update statistics during training mode
            'is_training': False
        }
        bottleneck = slim.fully_connected(prelogits, 128, activation_fn=None, 
                weights_initializer=tf.truncated_normal_initializer(stddev=0.1), 
                weights_regularizer=slim.l2_regularizer(0.0),
                normalizer_fn=slim.batch_norm,
                normalizer_params=batch_norm_params,
                scope='Bottleneck', reuse=False)
        logits = slim.fully_connected(bottleneck, 2, activation_fn=None,  
            weights_initializer=tf.truncated_normal_initializer(stddev=0.1),  
            weights_regularizer=slim.l2_regularizer(0.), 
            scope='Logits', reuse=False)
        return logits

    # make prediction
    def predict(self, session, image, logits):
        predictions = tf.nn.softmax(logits)
        session.run(tf.global_variables_initializer())
        session.run(tf.local_variables_initializer())
        softmax = session.run(predictions, feed_dict={images_placeholder: image.eval()})
        softmax_out = softmax[0].argmax()
        # .format() is needed so the placeholders are actually filled in
        print("full vector: {}, max: {}".format(softmax, softmax_out))

# get file list    
files = glob.glob('/tmp/image_samples/*')
random.shuffle(files)
model_dir= '/tmp/facenet_model/20170410-180000'

# load graph into session from checkpoint
with tf.Graph().as_default():
    with tf.Session() as sess:
        fp = FaceNetPredictor()
        print('Model directory: %s' % model_dir)
        meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(model_dir))

        print('Metagraph file: %s' % meta_file)
        print('Checkpoint file: %s' % ckpt_file)
        facenet.load_model(model_dir, meta_file, ckpt_file)

        images_placeholder = tf.placeholder(tf.float32, shape=(None,182,182,3), name='input')
        Logits = fp.calc(fp.convert(files[0]))

        for x,f in enumerate(files):
            print("Predicting for image: %s" % f)
            image = fp.convert(f)
            fp.predict(sess, image,Logits)

The problem for me is that after training the model on 2 groups of files and then testing it on those exact same files, the output of softmax is pretty much random. Note: I did apply the bottleneck layer as was done in #140.

Was anybody on this thread able to successfully load the model and have good predictions based on it? Any advice on what I could be doing wrong here?

Also, how is it possible to get an accuracy figure for the predicted label? validate_on_lfw.py reaches into the computation graph and gets the embeddings and phase_train layers. Do any of those correspond to an accuracy estimate?

N2ITN commented 7 years ago

@svitlanacs Great to see someone building on my attempts. My strategy was to separate out the core of what David has built, use it on a custom dataset, and add classification/prediction. I ran into similar issues, with the softmax layer seemingly showing that something is lost in translation along the pipeline (how can the accuracy actually be worse than chance?? lol). I feel like there is a fundamental and simple mistake or flawed assumption somewhere in the pipeline that could be fixed quickly, but no joy so far.

Because I couldn't get this to work, I rolled my own face recognition ensemble classifier, using the computer vision libraries OpenCV (to crop to faces) and DLib (to extract face keypoints). I then converted the keypoint distances and offsets to a vector array and fed it into a 4-layer fully connected multilayer perceptron with L2 regularization and ReLU activation using TensorFlow/Keras, and threw in a hand-built k-fold cross validator. The result was very fast and quite accurate, and the code is short and simple. I learned that front-loading feature selection using traditional means, before deep learning on the now small, abstracted feature set, is faster, has a smaller hyperparameter tuning space, and is possibly even more accurate compared to CNNs.

(need to clean up + comment the code before I feel good about making it a public repo, will post here if/when that happens)
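The keypoint-to-vector step described above could look roughly like the sketch below. It is only a guess at the idea, not N2ITN's unpublished code: the function name keypoint_features, the choice of centroid offsets, and the three example points are assumptions, and real input would be the landmark coordinates returned by a detector such as DLib.

```python
import math
from itertools import combinations


def keypoint_features(points):
    """Flatten (x, y) keypoints into pairwise distances plus centroid offsets."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    # One distance for every unordered pair of keypoints.
    distances = [
        math.hypot(x1 - x2, y1 - y2)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    ]

    # Offset of every keypoint from the centroid, as (dx, dy) pairs flattened.
    offsets = [c for x, y in points for c in (x - cx, y - cy)]
    return distances + offsets


# Three made-up keypoints; a real face would have many more.
landmarks = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]
features = keypoint_features(landmarks)
print(len(features), features[:3])
```

For n keypoints this yields n(n-1)/2 distances and 2n offsets, so the vector stays small enough for a compact fully connected network.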

svitlanacs commented 7 years ago

Thank you for your reply, @N2ITN

By looking at other issues opened against this repo and the validate_on_lfw example, I have realized that it is the second-last layer that I need to be looking at and using for classification and comparison, however, being a newbie to this space, I was unable to get it off the ground. In addition, I was really concerned about the stability of the solution here. I would get random crashes during all of: training, forward op, prediction, etc, with those operations succeeding on subsequent retries.
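Using the second-last layer for comparison can be sketched as a nearest-neighbour lookup over embeddings. This is a minimal illustration under stated assumptions, not code from the facenet repository: the names classify and gallery, the distance threshold of 1.0, and the tiny 3-dimensional vectors (stand-ins for real 128-dimensional embeddings) are all made up.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so distances are comparable."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(embedding, gallery, threshold=1.0):
    """Return (nearest identity or None, distance) for a query embedding."""
    query = l2_normalize(embedding)
    best_name, best_dist = None, float("inf")
    for name, reference in gallery.items():
        dist = euclidean(query, l2_normalize(reference))
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist > threshold:  # too far from everyone: treat as unknown
        return None, best_dist
    return best_name, best_dist


# One reference embedding per known identity (made-up numbers).
gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.9]}
# Enrolling a new identity only adds an entry; the network is not retrained.
gallery["carol"] = [0.1, 0.9, 0.1]

name, dist = classify([0.8, 0.2, 0.1], gallery)
print(name, round(dist, 3))
```

Under this scheme, adding a new face means storing one more reference embedding, provided the embeddings from the trained model generalize to identities it has not seen.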

As a result, I have now similarly moved on to using opencv for face bounding boxes, dlib for alignment, and then OpenFace (FaceNet implementation based on Torch) for classification/prediction.

OpenFace, at least for me, has proven to be slower than the comparable OpenCV (LBPH face recognizer using haar cascade face model) approach. As such, I'm really looking forward to the introduction of a reliable and accessible open-source implementation of face recognition on TensorFlow.

zhenglaizhang commented 7 years ago

@Ripppah I ran into the same question. Have you found an answer? How should we handle a newly added class? Thanks in advance!

xvdehao commented 7 years ago

@zhenglaizhang Hey, I have the same question. Did you solve it? Another question: does this program have to run on Linux, or is Windows all right? Any reply is helpful, thanks.

wch4896 commented 7 years ago

Is there a proper way to do prediction with the softmax layer?

Mohamed-Shimy commented 7 years ago

How can I calculate accuracy without validating on LFW?
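For a plain classifier, accuracy does not depend on LFW: it is the fraction of labelled images for which the predicted class matches the true class. The sketch below assumes Python 3 and that the predictions have already been collected (for example, the argmax of each softmax output); the function name and the sample labels are made up. Measuring it on images held out from training gives a more honest figure than reusing the training set.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one labelled example")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Made-up labels for four test images.
predicted = [0, 1, 1, 0]
actual = [0, 1, 0, 0]
score = accuracy(predicted, actual)
print("accuracy:", score)
```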