New to deep learning, so this question may be basic: I trained on a set of 37k labelled images using the 'Classifier training of inception resnet v1' guide. It looks like the training went well, but I can't figure out how to use the trained model in production. Any help would be appreciated! Thanks, Z
Hi, I'm not sure what you mean by "production" in this case. Can you elaborate? And if you haven't already, check out the 'validate_on_lfw' example; that could be a starting point.
My goal is to adapt facenet to emotion classification, then deploy the trained model on IoT devices for live image classification. My main question is how to use the trained classifier.
As I understand it, LFW uses pre-selected matching/non-matching image pairs (or triplets in the paper) to augment the supervised learning process. Could I adapt that to my dataset by generating a new pairs.txt that includes file comparisons within the same class?
Additionally, the 128-dimensional feature vector embedding seems relevant for lightweight IoT applications.
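For illustration, a rough sketch of generating such a pairs.txt from a class-per-directory dataset; it assumes LFW's tab-separated format (name idx1 idx2 for matched pairs, name1 idx1 name2 idx2 for mismatched, 1-based indices) and should be checked against the repo's lfw.py:
import itertools
import os
import random

def write_pairs(data_dir, out_path, num_mismatched=500):
    # each subdirectory of data_dir is one class; the files inside are its images
    classes = {d: sorted(os.listdir(os.path.join(data_dir, d)))
               for d in sorted(os.listdir(data_dir))}
    with open(out_path, 'w') as out:
        # matched pairs: two images from the same class
        for name, files in classes.items():
            for a, b in itertools.combinations(range(len(files)), 2):
                out.write('%s\t%d\t%d\n' % (name, a + 1, b + 1))
        # mismatched pairs: one image from each of two different classes
        names = list(classes)
        for _ in range(num_mismatched):
            n1, n2 = random.sample(names, 2)
            i1 = random.randrange(len(classes[n1])) + 1
            i2 = random.randrange(len(classes[n2])) + 1
            out.write('%s\t%d\t%s\t%d\n' % (n1, i1, n2, i2))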
If it is a classification problem you want to solve, you might as well base your code on facenet_train_classifier.py, but instead of running training you calculate the predictions using
predictions = tf.nn.softmax(logits)
at e.g. line 129 and run only the forward pass, i.e. something like
pred = sess.run(predictions, feed_dict=feed_dict)
Some coding is needed, but it shouldn't be too difficult.
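Put together, that forward pass could look something like this sketch:
# minimal sketch, assuming `logits`, `sess` and `feed_dict` are already set up
# as in facenet_train_classifier.py; `class_names` is an assumed list of labels
predictions = tf.nn.softmax(logits)                # class probabilities
pred = sess.run(predictions, feed_dict=feed_dict)  # forward pass only, no train_op
print(class_names[pred[0].argmax()])               # most likely class for the first image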
Good to know it's possible - I will try this now. Thank you for your response.
Hi, I am working on a similar application but could not get things running and I am stuck. Are you able to share the updated code that only classifies and does not retrain? Thanks! Leia
@N2ITN did you have any luck with classification?
@heroinsoul I think! Feels like I'm almost there. Will definitely post here, clean it up, and submit a pull request when it's ready.
@N2ITN How did you get all that training data? I cannot follow those instructions because I don't have the datasets. Do I have to use those two datasets? I want to train on LFW and add more faces. Could you give instructions on how to do that?
@Ripppah For training I used the dataset from the 2013 Kaggle facial expression recognition competition (FER2013)
Do I have to re-train the model every time I add a new face? How can I use the pre-trained model to do the new training? I tried what the wiki page says, but it doesn't work. Any hint?
@davidsandberg
I spent some time reworking the code into a standalone predictor.
This works in that it produces predictions, yet their accuracy converges on random chance. This happens even when I train extensively on 120 images in two categories, then test on training images.
I must have made a fundamental error somewhere.
(Note: most of the original/important code is near the bottom, in def convert, def calc, and def predict.)
Any ideas?
# coding: utf-8
import tensorflow as tf
import os
from tensorflow.python.framework import ops
import tensorflow.contrib.slim as slim
import numpy as np
import importlib
import random
from PIL import Image
import glob

# facenet code for getting model files from dir
def get_model_filenames(model_dir):
    files = os.listdir(model_dir)
    meta_files = [s for s in files if s.endswith('.meta')]
    if len(meta_files) == 0:
        raise ValueError('No meta file found in the model directory (%s)' % model_dir)
    elif len(meta_files) > 1:
        raise ValueError('There should not be more than one meta file in the model directory (%s)' % model_dir)
    meta_file = meta_files[0]
    ckpt_files = [s for s in files if 'ckpt' in s]
    if len(ckpt_files) == 0:
        raise ValueError('No checkpoint file found in the model directory (%s)' % model_dir)
    elif len(ckpt_files) == 1:
        ckpt_file = ckpt_files[0]
    else:
        # several checkpoints: pick the one with the highest iteration number
        ckpt_iter = [(s, int(s.split('-')[-1])) for s in ckpt_files]
        sorted_iter = sorted(ckpt_iter, key=lambda tup: tup[1])
        ckpt_file = sorted_iter[-1][0]
    return meta_file, ckpt_file

# import model checkpoint
def load_model(model_dir, meta_file, ckpt_file):
    model_dir_exp = os.path.expanduser(model_dir)
    print(model_dir_exp)
    saver = tf.train.import_meta_graph(os.path.join(model_dir_exp, meta_file))
    saver.restore(tf.get_default_session(), os.path.join(model_dir_exp, ckpt_file))

# convert an image file to its tensor representation
def convert(f):
    current = Image.open(f)
    image_size = current.size[0]
    file_contents = tf.read_file(f)
    # the parent directory name encodes the class label
    name = f.rsplit('/')[-2]
    image = tf.image.decode_png(file_contents)  # , channels=3)
    image = tf.image.resize_image_with_crop_or_pad(image, image_size, image_size)
    image.set_shape((image_size, image_size, 3))
    image = tf.image.per_image_whitening(image)
    image = tf.expand_dims(image, 0, name=name)
    return image

# define logits
def calc(image):
    with graph.as_default():
        network = importlib.import_module('src.models.inception_resnet_v1', 'inference')
        prelogits, _ = network.inference(images_placeholder, 1.0,
            phase_train=False, weight_decay=0.0, reuse=False)
        logits = slim.fully_connected(prelogits, 7, activation_fn=None,
            weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
            weights_regularizer=slim.l2_regularizer(0.),
            scope='Logits', reuse=False)
        return logits

# make prediction
def predict(image, logits):
    predictions = tf.nn.softmax(logits)
    # NOTE: this initializes *every* variable, including the ones just restored
    # from the checkpoint, which overwrites the trained weights
    session.run(tf.initialize_all_variables())
    softmax = session.run(predictions, feed_dict={images_placeholder: image.eval()})
    softmax_out = softmax[0].argmax()
    # get target label from image tensor name
    target = int(image.name[:1])
    # construct output dict for analysis
    outDict = {'target': target, 'prediction': softmax_out, 'truth': target == softmax_out}
    with open('test_results.txt', 'a') as myfile:
        myfile.write(str(outDict) + ', ')
    print(outDict)

# get file list
files = glob.glob('/home/zach/Documents/fer2013/10/*/*')
random.shuffle(files)
model_dir = '/home/zach/facenet/20161205-135834'

# load graph into session from checkpoint
meta_file, ckpt_file = get_model_filenames(os.path.expanduser(model_dir))
session = tf.InteractiveSession()
load_model(model_dir, meta_file, ckpt_file)
graph = tf.get_default_graph()
graph_def = graph.as_graph_def()
images_placeholder = tf.placeholder(tf.float32, shape=(None, 160, 160, 3), name='input')
Logits = calc(convert(files[0]))
for x, f in enumerate(files):
    image = convert(f)
    predict(image, Logits)
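For comparison, a minimal sketch that runs only the restored graph by fetching tensors by name, the way validate_on_lfw.py does; note that restoring a checkpoint and then running an initializer would overwrite the restored weights. The tensor names below are assumptions that depend on how the model was exported:
import numpy as np
import tensorflow as tf
import facenet  # assumes the facenet repo is on the PYTHONPATH

model_dir = '/home/zach/facenet/20161205-135834'  # path from the snippet above
image_batch = np.zeros((1, 160, 160, 3), dtype=np.float32)  # stand-in for a preprocessed image

with tf.Graph().as_default():
    with tf.Session() as sess:
        meta_file, ckpt_file = facenet.get_model_filenames(model_dir)
        # restoring loads the trained weights; no initializer is run afterwards
        facenet.load_model(model_dir, meta_file, ckpt_file)
        graph = tf.get_default_graph()
        # assumed tensor names -- inspect the graph to find the real ones;
        # a phase_train placeholder may also need to be fed, as in validate_on_lfw.py
        images_placeholder = graph.get_tensor_by_name('input:0')
        logits = graph.get_tensor_by_name('Logits/BiasAdd:0')
        predictions = tf.nn.softmax(logits)
        pred = sess.run(predictions, feed_dict={images_placeholder: image_batch})
        print(pred[0].argmax())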
How can I add a new face using the pretrained model? When adding one, the number of classes has to change, but how do I do that?
hi @N2ITN , have you figured out a way to solve your problem?
I think it may be a good solution to use another picture of the same person for classification. I did something similar in this post: http://progpater.blog.hu/2016/12/18/hello_westworld_ford_bernard_es_en
...
classes = ["Anthony_Hopkins", "Norbert_Batfai"]
anotherimage = ['/home/nbatfai/datasets/TRAIN/Anthony_Hopkins/Anthony_Hopkins_0001.png']
anotherlabel = [0]
an_image_batch, a_label_batch = facenet.read_and_augument_data(anotherimage, anotherlabel, args.image_size,
    args.batch_size, args.max_nrof_epochs, args.random_crop, args.random_flip, args.nrof_preprocess_threads)
...
predictions = tf.nn.softmax(logits)
...
animage, alabel = sess.run([an_image_batch, a_label_batch])
epoch = 0
while epoch < args.max_nrof_epochs:
    step = sess.run(global_step, feed_dict=None)
    epoch = step // args.epoch_size
    train(args, sess, epoch, learning_rate_placeholder, global_step,
        total_loss, train_op, summary_op, summary_writer, regularization_losses, args.learning_rate_schedule_file)
    # Save variables and the metagraph if it doesn't exist already
    save_variables_and_metagraph(sess, saver, summary_writer, model_dir, subdir, step)
pred = sess.run(predictions, feed_dict={image_batch: animage, label_batch: alabel})
print("classification: ", classes[pred[0].argmax()])
(in this snippet, an image from the training set is recognized)
@titaniumer I have not tried since I posted that snippet. @nbatfai thanks! That looks promising, I will try it out.
Closing this since a lot of work on this has been done related to #140.
I have taken the code by @N2ITN and adapted it a little to use the functionality already present in the facenet module, getting the following code for loading the model and then using it to make predictions:
import tensorflow as tf
import os
import tensorflow.contrib.slim as slim
import importlib
import random
from PIL import Image
import glob
import facenet

class FaceNetPredictor:

    # convert an image file to its tensor representation
    def convert(self, f):
        current = Image.open(f)
        image_size = current.size[0]
        file_contents = tf.read_file(f)
        # the parent directory name encodes the class label
        name = f.rsplit('/')[-2]
        image = tf.image.decode_png(file_contents)  # , channels=3)
        image = tf.image.resize_image_with_crop_or_pad(image, image_size, image_size)
        image.set_shape((image_size, image_size, 3))
        # image = tf.image.per_image_whitening(image)
        image = tf.expand_dims(image, 0, name=name)
        return image

    # define logits
    def calc(self, image):
        model_def = "models.inception_resnet_v1"
        network = importlib.import_module(model_def)
        prelogits, _ = network.inference(images_placeholder, 1.0,
            phase_train=False, weight_decay=0.0, reuse=False)
        batch_norm_params = {
            # decay for the moving averages
            'decay': 0.995,
            # epsilon to prevent 0s in variance
            'epsilon': 0.001,
            # force in-place updates of mean and variance estimates
            'updates_collections': None,
            # moving averages end up in the trainable variables collection
            'variables_collections': [tf.GraphKeys.TRAINABLE_VARIABLES],
            # only update statistics during training mode
            'is_training': False
        }
        bottleneck = slim.fully_connected(prelogits, 128, activation_fn=None,
            weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
            weights_regularizer=slim.l2_regularizer(0.0),
            normalizer_fn=slim.batch_norm,
            normalizer_params=batch_norm_params,
            scope='Bottleneck', reuse=False)
        logits = slim.fully_connected(bottleneck, 2, activation_fn=None,
            weights_initializer=tf.truncated_normal_initializer(stddev=0.1),
            weights_regularizer=slim.l2_regularizer(0.),
            scope='Logits', reuse=False)
        return logits

    # make prediction
    def predict(self, session, image, logits):
        predictions = tf.nn.softmax(logits)
        # NOTE: this initializes *every* variable, including the ones just
        # restored from the checkpoint, which overwrites the trained weights
        session.run(tf.global_variables_initializer())
        session.run(tf.local_variables_initializer())
        softmax = session.run(predictions, feed_dict={images_placeholder: image.eval()})
        softmax_out = softmax[0].argmax()
        print("full vector: {}, max: {}".format(softmax, softmax_out))

# get file list
files = glob.glob('/tmp/image_samples/*')
random.shuffle(files)
model_dir = '/tmp/facenet_model/20170410-180000'

# load graph into session from checkpoint
with tf.Graph().as_default():
    with tf.Session() as sess:
        fp = FaceNetPredictor()
        print('Model directory: %s' % model_dir)
        meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(model_dir))
        print('Metagraph file: %s' % meta_file)
        print('Checkpoint file: %s' % ckpt_file)
        facenet.load_model(model_dir, meta_file, ckpt_file)
        images_placeholder = tf.placeholder(tf.float32, shape=(None, 182, 182, 3), name='input')
        Logits = fp.calc(fp.convert(files[0]))
        for x, f in enumerate(files):
            print("Predicting for image: %s" % f)
            image = fp.convert(f)
            fp.predict(sess, image, Logits)
The problem for me is that after training the model using two groups of files and then testing that model using those exact same files, the output of softmax is pretty much random. Note: I did apply the bottleneck layer as was done in #140.
Was anybody on this thread able to successfully load the model and have good predictions based on it? Any advice on what I could be doing wrong here?
Also, how is it possible to get accuracy for the predicted label? validate_on_lfw.py reaches into the computation graph and gets the embeddings and phase_train tensors -- do any of those correspond to an accuracy estimate?
@svitlanacs Great to see someone building on my attempts. My strategy was to separate out the core of what David has built, use it on a custom dataset, and add classification/prediction. I ran into similar issues, with the softmax layer seemingly showing that something is lost in translation along the pipeline (how can the accuracy actually be worse than chance?? lol). I feel like there is a fundamental and simple mistake or flawed assumption somewhere in the pipeline; finding it would fix this quickly, but no joy so far.
Because I couldn't get this to work, I rolled my own face recognition ensemble classifier, using computer vision libraries: OpenCV to crop to faces, and dlib to extract face keypoints. I then converted the keypoint distances and offsets to a vector array and fed it into a four-layer fully connected multilayer perceptron with L2 regularization and ReLU activations, built with TensorFlow/Keras, and threw in a hand-built k-fold cross-validator. The result was fast AF and quite accurate, and the code is short and simple. I learned that front-loading feature selection by traditional means, then deep learning on the now-small abstracted feature set, is faster, has a smaller hyperparameter tuning space, and is possibly even more accurate compared to CNNs.
(need to clean up + comment the code before I feel good about making it a public repo, will post here if/when that happens)
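For reference, a rough sketch of the classifier head described above; the input dimension (68 dlib landmarks flattened to 136 values) and the layer widths are illustrative assumptions, not the actual code:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2

def build_keypoint_mlp(input_dim=136, num_classes=7):
    # four fully connected layers with L2 regularization and ReLU activations,
    # fed with flattened dlib keypoint distances/offsets
    model = Sequential([
        Dense(128, activation='relu', kernel_regularizer=l2(1e-4), input_shape=(input_dim,)),
        Dense(64, activation='relu', kernel_regularizer=l2(1e-4)),
        Dense(32, activation='relu', kernel_regularizer=l2(1e-4)),
        Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model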
Thank you for your reply, @N2ITN
By looking at other issues opened against this repo and the validate_on_lfw example, I have realized that it is the second-to-last layer that I need to be looking at and using for classification and comparison. However, being a newbie in this space, I was unable to get it off the ground. In addition, I was really concerned about the stability of the solution here: I would get random crashes during training, the forward op, prediction, etc., with those operations succeeding on subsequent retries.
As a result, I have now similarly moved on to using OpenCV for face bounding boxes, dlib for alignment, and then OpenFace (a FaceNet implementation based on Torch) for classification/prediction.
OpenFace, at least for me, has proven to be slower than the comparable OpenCV approach (an LBPH face recognizer using a Haar cascade face model). As such, I'm really looking forward to the introduction of a reliable and accessible open-source implementation of face recognition on TensorFlow.
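For context, a minimal sketch of the LBPH + Haar cascade approach mentioned above, assuming OpenCV 3.x with the contrib modules (cv2.face); the cascade path and parameters are illustrative:
import cv2
import numpy as np

detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
recognizer = cv2.face.LBPHFaceRecognizer_create()

def crop_face(gray):
    # return the first detected face region, or None
    faces = detector.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        return gray[y:y + h, x:x + w]
    return None

# train on grayscale face crops with integer labels:
#   recognizer.train(face_images, np.array(labels))
# then predict; returns (label, confidence), where lower confidence is a better match:
#   label, confidence = recognizer.predict(crop_face(test_gray))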
@Ripppah I have the same question as you. Did you find an answer on how to handle a newly added class? Thanks in advance!
@zhenglaizhang Hey, same question here. Did you solve it? Another question: does this program have to run on Linux, or is Windows alright? Any reply is helpful, thanks.
Is there a proper way to do prediction with the softmax layer?
How can I calculate accuracy without validating on LFW?