Clustering with facenet embeddings

rawmarshmellows commented 7 years ago

Hi all, I was wondering if anyone has attempted to use the facenet embeddings for facial clustering. I ask because I have been working on this and was surprised to find that, with K-means, the facenet embeddings are unable to discern between sample images of just two people. Thinking the clustering algorithm might be the cause, I tried affinity propagation and even implemented rank-order clustering, but none of the clustering methods I have tried seems to help.

I'm currently in the process of trying to align the face, and I have tried a few methods of alignment but to no avail as well. Has anyone had the same results?
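
One way to check whether the embeddings themselves separate two identities, independent of any clustering algorithm, is to compare within-person and between-person distances. The sketch below is only an illustration: it uses synthetic 128-dimensional vectors as a stand-in for real facenet embeddings, and the function names (`l2_normalize`, `distance_summary`) are hypothetical, not part of the facenet repository.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def distance_summary(emb_a, emb_b):
    """Return (mean within-person, mean between-person) Euclidean distance."""
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)

    def pairwise(x, y):
        # Euclidean distance between every row of x and every row of y.
        return np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)

    def mean_off_diagonal(d):
        # Ignore the zero distances of each sample to itself.
        n = d.shape[0]
        return d[~np.eye(n, dtype=bool)].mean()

    within = 0.5 * (mean_off_diagonal(pairwise(a, a)) +
                    mean_off_diagonal(pairwise(b, b)))
    between = pairwise(a, b).mean()
    return within, between


# Synthetic stand-in data (assumption): two "people", 20 images each.
rng = np.random.default_rng(0)
center_a = rng.normal(size=128)
center_b = rng.normal(size=128)
emb_a = center_a + 0.1 * rng.normal(size=(20, 128))
emb_b = center_b + 0.1 * rng.normal(size=(20, 128))

within, between = distance_summary(emb_a, emb_b)
print("within: %.3f  between: %.3f" % (within, between))
```

If the two means are close to each other on real data, no clustering method is likely to separate the identities, which would point at the alignment or embedding step rather than at K-means.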

xf4fresh commented 7 years ago

@kevinlu1211 I'm replying because I'd like to discuss embedding clustering.

My current task is not face recognition, but it is very similar. I think clustering the embeddings should be enough to achieve classification. In my experiments, the embeddings of the training set do form clusters. This can be observed in TensorBoard using t-SNE.

rawmarshmellows commented 7 years ago

@xf4fresh I was wondering what pipeline you used to align the face images and create the embeddings. If you are using MTCNN to align and facenet to create the embeddings, what hyperparameters did you use for t-SNE? As a test, I ran the TSNE implementation from scikit-learn on embeddings from two people, and regardless of the perplexity I chose, the 2-dimensional projection shows no discernible clusters.
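
For reference, a minimal sketch of the kind of test described above, assuming scikit-learn's `TSNE` and synthetic data in place of real embeddings. The perplexity and other settings here are illustrative guesses, not values recommended anywhere in this thread.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in (assumption): two people, 30 images each, 128-d vectors.
rng = np.random.default_rng(0)
person_a = rng.normal(size=128) + 0.1 * rng.normal(size=(30, 128))
person_b = rng.normal(size=128) + 0.1 * rng.normal(size=(30, 128))
emb = np.vstack([person_a, person_b])
labels = np.array([0] * 30 + [1] * 30)

# L2-normalize each row so that Euclidean distance is comparable across rows.
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)

# Perplexity must be smaller than the number of samples (60 here).
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
projected = tsne.fit_transform(emb)

# Print the 2-D centroid of each person as a rough separation check.
for person in (0, 1):
    print(person, projected[labels == person].mean(axis=0))
```

Plotting `projected` colored by `labels` gives the same kind of view as the TensorBoard projector.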

rawmarshmellows commented 7 years ago

@xf4fresh would you be willing to share your code, or could we discuss this privately?

xf4fresh commented 7 years ago

@kevinlu1211 You may have misunderstood what I meant. I have no t-SNE code of my own. I used TensorBoard, the visualization tool provided by TensorFlow. I hope this helps.

rawmarshmellows commented 7 years ago

@xf4fresh I meant the code you used to set up TensorBoard, as I'm having no luck setting it up myself. I was able to do it with the MNIST tutorial but not with facenet.

xf4fresh commented 7 years ago

@kevinlu1211 Here is my code, modified from validate_on_lfw.py. Also, how can I keep the format of code in the comment edit box? I just started using github.

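import argparse
import math
import os
import sys

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

import facenet
import lfw

def main(args):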
    with tf.Graph().as_default():
        config = tf.ConfigProto(log_device_placement=False, allow_soft_placement=True,
                                gpu_options=tf.GPUOptions(allow_growth=True))
        with tf.Session(config=config) as sess:
            # Read the file containing the pairs used for testing
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))

            # Get the paths for the corresponding images
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)

            # Generate the metadata file
            paths = sorted(list(set(paths)))
            paths = paths[0:args.lfw_batch_size]
            generate_metadata_file(paths)

            # Load the model
            print('Model directory: %s' % args.model_dir)

            meta_file, ckpt_file = facenet.get_model_filenames(os.path.expanduser(args.model_dir))

            print('Meta graph file: %s' % meta_file)
            if args.ckpt_file is not None:
                ckpt_file = args.ckpt_file
            print('Checkpoint file: %s' % ckpt_file)
            step = ckpt_file.split("-")[-1]
            print('step: %s' % str(step))
            facenet.load_model(args.model_dir, meta_file, ckpt_file)

            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("batch_join:0")
            batch_size_placeholder = tf.get_default_graph().get_tensor_by_name("batch_size:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")

            args.image_height = images_placeholder.get_shape()[1]
            args.image_width = images_placeholder.get_shape()[2]
            embedding_size = embeddings.get_shape()[1]

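            # Variable that holds the embeddings for the projector. Its first dimension is fixed,
            # so each batch fed below must contain exactly args.lfw_batch_size images.
            embedding = tf.Variable(tf.zeros([args.lfw_batch_size, embedding_size]), name="test_embedding")
            assignment = embedding.assign(embeddings)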

            saver = tf.train.Saver(tf.global_variables(), max_to_keep=100)
            # saver = tf.train.Saver({'embeddings': embeddings})
            writer = tf.summary.FileWriter(args.log_dir, sess.graph)

            # Add the TensorBoard embedding visualization. Needs TensorFlow version >= 0.12.0RC0
            config = projector.ProjectorConfig()
            embed = config.embeddings.add()
            embed.tensor_name = embedding.name
            embed.metadata_path = '/home/dxf/CloneProjects/facenet/projector/logs/metadata.tsv'
            projector.visualize_embeddings(writer, config)

            # Run forward pass to calculate embeddings
            print('Running forward pass on test images:%s' % args.lfw_dir)
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0 * nrof_images / batch_size))
            # emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i * batch_size
                end_index = min((i + 1) * batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data_new(paths_batch, args.image_height, args.image_width)
                feed_dict = {images_placeholder: images,
                             phase_train_placeholder: False,
                             batch_size_placeholder: len(paths_batch)}
                # emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict)
                sess.run(assignment, feed_dict=feed_dict)

                saver.save(sess, os.path.join(args.log_dir, 'a_model.ckpt'), global_step=i)

            # tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array, actual_issame,
            #                                                      nrof_folds=args.lfw_nrof_folds)
            #
            # print('Accuracy: %1.3f+-%1.3f' % (float(np.mean(accuracy)), float(np.std(accuracy))))
            # print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
            #
            # auc = metrics.auc(fpr, tpr)
            # print('Area Under Curve (AUC): %1.3f' % auc)
            # eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            # print('Equal Error Rate (EER): %1.3f' % eer)

# Generate the metadata file
def generate_metadata_file(list_, save_metadata="../projector/logs/metadata.tsv"):
    person_list = []
    for image_path in list_:
        person = image_path.split("/")[-2]
        person_list.append(person)

    write_txt_list_new(person_list, save_metadata, is_append=False)  # write list into txt

def parse_arguments(argv):
    parser = argparse.ArgumentParser()

    parser.add_argument('--lfw_dir', type=str, help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
                        help='Number of images to process in a batch in the LFW test set.', default=None)
    parser.add_argument('--model_dir', type=str,
                        help='Directory containing the meta graph (.meta) file and '
                             'the checkpoint (ckpt) file containing model parameters')
    parser.add_argument('--lfw_pairs', type=str,
                        help='The file containing the pairs to use for validation.', default='../data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
                        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
                        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)

    parser.add_argument('--image_height', type=int, help='Image height in pixels.', default=199)
    parser.add_argument('--image_width', type=int, help='Image width in pixels.', default=13)
    parser.add_argument('--image_channels', type=int, help='Number of image channels.', default=1)

    parser.add_argument('--log_dir', type=str, default='../projector/logs', help='Summaries log directory')

    parser.add_argument('--ckpt_file', type=str, help='check point file name.', default=None)
    return parser.parse_args(argv)

if __name__ == '__main__':
    sys.argv = ['validate_on_lfw.py',
                # '--lfw_dir', '../../data/test_data_after_remove_jpg',
                # '--lfw_pairs', './pair_test.txt',

                '--lfw_dir', '../../data/split_data_after_remove_jpg',
                '--lfw_pairs', './pair_train.txt',

                '--lfw_file_ext', 'jpg',
                '--lfw_nrof_folds', '10',
                '--lfw_batch_size', '2000',

                '--image_height', '199',
                '--image_width', '13',
                '--image_channels', '1',

                '--model_dir', '../train_event/',
                '--ckpt_file', 'model-20170315-201740.ckpt-360160']

    main(parse_arguments(sys.argv[1:]))
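
The helper `write_txt_list_new` called in `generate_metadata_file` is not shown above. A minimal sketch of what the metadata step could look like follows, assuming the person name is the parent directory of each image. The names `write_labels` and `labels_from_paths` and the sample paths are hypothetical. For a single-column metadata file, the TensorBoard projector expects one label per line with no header row, in the same order as the rows of the embedding variable.

```python
import os
import tempfile


def labels_from_paths(image_paths):
    """Use the parent directory of each image as its label."""
    return [os.path.basename(os.path.dirname(p)) for p in image_paths]


def write_labels(labels, path):
    """Write one label per line (hypothetical stand-in for write_txt_list_new)."""
    with open(path, "w") as f:
        for label in labels:
            f.write("%s\n" % label)


# Illustrative paths only; the order must match the order of the embeddings.
paths = [
    "lfw/Aaron_Eckhart/Aaron_Eckhart_0001.png",
    "lfw/Abel_Pacheco/Abel_Pacheco_0001.png",
    "lfw/Abel_Pacheco/Abel_Pacheco_0002.png",
]

metadata_path = os.path.join(tempfile.mkdtemp(), "metadata.tsv")
write_labels(labels_from_paths(paths), metadata_path)

# Read the file back to confirm its contents.
with open(metadata_path) as f:
    written = f.read().splitlines()
print(written)
```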

rawmarshmellows commented 7 years ago

Thanks for your code, and it seems to me that the code you have is already formatted!

davidsandberg commented 7 years ago

Closing this for now. Reopen if needed.

davidsandberg / facenet

Clustering with facenet embeddings #221