ox-vgg / vgg_face2

how to extract low dimensional embedding? #35

Closed troy0215 closed 4 years ago

troy0215 commented 4 years ago

I loaded the model parameters from senet50_128_pytorch and used my own pictures as inputs. I then took the output variable feat_extract in senet50_128.py as the feature vector, normalized it, and computed the cosine similarity of each pair of features. I have 4 different people, but for any pair of them the cosine value is above 0.95. So how should I choose the feature vector? Do the parameters used to generate the bounding boxes affect the feature vector significantly?
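For reference, the normalise-then-cosine step described above can be sketched as follows. This is a minimal NumPy illustration, not the repository's code: the helper names `l2_normalize` and `cosine_matrix` are hypothetical, and the toy vectors stand in for real network outputs.

```python
import numpy as np

def l2_normalize(feats, eps=1e-12):
    """Scale each row of `feats` to unit Euclidean length."""
    feats = np.asarray(feats, dtype=np.float64)
    norms = np.linalg.norm(feats, axis=-1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return feats / np.maximum(norms, eps)

def cosine_matrix(feats):
    """Pairwise cosine similarity between the rows of `feats`."""
    unit = l2_normalize(feats)
    return unit @ unit.T

# Toy 3-D "embeddings", purely illustrative (not real network outputs).
toy = np.array([[1.0, 0.0, 0.0],
                [0.0, 2.0, 0.0],
                [3.0, 3.0, 0.0]])
S = cosine_matrix(toy)
```

With embeddings that actually separate identities, the off-diagonal entries for different people should sit clearly below the diagonal, so values above 0.95 for every pair point to a problem earlier in the pipeline.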

WeidiXie commented 4 years ago

Hi, did you normalise your input image by subtracting the mean? The mean is given in senet50_128.py.
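A minimal sketch of that mean subtraction, assuming an RGB image stored as an (H, W, 3) array with values in 0-255. The mean values are the ones used in the script posted further down this thread; the helper name `subtract_mean` is hypothetical.

```python
import numpy as np

# Per-channel mean as it appears in the script in this thread,
# applied there to an RGB image with values in the 0-255 range.
MEAN_RGB = np.array([131.0912, 103.8827, 91.4953], dtype=np.float32)

def subtract_mean(img_hwc):
    """Return a float32 copy of an (H, W, 3) RGB image with the mean removed."""
    img = np.asarray(img_hwc, dtype=np.float32)
    if img.ndim != 3 or img.shape[-1] != 3:
        raise ValueError("expected an (H, W, 3) RGB array")
    # Broadcasting subtracts one mean value per colour channel.
    return img - MEAN_RGB

# Dummy grey image, only to show the shapes and dtypes involved.
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
centered = subtract_mean(dummy)
```

Converting to float before subtracting matters: subtracting from a `uint8` array directly would wrap around instead of producing negative values.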

troy0215 commented 4 years ago

Hi, I just tried normalising the inputs as you suggested, but it didn't work. I also tried senet50_256.

WeidiXie commented 4 years ago

Do you mind trying the images given in samples/tight_crop?

If this doesn't work, then there must be something wrong.

troy0215 commented 4 years ago

It still doesn't work. Your paper says: "first the extended bounding box of the face is resized so that the shorter side is 256 pixels; then the centre 224 × 224 crop of the face image is used as input to the network". So I directly resized the shorter side of the pictures in tight_crop to 256 and then extracted the centre 224 × 224 crop as input.
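The resize-then-centre-crop step quoted above can be sketched as a small geometry helper. This is an illustration under stated assumptions: the function name `resize_and_crop_geometry` is hypothetical, and rounding the longer side to the nearest pixel is a choice made here, not something the thread specifies.

```python
def resize_and_crop_geometry(width, height, short_side=256, crop=224):
    """Return (new_size, crop_box) for a shorter-side resize plus centre crop.

    new_size is (width, height); crop_box is (left, upper, right, lower).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if crop > short_side:
        raise ValueError("crop must not exceed the resized shorter side")
    # Pin the shorter side to `short_side` and scale the other side
    # by the same ratio, rounding to the nearest pixel.
    if width <= height:
        new_w, new_h = short_side, round(height * short_side / width)
    else:
        new_w, new_h = round(width * short_side / height), short_side
    # Centre the crop window inside the resized image.
    left = (new_w - crop) // 2
    top = (new_h - crop) // 2
    return (new_w, new_h), (left, top, left + crop, top + crop)

# Example: a 300 x 400 (width x height) portrait image.
size, box = resize_and_crop_geometry(300, 400)
```

The returned size and box are in the order that PIL's `Image.resize` and `Image.crop` expect. Note that the script posted later in this thread resizes the shorter side to 224 rather than 256 before taking the 224 × 224 crop.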

WeidiXie commented 4 years ago

OK, I have tested the model and it works perfectly fine. I tested on the images in samples/tight_crop. Here is the code:

# Code:
from __future__ import absolute_import
from __future__ import print_function
import glob as gb

import numpy as np
import PIL
import torch
from PIL import Image

batch_size = 10
mean = (131.0912, 103.8827, 91.4953)

def load_data(path='', shape=None):
    short_size = 224.0
    crop_size = shape
    img = PIL.Image.open(path)
    im_shape = np.array(img.size)    # in the format of (width, height, *)
    img = img.convert('RGB')
    ratio = float(short_size) / np.min(im_shape)
    img = img.resize(size=(int(np.ceil(im_shape[0] * ratio)),   # width
                           int(np.ceil(im_shape[1] * ratio))),  # height
                     resample=PIL.Image.BILINEAR)

    x = np.array(img)  # array layout is (height, width, channels)
    newshape = x.shape[:2]
    h_start = (newshape[0] - crop_size[0])//2
    w_start = (newshape[1] - crop_size[1])//2
    x = x[h_start:h_start+crop_size[0], w_start:w_start+crop_size[1]]
    x = x - mean
    return x

def chunks(l, n):
    # For item i in a range that is a length of l,
    for i in range(0, len(l), n):
        # Create an index range for l of n items:
        yield l[i:i+n]

def initialize_model():
    # Load the ResNet-50 (128-D) PyTorch model and its weights,
    # then switch to inference mode (no GPU setup is done here).
    import resnet50_128 as model
    network = model.resnet50_128(weights_path='../model/resnet50_128.pth')
    network.eval()
    return network

def image_encoding(model, facepaths):
    print('==> compute image-level feature encoding.')
    num_faces = len(facepaths)
    face_feats = np.empty((num_faces, 128))
    imgpaths = facepaths
    imgchunks = list(chunks(imgpaths, batch_size))

    for c, imgs in enumerate(imgchunks):
        im_array = np.array([load_data(path=i, shape=(224, 224, 3)) for i in imgs])
        f = model(torch.Tensor(im_array.transpose(0, 3, 1, 2)))[1].detach().cpu().numpy()[:, :, 0, 0]
        start = c * batch_size
        end = min((c + 1) * batch_size, num_faces)
        # This is different from the Keras model where the normalization has been done inside the model.
        face_feats[start:end] = f / np.sqrt(np.sum(f ** 2, -1, keepdims=True))
        if c % 50 == 0:
            print('-> finish encoding {}/{} images.'.format(c * batch_size, num_faces))
    return face_feats

if __name__ == '__main__':
    facepaths = gb.glob('../samples/*/*.jpg')
    model_eval = initialize_model()
    face_feats = image_encoding(model_eval, facepaths)
    S = np.dot(face_feats, face_feats.T)
    import matplotlib.pyplot as plt
    plt.imshow(S)
    plt.show()

You should then see a similarity matrix like this: [image: Figure_1]

troy0215 commented 4 years ago

Problem solved, thank you very much.

WeidiXie commented 4 years ago

Cool.