AlfredXiangWu / face_verification_experiment

Original Caffe Version for LightCNN-9. Highly recommend to use PyTorch Version (https://github.com/AlfredXiangWu/LightCNN)

Extract features with MATLAB #60

Closed mhasnat closed 8 years ago

mhasnat commented 8 years ago

Hi,

I would like to know if you have a MATLAB program to extract the features. I have written one, but it does not produce the expected results. For a single image (i.e., batch size = 1), my program is as follows:


modelName = 'LightenedCNN_A_deploy.prototxt';
modelWeight = 'LightenedCNN_A.caffemodel';
caffe.set_device(1);
caffe.set_mode_gpu();

% Define net
net = caffe.Net(modelName, modelWeight, 'test');

% Read image
I = rgb2gray(imread('imgName.jpg'));

% Get features
J(:,:,1,1) = I;
net.forward({J});
fts = net.blobs('eltwise6').get_data()';


I get 256-dimensional features. However, with these features I obtain very low accuracy. I have pre-processed the images with the normalization program you wrote.

Looking forward to your feedback.

Thank you.

AlfredXiangWu commented 8 years ago

I trained the caffemodel with the caffe-rc version, which was released on 19 Sep 2014. If you use caffe-rc2 or caffe-rc3, the model might not work.

I don't have a MATLAB feature extraction program, but you can use the Python code (https://github.com/AlfredXiangWu/python_misc/blob/master/caffe/caffe_ftr.py) or the C++ code (https://github.com/AlfredXiangWu/caffe/blob/hybrid_dev/tools/extract_features_to_file.cpp) to extract features.

BTW, I may soon push a new caffemodel trained with caffe-rc3 to GitHub.

mhasnat commented 8 years ago

Thank you for your answer. Can you please answer the following questions:

  1. I do not pre-process the images much; I just convert them to grayscale. Is that OK, or should I do more?
  2. I take features from the blob called 'eltwise6'. Is that OK?
AlfredXiangWu commented 8 years ago
  1. Data pre-processing is necessary; otherwise the accuracy will drop.
  2. OK
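The pre-processing discussed above (alignment, grayscale conversion, and pixel scaling, which comes up later in the thread) can be sketched in Python. This is a minimal sketch, not the author's pipeline: the 128x128 input size and the nearest-neighbour resize are assumptions for illustration.

```python
import numpy as np

def preprocess(img, size=128):
    """Pre-process an (already aligned) face image for feature extraction.

    img: HxWx3 uint8 RGB array.
    Returns a 1 x 1 x size x size float32 blob with values in [0, 1].
    The 128x128 input size is an assumption, not confirmed in the thread.
    """
    # Convert RGB to grayscale with the usual luma weights.
    gray = img.astype(np.float32) @ np.array([0.299, 0.587, 0.114],
                                             dtype=np.float32)
    # Nearest-neighbour resize to the assumed network input size.
    ys = np.arange(size) * gray.shape[0] // size
    xs = np.arange(size) * gray.shape[1] // size
    gray = gray[ys][:, xs]
    # Scale pixel values from [0, 255] to [0, 1].
    gray /= 255.0
    # Add batch and channel axes: N x C x H x W.
    return gray[np.newaxis, np.newaxis, :, :]

img = (np.random.rand(144, 144, 3) * 255).astype(np.uint8)
blob = preprocess(img)
print(blob.shape)
```

In practice the alignment step (landmark detection and warping) happens before this function; here the input is assumed to be already cropped to the face.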
mhasnat commented 8 years ago

Thank you for your answer. Can you please provide additional information or verify my pre-processing steps for any given image from LFW:

Step 1: Detect landmarks and normalize the face using your MATLAB function (align_face()).
Step 2: Convert to grayscale.

Please let me know if any more steps are necessary.

Thank you.

dejunzhang commented 8 years ago

Hi @AlfredXiangWu, I do the data pre-processing, normalize with the mean file, and min-max normalize the input images (following this code: https://github.com/happynear/DeepVisualization/blob/master/FaceVis/Inceptionism_face.m), then extract the features of one face. But when I compute the cosine similarity between the input image and other images, I find that if the comparison image shows the same person as the input image, the similarity is about 0.95 ~ 0.99; if it shows a different person, the similarity is still about 0.8 ~ 0.9, even though their appearances are very different. So I think something might be wrong somewhere.

Looking forward to your kind reply.
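The cosine similarity used throughout this thread is just the normalized dot product of two feature vectors. A minimal sketch (assuming 256-dimensional features such as those from eltwise6):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors: a.b / (|a| |b|)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

f1 = np.ones(256)          # toy feature vector
f2 = np.ones(256)
f2[:128] = -1.0            # flip half the components

print(cosine_similarity(f1, f1))  # identical vectors -> 1.0
print(cosine_similarity(f1, f2))  # orthogonal vectors -> 0.0
```

The scores quoted in the thread (0.8 ~ 0.99) are on the high end of this [-1, 1] range, which is why the same-person / different-person gap matters more than the absolute values.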

AlfredXiangWu commented 8 years ago

@dejunzhang Do you convert the pixel value from [0, 255] to [0, 1]?

dejunzhang commented 8 years ago

@AlfredXiangWu, yes. I divide the pixel values of webface_mean.proto by 256 and convert the pixel values of the input image from [0, 255] to [0, 1].

AlfredXiangWu commented 8 years ago

@dejunzhang I am sorry, I have no idea about your problem. An example of an aligned image is shown in issue #4. Maybe you could compare it with your aligned images.

dejunzhang commented 8 years ago

@AlfredXiangWu, thanks for your suggestions. I compared my aligned images with issue #4 and I think they are very similar. Do you have any opinion on the expected range of the cosine similarity (same person vs. different people) in the normal case? My use case is: I use a web camera to capture real-time images, then do face detection and face alignment. I found the cosine similarity between two different Chinese people is about 0.8 ~ 0.94, and 0.9 ~ 0.99 for the same person.

dejunzhang commented 8 years ago

@AlfredXiangWu By the way, I might have found a bug in the face alignment in this file, line 118: https://github.com/AlfredXiangWu/face_verification_experiment/blob/master/code/face_db_align.m. I noticed it when porting the code to C++. It should be changed from:

eyec2 = (eyec - [size(img_rot,2)/2 size(img_rot,1)/2]) * resize_scale + [size(img_resize,2)/2 size(img_resize,1)/2];

to:

eyec2 = eyec * resize_scale;

Could you please check that? thanks a lot.
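For what it's worth, the two expressions are algebraically equivalent whenever size(img_resize) is exactly resize_scale times size(img_rot); they can only diverge when the resized dimensions are rounded to integers. A quick numeric check, using made-up values for the eye centre, rotated-image size, and scale (the thread gives no concrete numbers):

```python
# Hypothetical values: eye centre (x, y), rotated image size, resize scale.
eyec = (100.0, 80.0)
w_rot, h_rot = 260, 200
s = 0.5

# Resized size is exactly resize_scale times the rotated size (no rounding).
w_rs, h_rs = w_rot * s, h_rot * s

# Original formula: shift to the image centre, scale, shift back.
orig = ((eyec[0] - w_rot / 2) * s + w_rs / 2,
        (eyec[1] - h_rot / 2) * s + h_rs / 2)

# Proposed formula: plain scaling.
prop = (eyec[0] * s, eyec[1] * s)

print(orig, prop)  # both (50.0, 40.0): the formulas coincide here
```

So the discrepancy seen in the C++ port most likely comes from how the resized image dimensions are rounded, not from the formula itself.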

AlfredXiangWu commented 8 years ago

@dejunzhang If convenient, could you provide the testing set for me? My email address is alfredxiangwu@gmail.com.

dejunzhang commented 8 years ago

@AlfredXiangWu, after careful checking and fixing some bugs:

  1. The similarity for the same person is about 0.7 ~ 0.99, depending on the direction the person is facing.
  2. The similarity between different people is about 0.45 ~ 0.8.

I will send you some test faces later. Thanks a lot.

globallocal commented 8 years ago

Hello, Mr Wu:

Firstly, I think I get more information here than from just reading your paper. Thank you very much!
Besides, I wonder how you obtained the released Web-scale model and performed the experiments. I browsed the web pages but found nothing.

Looking forward to your reply. Best wishes.

AlfredXiangWu commented 8 years ago

@globallocal Does "Web-scale" refer to Facebook's paper "Web-scale training for face identification"? I have not obtained that model. Its performance is the one reported in their paper.

globallocal commented 8 years ago

Oh, Oh, thank you so much. Best wishes !

globallocal commented 8 years ago

Hi Mr. Wu, two more questions!

  1. Your paper just says the method is "unsupervised"; which protocol do you use, restricted or unrestricted?
  2. Your accuracy is just a single number; how do you define it? Averaged over the 6000 pairs, or averaged over the 10 folds? Thank you.
AlfredXiangWu commented 8 years ago

@globallocal

  1. "Unsupervised" means the model is not trained on LFW. I only trained the model on CASIA-WebFace, then extracted features on LFW and computed the cosine similarity between two images as the score.
  2. I directly obtain the scores of the 6000 pairs and then evaluate them with ROC curves.
globallocal commented 8 years ago

Well, I think I will follow the 10-fold protocol before reporting performance. Thank you for your answer.

globallocal commented 8 years ago

I actually don't understand the unsupervised setting. As you say, unsupervised means it is not trained on LFW in a supervised way.

What training set do you use for the verification task when you set the threshold based on the cosine metric? Thank you.

AlfredXiangWu commented 8 years ago

@globallocal The accuracy reported in the paper actually means "100% - EER". The equal error rate (EER) is easily obtained from the ROC curve.
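As a rough illustration of how the EER falls out of the score distributions, here is a minimal sketch that sweeps a threshold over toy same-person and different-person scores and finds the point where the false accept rate equals the false reject rate. The scores are made up (loosely echoing the ranges quoted earlier in the thread), not the actual LFW scores.

```python
import numpy as np

def eer(genuine, impostor):
    """Equal error rate: rate at the threshold where FAR (impostors
    accepted) is closest to FRR (genuine pairs rejected)."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best = None
    for t in thresholds:
        far = np.mean(impostor >= t)   # false accept rate at threshold t
        frr = np.mean(genuine < t)     # false reject rate at threshold t
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

# Toy cosine scores: same-person pairs vs. different-person pairs.
genuine = np.array([0.95, 0.97, 0.99, 0.96, 0.92])
impostor = np.array([0.80, 0.85, 0.90, 0.88, 0.82])
rate = eer(genuine, impostor)
print(rate)  # these toy sets are perfectly separable, so EER is 0.0
```

Accuracy in the "100% - EER" sense is then simply 100 * (1 - rate); with overlapping score distributions the EER, and hence the reported accuracy, degrades accordingly.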

globallocal commented 8 years ago

I'm sorry, I haven't read your code yet. I will read it through. Thank you.