ox-vgg / vgg_face2


What are the normalization steps? #17

Closed schudov closed 5 years ago

schudov commented 5 years ago

Hi, I couldn't find details of how the pictures are normalized prior to feeding into network, except for the vector of mean values per channel. Could you give more details regarding normalization? What is the range of tensor values fed into the network?

WeidiXie commented 5 years ago

Hi, @schudov

The preprocessing is just,

Load image (0-255) --> subtract channel-wise mean --> pass to model

Best Weidi
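The pipeline Weidi describes can be sketched in NumPy as follows (a minimal sketch, assuming the image is already loaded as an HxWx3 uint8 BGR array; the channel means are the BGR values posted later in this thread):

```python
import numpy as np

# Channel-wise means (BGR order) used for the VGGFace2 models
MEAN_BGR = np.array([91.4953, 103.8827, 131.0912])

def preprocess(img_bgr):
    """img_bgr: HxWx3 uint8 array in BGR order, values 0-255."""
    x = img_bgr.astype(np.float32)  # keep the 0-255 range, no /255 scaling
    return x - MEAN_BGR             # broadcast subtract per channel
```

Note there is no division by 255 or by a standard deviation: the model expects the raw 0-255 range with only the mean removed.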

schudov commented 5 years ago

Thanks

vmirly commented 5 years ago

In the paper (page 4, in section Training Implementation Details) it is mentioned that "The mean value of each channel is subtracted for each pixel". But this sentence could mean two things: the mean of each channel across the training dataset, or the mean of each image computed and used independently. Based on the above answer by @WeidiXie, I assume it is the mean of each channel for each image independently; I just want to confirm that. Otherwise, the specific mean vector that was used for training would be needed for evaluation. Can you please clarify?

schudov commented 5 years ago

I assume he meant the mean channel values over all pictures in the dataset. You can find the vector with those values in the archives of the provided pretrained models, in the readme file. Please close the issue. Regards


WeidiXie commented 5 years ago

@schudov Thanks.

vmirly commented 5 years ago

I see! I found the mean vector [91.4953, 103.8827, 131.0912] (in BGR order).

Thank you very much!

MarioProjects commented 5 years ago

I have used this code:

x_temp = x_temp.astype(np.float32)  # must be float before in-place subtraction
x_temp = x_temp[..., ::-1]          # RGB -> BGR
x_temp[..., 0] -= 91.4953           # B mean
x_temp[..., 1] -= 103.8827          # G mean
x_temp[..., 2] -= 131.0912          # R mean

Should we not also normalize to the 0 -> 1 range, or divide by the std? I have not been getting good results loading the SENet and using it for verification on IJB-A.

WeidiXie commented 5 years ago

Hi, @MarioProjects

You only need to subtract the mean (the values you listed above),

no need to do any other normalisation.

Best, Weidi

schudov commented 5 years ago

You should also mention that your code example is valid for Pillow, not for OpenCV.
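The distinction can be sketched with a small helper (the `channel_order` flag is my own addition for illustration; Pillow loads images in RGB order while OpenCV's `cv2.imread` returns BGR, so only the Pillow case needs the channel flip):

```python
import numpy as np

MEAN_BGR = np.array([91.4953, 103.8827, 131.0912])

def preprocess(img, channel_order="RGB"):
    """Subtract the VGGFace2 channel means; img is HxWx3 uint8, 0-255."""
    x = img.astype(np.float32)
    if channel_order == "RGB":  # Pillow-style loading
        x = x[..., ::-1]        # flip RGB -> BGR
    # OpenCV images are already BGR, so no flip is needed
    return x - MEAN_BGR
```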


MarioProjects commented 5 years ago

Thanks @schudov for your fast response! I am launching the model now with IJB-A split 1... If it does not work at all, could you help me if I share the code?

schudov commented 5 years ago

I'll have a look, sure


MarioProjects commented 5 years ago

mmm @schudov I read the files as follows:

from PIL import Image
def load_img(path):
    return np.array(Image.open(path).convert('RGB'))

should I change that for

    return np.array(Image.open(path).convert('BGR'))

??

schudov commented 5 years ago

Assuming your images are loaded as RGB, I did something like this (sorry for the terrible code):

import PIL
import torchvision as tv
import torch as t

def rotate_channels(img):
    # reverse the band order: RGB -> BGR
    return PIL.Image.merge("RGB", list(img.split())[::-1])

to_tensor = tv.transforms.ToTensor()
# ToTensor scales to [0, 1], so multiply by 255 to restore the 0-255 range
x = to_tensor(rotate_channels(PIL.Image.open('blah'))) * 255

then you feed x to the model


schudov commented 5 years ago

@MarioProjects I've corrected my comment, this should work as intended.

MarioProjects commented 5 years ago

Hey @schudov lot of thanks for your help!

I updated my script with your code, but the distance between two images of the same subject is still high (0.74 for the first subject). Could you check it? https://github.com/MarioProjects/code_share/blob/master/ijb-a_verification-Pretrained.ipynb

Maybe the problem is now in measure_distance()... I am not using MTCNN bounding boxes, but it should work more or less the same; there should not be much difference, or so I think.

Thanks again for your interest!

WeidiXie commented 5 years ago

@MarioProjects

You are computing cosine similarity, not distance.

Best Weidi

MarioProjects commented 5 years ago

So I should use https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_distances.html , which "is defined as 1.0 minus the cosine similarity"; that makes much more sense of the 0.74 🗡 Are the previous L2 normalizations okay, really?
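For reference, the two quantities relate as follows (a generic sketch, not the repo's evaluation code; with a similarity a 0.74 score between matching faces is reasonable, whereas a 0.74 distance would not be):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

def cosine_distance(a, b):
    # sklearn's cosine_distances uses the same definition: 1 - similarity
    return 1.0 - cosine_similarity(a, b)
```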

WeidiXie commented 5 years ago

We have provided the evaluation code. Please check it.

schudov commented 5 years ago

@MarioProjects It seems like you are not using model.eval(). Other than that, I suggest you have a look at the code provided, as Weidi mentioned.

MarioProjects commented 5 years ago

I have been trying and trying to correct my code following the indications, but the results I have obtained are not as expected:

With FAR 0.001 -> TAR mean: 0.4570
With FAR 0.01 -> TAR mean: 0.8143
With FAR 0.1 -> TAR mean: 0.9968

The TAR at FAR 0.1 is better than the one reported in the paper, but the others are significantly worse. I have updated the code and added how I extract the TAR values in verification_study.... https://github.com/MarioProjects/code_share The main difference is that I am using the bounding boxes provided by IJB... or so I think. Could you help me?

PS: I did model.eval()
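For what it's worth, TAR at a fixed FAR is typically computed from genuine and impostor similarity scores like this (a generic sketch of the standard definition, not the repo's evaluation code; the function name is mine):

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, far):
    """Pick the threshold at which a fraction `far` of impostor pairs is
    accepted, then report the fraction of genuine pairs accepted there."""
    threshold = np.quantile(impostor_scores, 1.0 - far)
    return float(np.mean(np.asarray(genuine_scores) >= threshold))
```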

wfcb-85 commented 5 years ago

Hello, on the page of another issue (https://github.com/ox-vgg/vgg_face2/issues/10), it is mentioned that a rescaling of the bounding box is also necessary. Would the pre-processing steps then be:

  1. Resize the bounding box
  2. Load image (0-255) --> crop the face with the resized bounding box --> subtract channel-wise mean --> pass to model

Is it possible to get the code for loading the images?
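A hedged sketch of those steps (the 0.3 expansion factor below is only a placeholder for illustration; check issue #10 for the factor actually used, and the means are the BGR values quoted earlier in this thread):

```python
import numpy as np

MEAN_BGR = np.array([91.4953, 103.8827, 131.0912])

def expand_box(x, y, w, h, factor=0.3):
    """Enlarge a bounding box symmetrically; `factor` is a placeholder."""
    dx, dy = w * factor / 2, h * factor / 2
    return x - dx, y - dy, w + 2 * dx, h + 2 * dy

def preprocess(img_bgr, box):
    """img_bgr: HxWx3 uint8 BGR image; box: (x, y, w, h)."""
    x, y, w, h = expand_box(*box)
    x0, y0 = max(int(x), 0), max(int(y), 0)
    x1, y1 = int(x + w), int(y + h)
    face = img_bgr[y0:y1, x0:x1].astype(np.float32)
    return face - MEAN_BGR  # channel-wise mean subtraction
```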