Hi @schudov,
The preprocessing is just:
Load image (0-255) --> subtract channel-wise mean --> pass to model
Best,
Weidi
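In code, that pipeline might look like the following minimal sketch (the mean vector comes from later in this thread, and `'face.jpg'` is a placeholder path, not part of the official loading code):

```python
import numpy as np
from PIL import Image

# Channel-wise means in BGR order (the vector confirmed later in this thread).
MEAN_BGR = np.array([91.4953, 103.8827, 131.0912], dtype=np.float32)

img = np.array(Image.open('face.jpg').convert('RGB'), dtype=np.float32)
img = img[..., ::-1] - MEAN_BGR   # RGB -> BGR, then subtract the means
# img (still roughly in the 0-255 range, mean-subtracted) is fed to the model.
```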
Thanks
In the paper (page 4, in the section on training implementation details) it is mentioned that "The mean value of each channel is subtracted for each pixel". But this sentence could mean two things: the mean of each channel across the training dataset, or the mean of each image computed and used independently. Based on the above answer by @WeidiXie, I assume it is the mean of each channel for each image independently, so I just want to confirm that. Otherwise, the specific mean vector that was used for training would be needed for evaluation. Can you please clarify?
I assume he meant the mean channel values across all pictures in the dataset. You can find the vector with those values in the README file inside the archives of the provided pretrained models. Please close the issue. Regards
@schudov Thanks.
I see! I found the mean vector [91.4953, 103.8827, 131.0912] (in BGR order). Thank you very much!
I have used this code:

```python
# x_temp must be a float array (e.g. x_temp = x_temp.astype(np.float32)),
# otherwise the in-place subtraction below fails on uint8 pixels.
x_temp = x_temp[..., ::-1]   # RGB -> BGR
x_temp[..., 0] -= 91.4953    # subtract B mean
x_temp[..., 1] -= 103.8827   # subtract G mean
x_temp[..., 2] -= 131.0912   # subtract R mean
```
Shouldn't we then normalize to the 0-1 range or by the std? I am not getting good results loading the SENet and using it for verification on IJB-A.
Hi @MarioProjects,
You only need to subtract the mean (the values you listed above); no need to do any other normalisation.
Best,
Weidi
You should also mention that your code example is valid for Pillow, not for OpenCV, since OpenCV already loads images in BGR order.
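For illustration, a minimal sketch of the OpenCV variant (the filename is a placeholder; `cv2.imread` returns BGR directly, so no channel flip is needed):

```python
import cv2
import numpy as np

# cv2.imread already returns the image in BGR channel order,
# so the channel flip from the Pillow example above is not needed.
img = cv2.imread('face.jpg').astype(np.float32)
img -= np.array([91.4953, 103.8827, 131.0912], dtype=np.float32)
```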
Thanks @schudov for your fast response! I am launching the model now with IJB-A split 1... If it does not work at all, could you help me if I share the code?
I'll have a look, sure
Hmm @schudov, I read the files as follows:

```python
import numpy as np
from PIL import Image

def load_img(path):
    return np.array(Image.open(path).convert('RGB'))
```

Should I change that to

```python
    return np.array(Image.open(path).convert('BGR'))
```

??
Assuming your images are loaded as RGB, I did something like this (sorry for the terrible code):

```python
import PIL.Image
import torchvision as tv

def rotate_channels(img):
    # Reverse the channel order: RGB -> BGR.
    return PIL.Image.merge("RGB", list(img.split())[::-1])

to_tensor = tv.transforms.ToTensor()          # scales pixels to [0, 1]
x = to_tensor(rotate_channels(PIL.Image.open('blah'))) * 255  # back to 0-255
```

Then you feed x to the model (after subtracting the channel means discussed above).
@MarioProjects I've corrected my comment; this should work as intended.
Hey @schudov, lots of thanks for your help!
I updated my script with your code, but the distance between two images of the same subject is still high (0.74 for the first subject). Could you check it? https://github.com/MarioProjects/code_share/blob/master/ijb-a_verification-Pretrained.ipynb
Maybe the problem is now in measure_distance()... I am not using the MTCNN bounding boxes, but it should work more or less the same; there should not be that much difference, or so I think.
Thanks again for your interest!
@MarioProjects
You are computing cosine similarity, not distance.
Best,
Weidi
So I should use https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_distances.html, which "is defined as 1.0 minus the cosine similarity"; that makes much more sense of the 0.74 value 🗡 Are the previous L2 normalizations okay, really?
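A quick sketch of the relationship (the embedding size of 2048 is just an illustrative assumption). Note that cosine similarity already divides by the vector norms, so L2-normalizing beforehand does not change the result:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_distances, cosine_similarity

# Two hypothetical embedding vectors (the size is arbitrary here).
a = np.random.randn(1, 2048)
b = np.random.randn(1, 2048)

# cosine_distances = 1.0 - cosine_similarity, elementwise.
assert np.allclose(cosine_distances(a, b), 1.0 - cosine_similarity(a, b))

# L2-normalizing first leaves the cosine values unchanged.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
assert np.allclose(cosine_similarity(a, b), cosine_similarity(a_n, b_n))
```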
We have provided the evaluation code. Please check it.
@MarioProjects It seems like you are not using model.eval(). Other than that, I suggest you have a look at the provided code, as Weidi mentioned.
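For reference, a minimal sketch of inference-mode feature extraction (the ResNet-50 here is only a stand-in for the actual VGGFace2 model, and the random tensor stands in for a preprocessed face crop):

```python
import torch
import torchvision.models as models

model = models.resnet50()            # stand-in network for illustration
model.eval()                         # switch BatchNorm/Dropout to inference mode
with torch.no_grad():                # no gradients needed for feature extraction
    x = torch.randn(1, 3, 224, 224)  # placeholder preprocessed face crop
    features = model(x)
```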
I have been trying and trying to correct my code following the indications, but the results I obtain are not as expected:

With FAR 0.001 -> TAR mean: 0.4570
With FAR 0.01 -> TAR mean: 0.8143
With FAR 0.1 -> TAR mean: 0.9968

The TAR at FAR 0.1 is better than the one reported in the paper, but the others are significantly worse. I have updated the code and added how I extract the TAR values in verification_study.... https://github.com/MarioProjects/code_share The main difference is that I am using the bounding boxes provided by IJB... or this is what I think. Could you help me?
PS: I did call model.eval().
Hello. On the page of another issue (https://github.com/ox-vgg/vgg_face2/issues/10), it is mentioned that a rescaling of the bounding box is also necessary. Would the pre-processing steps then be: load the image, extend/rescale the bounding box, crop, and subtract the channel-wise mean?
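A hedged sketch of what such a pipeline might look like, assuming a bounding-box extension factor of 0.3 (based on the discussion in issue #10), a final 224x224 crop, and the mean vector from earlier in this thread; the extension factor and crop size are assumptions, not confirmed here:

```python
import numpy as np
from PIL import Image

MEAN_BGR = np.array([91.4953, 103.8827, 131.0912], dtype=np.float32)

def preprocess(path, box, extend=0.3):
    # box = (x, y, w, h); the extension factor 0.3 is an assumption
    # taken from the discussion in issue #10.
    x, y, w, h = box
    dx, dy = w * extend, h * extend
    img = Image.open(path).convert('RGB')
    face = img.crop((int(x - dx), int(y - dy), int(x + w + dx), int(y + h + dy)))
    face = face.resize((224, 224))                        # assumed input size
    arr = np.array(face, dtype=np.float32)[..., ::-1]     # RGB -> BGR
    return arr - MEAN_BGR
```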
Is it possible to get the code for loading the images?
Hi, I couldn't find details of how the pictures are normalized prior to being fed into the network, except for the vector of mean values per channel. Could you give more details regarding normalization? What is the range of the tensor values fed into the network?