pterhoer / FaceImageQuality

Code and information for face image quality assessment with SER-FIQ

about the result #3

Closed sunny0315 closed 4 years ago

sunny0315 commented 4 years ago

I tried to run 'serfiq_example.py' and got results for the two test images you provided in './data/'. The score of 'test_img.jpeg' is 0.89, and the score of 'test_img2.jpeg' is 0.87. Are these results correct? I also tested other images, such as side (profile) faces and frontal faces, but their scores are nearly indistinguishable. Is there something wrong with how I am using it? Thank you for your reply!

codelilei commented 4 years ago

I have the same problem as above. I also tried saving 'pre_fc1_bias.npy' and 'pre_fc1_weights.npy' manually, since the ArcFace pretrained model is not specified by the author, but the problem still exists.

jankolf commented 4 years ago

Hi, the results or scores seem to be correct.

@sunny0315: If you look at the following paper, which is also referenced in the repository, you will see that the score distribution of SER-FIQ on ArcFace is mostly in the range of 0.86-0.90: https://arxiv.org/abs/2004.01019. Here is an image of the score distribution for different ethnicities on the ColorFeret dataset. The score changes depending on how well ArcFace can handle an input image. Details can be found in the two papers in the repository. If ArcFace/InsightFace is able to produce a good, stable embedding from the image, the score will be higher than for an image where ArcFace is "less stable".

@codelilei : The weights stored in the data folder are the weights that were used for the experiments. We have used a pretrained model from Insightface's model zoo. The model used in our experiments is the "LResNet100E-IR,ArcFace@ms1m-refine-v2" pre-trained model, downloaded in April 2019.

If you have further questions, please do not hesitate to ask them.

Best regards, Jan

pterhoer commented 4 years ago

Hi everyone, first of all, thanks for your interest in our work!

As Jan already mentioned, SER-FIQ on ArcFace unfortunately produces quality estimates in a very narrow range. Although this narrow range is inconvenient, it is still meaningful if you take more than two decimal places into account.

To get a more "natural" quality range, you can simply use a scaling method such as MinMax normalization. Or, if you are interested, we can add a scaling parameter to the model so that it outputs quality scores in a convenient range of [0,1].
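A minimal sketch of such a MinMax scaling, assuming the raw scores have already been computed for a batch of images. The helper name `minmax_scale_scores` and the example values are hypothetical and not part of the repository.

```python
import numpy as np

def minmax_scale_scores(scores, lo=None, hi=None):
    """Linearly map raw quality scores to [0, 1].

    lo/hi default to the min/max of the given scores. This is a
    hypothetical helper, not part of the SER-FIQ repository.
    """
    scores = np.asarray(scores, dtype=np.float64)
    lo = scores.min() if lo is None else lo
    hi = scores.max() if hi is None else hi
    if hi <= lo:
        # All scores identical (or invalid bounds): nothing to rescale.
        return np.zeros_like(scores)
    # Clip so that scores outside fixed bounds still land in [0, 1].
    return np.clip((scores - lo) / (hi - lo), 0.0, 1.0)

# Illustrative raw scores in the narrow range discussed in this thread.
raw = [0.8932, 0.8711, 0.8604, 0.8987]
scaled = minmax_scale_scores(raw)
print(scaled)
```

Scaling is monotonic, so the ranking of the images does not change. Passing fixed `lo`/`hi` bounds taken from a reference dataset, instead of the per-batch minimum and maximum, would keep scores comparable across batches.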

Best, Philipp

codelilei commented 4 years ago

> The model used in our experiments is the "LResNet100E-IR,ArcFace@ms1m-refine-v2" pre-trained model

@jankolf yeah, I got exactly the same weights saved from "LResNet100E-IR,ArcFace@ms1m-refine-v2" downloaded yesterday.

I was mainly confused by the fact that a profile face image could also get a score higher than 0.8, which seemed inconsistent with the first impression given by the distribution figure in your other paper, https://arxiv.org/abs/2004.01019.

> the score distribution of SER-FIQ on Arcface is mostly in the range of 0.86-0.90

> To get a more "natural" quality range, you can simply use scaling methods, such as MinMax normalization.

Now it makes sense; I had not noticed where the x-axis starts. I will try more images later and focus on the relative ordering of the scores.

Nice work! Thanks for your quick reply!

sunny0315 commented 4 years ago

@jankolf @pterhoer Thanks for your quick reply! At first, I thought that this score could be used to handle problems such as pose, occlusion, and expression, as I understood from the paper https://arxiv.org/pdf/2003.09373.pdf. So I collected some representative pictures, but their scores did not seem to correlate strongly with these factors, and I cannot filter them just by setting a score threshold (that would probably lead to misjudgments). Maybe this method is better used to judge whether an image is suitable for a given recognition system.

pterhoer commented 4 years ago

@codelilei Thanks for your feedback! If you find any other problems, just contact us.

@sunny0315 The face quality score of SER-FIQ reflects how well the deployed face recognition model can deal with the input image. If your network can deal well with various poses, occlusions, and expressions, SER-FIQ will not produce low quality values. In this case, I would recommend using a network that is not robust to such variations. Then SER-FIQ will produce low quality values for images with these variations. (By the way, the same goes for biased networks: if the deployed network is biased, the obtained quality values will contain the same bias.)

sunny0315 commented 4 years ago

@pterhoer I see. Thank you very much!

pterhoer commented 4 years ago

> @pterhoer I see. Thank you very much!

You are welcome :)

RyanCV commented 4 years ago

Hi @pterhoer

> If your network can deal well with various poses, occlusions, and expressions, SER-FIQ will not produce low quality values. In this case, I would recommend using a network that is not robust to such variations.

Then what is the purpose of using SER-FIQ? It seems that SER-FIQ is strongly dependent on, and positively correlated with, the network, right? If so, how reliable is the face quality value it predicts?

pterhoer commented 4 years ago

Hi again RyanCV,

If your network can deal well with variations such as poses, occlusions, and expressions, it will produce relatively stable representations. Stable representations lead to less variation in the stochastic embeddings and thus to high robustness and high quality estimates.
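The link between embedding stability and the quality score can be illustrated with simulated data. This is a sketch under assumptions, not the repository's implementation: random vectors stand in for the stochastic (dropout) forward passes of a real network, and the score is assumed to be 2 * sigmoid of the negative mean pairwise Euclidean distance between L2-normalized embeddings, following the definition in the SER-FIQ paper. The names `serfiq_like_score` and `simulate_embeddings` are hypothetical.

```python
import numpy as np

def serfiq_like_score(embeddings):
    """Quality from the spread of stochastic embeddings (assumed formula)."""
    x = np.asarray(embeddings, dtype=np.float64)
    # L2-normalize each embedding so distances depend only on direction.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    # Pairwise Euclidean distances between all embeddings.
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Average over the distinct pairs (upper triangle, no diagonal).
    iu = np.triu_indices(len(x), k=1)
    mean_dist = dists[iu].mean()
    # 2 * sigmoid(-mean_dist): 1.0 for identical embeddings, lower otherwise.
    return 2.0 / (1.0 + np.exp(mean_dist))

def simulate_embeddings(rng, noise, m=100, dim=128):
    """Stand-in for m stochastic forward passes: one base vector plus noise."""
    base = rng.normal(size=dim)
    return base + noise * rng.normal(size=(m, dim))

rng = np.random.default_rng(0)
stable = serfiq_like_score(simulate_embeddings(rng, noise=0.05))
unstable = serfiq_like_score(simulate_embeddings(rng, noise=1.0))
print(f"stable: {stable:.4f}, unstable: {unstable:.4f}")
```

With small noise the simulated embeddings barely move, so the mean pairwise distance is small and the score stays close to 1; with large noise the score drops. A network that is robust to pose or occlusion behaves like the low-noise case for such images, which is why they still receive high scores.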

Best, Philipp