What is input of pre-trained model?

ZhangYuanhan-AI / CelebA-Spoof

[ECCV2020] A Large-Scale Face Anti-Spoofing Dataset


What is input of pre-trained model? #39

Closed: luan1412167 closed this issue 2 years ago

luan1412167 commented 3 years ago

Thanks for your repo. From a glance at your code, it seems the input of the model is a list of cropped RGB faces. Is that right?

luan1412167 commented 3 years ago

I tested with my webcam, and the score is always higher for class 1 (batch size = 5):

[[4.2142319e-06 9.9999583e-01]
 [1.2677202e-06 9.9999869e-01]
 [1.4011762e-06 9.9999857e-01]
 [1.9214897e-06 9.9999809e-01]
 [2.1527359e-05 9.9997842e-01]]

luan1412167 commented 3 years ago

What am I doing wrong?

kadirbeytorun commented 3 years ago

You need a face detector. Crop the facial area, resize the crop to 224x224, and then use it as input to the AENet network.
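The crop-and-resize step described above could be sketched as follows. This is a minimal NumPy-only illustration, not the repository's own preprocessing: the function name `crop_and_resize`, the box format `(x1, y1, x2, y2)`, and the nearest-neighbour resize are assumptions made for the example. In practice the box would come from a face detector, and the resize would more likely use OpenCV or Pillow.

```python
import numpy as np

def crop_and_resize(frame, box, size=224):
    """Crop box = (x1, y1, x2, y2) from an HxWx3 frame and resize to size x size.

    Hypothetical helper for illustration; the box format is an assumption.
    """
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    # Clamp the detector box to the frame so slicing never goes out of range.
    x1, x2 = max(0, int(x1)), min(w, int(x2))
    y1, y2 = max(0, int(y1)), min(h, int(y2))
    if x2 <= x1 or y2 <= y1:
        raise ValueError("empty face box after clamping")
    face = frame[y1:y2, x1:x2]
    # Nearest-neighbour resize: pick one source row/column per output pixel.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    return face[rows][:, cols]

frame = np.zeros((720, 1280, 3), dtype=np.uint8)      # stand-in for a webcam frame
face = crop_and_resize(frame, (500, 200, 780, 560))   # made-up detector box
print(face.shape, face.dtype)
```

The result keeps the `uint8` dtype of the original frame, which matters if a later step expects an 8-bit image.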

luan1412167 commented 3 years ago

@kadirbeytorun thanks for your answer. I used RetinaFace and cropped only the facial area, then resized it to 224x224, but the model always returns a high probability for class 1. I think the model does not generalize well.

kadirbeytorun commented 3 years ago

Could you share your output please?

The model actually generalizes pretty well. It is heavily affected by lighting conditions, but it works well with high-quality cameras, even on embedded systems.

luan1412167 commented 3 years ago

@kadirbeytorun I just tested with my webcam at 1280x720 resolution. The output is always close to the values above for both the live and spoof cases. Can you share your test script and model? Thank you.

kadirbeytorun commented 3 years ago

Try applying softmax to the output of the network. You can use the softmax function from SciPy or PyTorch.
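For reference, softmax turns each row of raw network scores into probabilities that sum to 1. Below is a minimal NumPy sketch; the helper name and the sample logits are made up for illustration and do not come from the model. One thing worth checking first: the rows posted earlier in this thread already sum to roughly 1, which suggests (but does not prove) that softmax had already been applied to them.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtracting the row maximum avoids overflow in exp without changing the result.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

logits = np.array([[2.0, -1.0], [0.5, 0.5]])  # made-up scores for two images
probs = softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```

If the values are already probabilities, applying softmax a second time only flattens them toward each other, so it is worth confirming what the network returns before adding this step.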

Also, your output is a little confusing to me. Can you send the two cases separately? Like: spoofoutput=..... liveoutput=....

luan1412167 commented 3 years ago

Sorry, I'm not in my lab; I will send it tomorrow. Also, the output above is for 5 real faces in a batch. For 5 fake faces in a batch, the result is close to that. Is your script in the intra_dataset_code folder?

luan1412167 commented 3 years ago

Hi @kadirbeytorun ,

output fake image [[0.99551105 0.00448893]]
output real image [[0.9974425 0.00255755]]

The outputs look almost the same. Can you check with the two images above? Thanks.

kadirbeytorun commented 3 years ago

Do you normalize your input before sending it to the network, mate? You need to divide it by 255, as far as I remember.

luan1412167 commented 3 years ago

Thanks for your answer. With the script in the intra_dataset_code folder, the model input seems to be a uint8 image. I divided it by 255 and got this error:

raise TypeError("Cannot handle this data type: %s, %s" % typekey)
TypeError: Cannot handle this data type: (1, 1, 3), <f8
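The `<f8` in that message is NumPy's code for `float64`, and the message has the same form as the `TypeError` that Pillow's `Image.fromarray` raises for array types it cannot map to an image mode. One plausible reading, assuming the script converts the frame to a PIL image internally (the script is not shown here, so this is a guess), is that dividing by 255 changed the dtype before that conversion. The sketch below only demonstrates the dtype change; the array is a made-up stand-in for a cropped face.

```python
import numpy as np

face = np.full((224, 224, 3), 128, dtype=np.uint8)  # stand-in for a cropped face

# True division promotes uint8 to float64, which is what '<f8' refers to.
scaled = face / 255
print(scaled.dtype)

# If a float input is really required, float32 is the usual choice for networks.
as_float32 = face.astype(np.float32) / 255.0
print(as_float32.dtype)

# If a later step expects an 8-bit image, convert back before passing it on.
restored = np.clip(np.rint(scaled * 255), 0, 255).astype(np.uint8)
print(restored.dtype)
```

If the pipeline already scales pixels internally, dividing by 255 beforehand would not be needed at all, which is consistent with the later advice in this thread that the predict function requires no extra preprocessing.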

kadirbeytorun commented 3 years ago

Can you please share your demo script? I am not the owner of this repo and I don't know the code that well, but maybe I can help more if you show me your script.

luan1412167 commented 3 years ago

This is my script, which I modified from this file in the repo. Thank you.

kadirbeytorun commented 3 years ago

You need to grant access to that file in your Drive.

luan1412167 commented 3 years ago

Sorry for my mistake, you can get the file here.

kadirbeytorun commented 3 years ago

Ohh, you are already using the predict function of the repository, so there is no need for any preprocessing or postprocessing.

I don't see anything wrong with your script. The only thing that comes to mind is your camera quality, lighting quality, etc.

Try taking some high-quality photos and test with them. Like I said, camera quality and lighting conditions affect the result massively.

For example, with good lighting it works well even with an NIR camera, but when lighting is low it automatically assumes the frame is fake. Also, don't wear a surgical mask in your trials; the network confuses that with a face-mask attack.