ZhangYuanhan-AI / CelebA-Spoof

[ECCV2020] A Large-Scale Face Anti-Spoofing Dataset

Question about the liveness data in CelebA_Spoof dataset. #5

Closed Zhengtq closed 2 years ago

Zhengtq commented 4 years ago

The live face data in the original CelebA dataset is mostly crawled from the web, so it inevitably contains many computer-enhanced (photoshopped) pictures. Those faces do not look very real compared to pictures shot by a camera; visually they look more like spoof faces than real faces. So I doubt whether it is proper to treat such unreal faces as live in your dataset. The pictures below are from the live data but visually look more like spoof faces.

(attached example images: 505815, 552109, 508796, 497919)

ZhangYuanhan-AI commented 4 years ago

In the context of face anti-spoofing (presentation attack detection), which is different from forensics tasks such as deepfake detection, most computer-enhanced pictures can still be considered live.

ghost commented 4 years ago

@Davidzhangyuanhan Did you do any data cleaning during data collection?

ZhangYuanhan-AI commented 4 years ago

We did. But given the large scale of CelebA-Spoof, we cannot guarantee the cleaning is completely thorough.

hamhanry commented 3 years ago

@Davidzhangyuanhan But the evaluation and training results you showed in the experiment tables all use the CelebA-Spoof dataset without any further cleaning, right?

ZhangYuanhan-AI commented 3 years ago

@Davidzhangyuanhan But the evaluation and training results you showed in the experiment tables all use the CelebA-Spoof dataset without any further cleaning, right?

Yes.

hamhanry commented 3 years ago

@Davidzhangyuanhan Thanks for your answer. However, I am trying to evaluate the given pre-trained model, and the results are quite different from what is written in the paper.

About the training hyper-parameters: is the loss weight on each branch (semantic, illumination, and so on) a multiplier applied per batch? Also, could you share the training hyper-parameters here, such as batch size and the SGD settings (weight decay and others)?

Thank you

ZhangYuanhan-AI commented 3 years ago

@Davidzhangyuanhan Thanks for your answer. However, I am trying to evaluate the given pre-trained model, and the results are quite different from what is written in the paper.

About the training hyper-parameters: is the loss weight on each branch (semantic, illumination, and so on) a multiplier applied per batch? Also, could you share the training hyper-parameters here, such as batch size and the SGD settings (weight decay and others)?

Thank you

Hi,

  1. "But the result is quite different from what's written in the paper": for which benchmark? The released model is for the intra-dataset test.
  2. "Is it a multiplier for each batch?": I don't really understand this question; could you please explain it a little bit?
  3. Batch size and SGD parameters such as weight decay: batch size is 128 × 8 (GPUs); SGD weight decay is 10^(-5) or 10^(-4).
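
A minimal sketch of those optimizer settings, assuming PyTorch; the model, learning rate, and momentum are placeholders of mine, not values confirmed in this thread:

```python
# Minimal sketch of the optimizer settings mentioned above (PyTorch assumed).
# The model, learning rate, and momentum are placeholders; only the weight
# decay and the 128-per-GPU x 8-GPU batch size come from this thread.
import torch
import torch.nn as nn

model = nn.Linear(512, 2)          # stand-in for the real AENet
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.005,                      # placeholder; use the learning rate from the paper
    momentum=0.9,                  # assumed common default
    weight_decay=1e-5,             # 10^(-5) or 10^(-4) as noted above
)
# Effective batch size: 128 per GPU x 8 GPUs = 1024.
```
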
hamhanry commented 3 years ago

@Davidzhangyuanhan

Thanks for the answer

  1. The result I mentioned is from the intra-dataset test as well. I am trying to evaluate using the same metrics mentioned in the paper, but the numbers are quite different. I wonder whether there is anything in the evaluation I should pay attention to?
  2. The lambda values: is each lambda multiplied per batch, or do you set the weights directly when initializing the loss function?

thank you

ZhangYuanhan-AI commented 3 years ago

@Davidzhangyuanhan

Thanks for the answer

  1. The result I mentioned is from the intra-dataset test as well. I am trying to evaluate using the same metrics mentioned in the paper, but the numbers are quite different. I wonder whether there is anything in the evaluation I should pay attention to?
  2. The lambda values: is each lambda multiplied per batch, or do you set the weights directly when initializing the loss function?

thank you

  1. Which value is different exactly?
  2. The latter: the weights are set directly when the loss function is initialized.
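
For what it's worth, a minimal sketch of what "setting the weights when initializing the loss function" can look like in PyTorch; the class name, branch losses, and default lambda values are placeholders, not the released AENet code:

```python
# Minimal sketch of option 2: the lambda weights are fixed once when the loss
# is constructed, not re-scaled per batch. The class name, branch losses, and
# default lambda values are placeholders, not the released AENet code.
import torch.nn as nn

class MultiTaskLoss(nn.Module):
    def __init__(self, lambda_semantic=0.1, lambda_geometric=0.1):
        super().__init__()
        self.lambda_semantic = lambda_semantic     # set once, at init
        self.lambda_geometric = lambda_geometric   # set once, at init
        self.cls_loss = nn.CrossEntropyLoss()      # live/spoof branch
        self.sem_loss = nn.CrossEntropyLoss()      # semantic branch (placeholder)
        self.geo_loss = nn.MSELoss()               # depth/reflection branch (placeholder)

    def forward(self, cls_out, sem_out, geo_out, cls_gt, sem_gt, geo_gt):
        return (self.cls_loss(cls_out, cls_gt)
                + self.lambda_semantic * self.sem_loss(sem_out, sem_gt)
                + self.lambda_geometric * self.geo_loss(geo_out, geo_gt))
```
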
hamhanry commented 3 years ago

@Davidzhangyuanhan Evaluating your published pre-trained model on the intra-dataset CelebA-Spoof test set, here are the results I got: AUC: 97.2475, EER: 2.75246, APCER: 10.9327, BPCER: 0.13702, ACER: 5.5348. I calculated those metrics following this reference: http://chalearnlap.cvc.uab.es/challenge/33/track/33/metrics/

When I train the model myself, it also ends up with quite high numbers of false positives and false negatives, so I wonder whether I missed something important in either training or evaluation.

The evaluation of your pre-trained model should be around the numbers reported in the paper, but mine is quite different.
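
For reference, here is roughly how I compute those metrics from per-image scores, following the definitions at the link above; the function and variable names are my own, assuming labels of 1 = live and 0 = spoof, scores as the predicted live probability, and a fixed 0.5 threshold:

```python
# Rough version of my metric computation (labels: 1 = live, 0 = spoof; scores
# are the predicted live probability; threshold fixed at 0.5). Names are mine.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def pad_metrics(labels, scores, threshold=0.5):
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    preds = (scores >= threshold).astype(int)          # 1 = predicted live

    spoof, live = labels == 0, labels == 1
    apcer = float(np.mean(preds[spoof] == 1))          # attacks accepted as live
    bpcer = float(np.mean(preds[live] == 0))           # live rejected as spoof
    acer = (apcer + bpcer) / 2

    fpr, tpr, _ = roc_curve(labels, scores)            # sweep all thresholds
    frr = 1 - tpr
    eer = float(fpr[np.nanargmin(np.abs(fpr - frr))])  # point where FAR ~= FRR
    auc = roc_auc_score(labels, scores)
    return {"AUC": auc, "EER": eer, "APCER": apcer, "BPCER": bpcer, "ACER": acer}
```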

hamhanry commented 3 years ago

@Davidzhangyuanhan Could you share the details of the training and evaluation transforms applied to the input image?

hamhanry commented 3 years ago

@Davidzhangyuanhan Hi, I would like to post my previous question again in case you did not notice it. Could you share your data augmentation setup for training? Thank you.

ZhangYuanhan-AI commented 3 years ago

@Davidzhangyuanhan Hi, I would like to post my previous question again in case you did not notice it. Could you share your data augmentation setup for training? Thank you.

I hope the information below (copied from an e-mail exchange between me and another researcher) gives you what you need. In addition, here are the details of the data preprocessing pipeline I am currently using:

Training: (a) face-cropped image -> (b) Resize(size=224) -> (c) ColorJitter(brightness=0, contrast=0, saturation=1, hue=0)
Testing: (a) face-cropped image -> (b) Resize(size=224)
In other words, no crop-related manipulation.

-----***------
E-mail:

Q1: Can you share the list of transforms you use during training and testing, same as I showed above? I just want to make sure the data pipeline is aligned.

No other transform except ColorJitter. Normalization might be useful but is not essential.

Q2: Can you share the detailed training recipe?

(a) What is the learning rate of each branch of your multi-task AENet?

The same as detailed in the paper. But recently I tried the Adam optimizer with a learning rate of 0.001, and it performs a little bit better.

Q3: What kind of learning rate scheduler did you use for each branch? (e.g. cosine, multi-step)

I find a multi-step schedule is not very important in my setting (batch size = 1024), so I do not use any LR steps at all.

Q4: Did you use learning rate warm-up?

No warm-up.

Q5: Did you try any ImageNet training tricks such as mix-up, no bias decay, or zero gamma?

No.
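
A minimal sketch of the preprocessing pipeline described above, written with torchvision; this is a transcription under those assumptions, so the original training code may differ in details:

```python
# Minimal transcription of the pipeline above using torchvision; the original
# training code may differ in details (e.g. exact Resize signature, ToTensor).
from torchvision import transforms

train_transform = transforms.Compose([
    # input is already the face-cropped image
    transforms.Resize((224, 224)),
    transforms.ColorJitter(brightness=0, contrast=0, saturation=1, hue=0),
    transforms.ToTensor(),          # normalization is optional per the note above
])

test_transform = transforms.Compose([
    transforms.Resize((224, 224)),  # no crop-related manipulation
    transforms.ToTensor(),
])
```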

hamhanry commented 3 years ago


Dear @Davidzhangyuanhan, thanks for your reply.

I have tried the configuration parameters you shared in the previous post, but I still cannot replicate the stable model you produced. I have also tried generating the reflection and depth maps using PRNet.

For now, thank you for the information. I will check my training code again in case I missed something. Thanks.

eesaeedkarimi commented 3 years ago

@Davidzhangyuanhan Did you do any data cleaning during data collection?

Despite the small number of mistakes in the live samples, when I tried the pre-trained model on my webcam and on videos recorded with my cellphone, the model seems biased toward spoof predictions. In my experiments almost no spoof samples are classified as real, but more than half of the real videos are classified as spoof with a high score.