Training with weight normalization only is tricky. I can train it if I fine-tune the model, but when I train it from scratch, the weights soon become NaN.
BTW, did you get 99% when normalizing the feature only, training on CASIA-Webface? I got worse results when I tried it.
What do you mean by fine-tuning? Fine-tuning from vanilla softmax?
Yes, I get 99% after PCA, trained on CASIA-Webface. I also add a scale layer after normalizing the feature. Did you learn the scale parameter? If you fix the scale parameter, you may get 99%.
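Concretely, the scale-after-normalization part is something like this (just a sketch; the layer/blob names and the initial value are placeholders, and lr_mult controls whether the scale is fixed or learned):

layer {
  name: "feat_normalize"
  type: "Normalize"        # L2-normalize the feature vector
  bottom: "fc5"
  top: "feat_normalize"
}
layer {
  name: "feat_scale"
  type: "Scale"            # multiply the normalized feature by a single scalar
  bottom: "feat_normalize"
  top: "feat_scale"
  param { lr_mult: 0 decay_mult: 0 }   # lr_mult: 0 keeps the scale fixed; set to 1 to learn it
  scale_param {
    num_axes: 0            # one scalar shared across all dimensions
    filler { value: 10 }   # placeholder initial value
    bias_term: false
  }
}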
BTW, using your aligned LFW, I can get 99.2167% with your NormFace model, but I can only get less than 99.1% on my self-aligned LFW using the same model you provided. I used your detection and alignment code here https://github.com/happynear/FaceVerification/blob/master/dataset/general_align.m without modifying the parameter settings, only changing some dataset paths.
Is there any difference in your alignment steps?
I mean fine-tuning from the center face model provided by Yandong (https://github.com/ydwen/caffe-face).
I also used this code to get the aligned images, but maybe with some changes to the parameters; I can't remember clearly what I changed. Are there any images in which MTCNN can't find a face?
OK, but the center face model can already get 99%, which means the feature extraction parameters are already good enough. Maybe I will try fine-tuning from vanilla softmax, which got ~98% in my experiments. I will let you know whether it works or not.
Sorry to hear that. In my case, a face can be found in all the LFW images. BTW, there are roughly 2000+ images from CASIA-Webface in which no face can be found; I just exclude them from the training set.
It seems it doesn't work well with weight normalization only. After fine-tuning from vanilla softmax, the best result I can get is only ~98.2%, with no real performance gain over the original model.
Ah, my results are similar: no better than the original model, but no worse either.
But in my experiment, normalizing the feature only led to worse results than the original model (trained with softmax + center loss).
Oh, OK. Besides, I get 99% when using feature normalization only, and it is trained from scratch, not fine-tuned.
That is also an impressive result. I can also get 99% using the C-contrastive loss from scratch, but I fail to get good results using the normalized softmax loss.
Hi @happynear, I noticed that in Table 2, normalization is the key factor in improving performance. So I tried to replicate your experiments with feature-only, weight-only, and both normalizations. For now, I can get 99% with feature-only normalization. But with weight-only, the result is only ~97%, even much worse than no normalization. The details of my weight normalization setup are below.
layer { name: "id_weight_ip" type: "Parameter" top: "id_weight_ip" param { lr_mult: 1 decay_mult: 0 } parameter_param { shape { dim: 10572 dim: 512 } blob_filler { type: "gaussian_unitball" } } } layer { name: "id_weight_ip_normalize" type: "Normalize" bottom: "id_weight_ip" top: "id_weight_ip_normalize" } layer { name: "id_weight_ip_scale" type: "Scale" bottom: "id_weight_ip_normalize" top: "id_weight_ip_scale" top: "SCALE" param { lr_mult: 0 decay_mult: 0 } scale_param { num_axes: 0 filler { value: 5 } bias_term: false } }
############## softmax loss ###############
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "fc5"
  bottom: "id_weight_ip_scale"
  top: "fc6"
  inner_product_param {
    num_output: 10572
    weight_filler { type: "xavier" }
    bias_term: false
  }
}
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc6"
  bottom: "label"
  top: "softmax_loss"
}
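I also tried letting the scale be learned; that variant only differs in the Scale layer's param block (a sketch, not the exact file):

# Learned-scale variant: same Scale layer as above, but with a non-zero
# lr_mult so the scalar is updated by backprop instead of staying fixed at 5.
layer {
  name: "id_weight_ip_scale"
  type: "Scale"
  bottom: "id_weight_ip_normalize"
  top: "id_weight_ip_scale"
  param { lr_mult: 1 decay_mult: 0 }
  scale_param {
    num_axes: 0
    filler { value: 5 }
    bias_term: false
  }
}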
If the scale is learned, the loss always diverges, since the scale soon grows until it becomes NaN. So I fixed the scale, varying it from 1 to 15; however, none of these values worked well. Could you please give me some advice about the settings for weight normalization? Thanks in advance.