TreB1eN / InsightFace_Pytorch

PyTorch 0.4.1 code for InsightFace
MIT License

maybe few differences with mxnet #37

Open ruiming46zrm opened 5 years ago

ruiming46zrm commented 5 years ago

Hi @TreB1eN, I have been studying your code for a few days; it's very nice work. Comparing the network with the mxnet version, I found two key differences:
1. In the bottleneck res layer, the mxnet version has a BN between the first conv and the PReLU.
2. Your shortcut method:

       if depth == in_channel:
           self.shortcut_layer = MaxPool2d(1, stride)
       else:
           self.shortcut_layer = Sequential(
               Conv2d(in_channel, depth, (1, 1), stride ,bias=False), BatchNorm2d(depth))

This makes the shortcut of the first unit of the first get_block a MaxPool2d(1, 2) rather than Conv + BN, because the input channel count (64) equals depth. That is different from the mxnet code. Maybe use:

          if stride == 1:
                ....
          else:
                ....
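A minimal sketch of the stride-based shortcut being suggested (the class name `ShortcutSketch` is illustrative, not the repo's actual class): when the input already matches the output, pass it through unchanged; otherwise downsample/project with a strided 1x1 Conv + BN, as the mxnet reference does.

```python
import torch
from torch.nn import BatchNorm2d, Conv2d, Identity, Module, Sequential

class ShortcutSketch(Module):
    """Hypothetical shortcut branch of a bottleneck unit."""
    def __init__(self, in_channel, depth, stride):
        super().__init__()
        if stride == 1 and in_channel == depth:
            # input already matches output: plain identity, no MaxPool2d needed
            self.shortcut_layer = Identity()
        else:
            # spatial or channel mismatch: 1x1 Conv + BN, matching mxnet
            self.shortcut_layer = Sequential(
                Conv2d(in_channel, depth, (1, 1), stride, bias=False),
                BatchNorm2d(depth))

    def forward(self, x):
        return self.shortcut_layer(x)

# stride 2 with in_channel == depth now goes through Conv + BN
# instead of MaxPool2d(1, 2)
x = torch.randn(2, 64, 56, 56)
print(ShortcutSketch(64, 64, 2)(x).shape)  # torch.Size([2, 64, 28, 28])
```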

I don't know whether this is the reason for the low MegaFace results. There are other differences too: mxnet doesn't seem to use the SE model to train res50, so why do we? And drop_ratio = 0.4 in mxnet but 0.6 in torch.
Do you have any suggestions?

I will change the above and train
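For point 1, a hedged sketch of what adding the missing BN could look like (class and layer ordering are illustrative, modeled on the mxnet bottleneck, not copied from the repo): a BatchNorm2d is inserted between the first Conv2d and the PReLU.

```python
import torch
from torch.nn import BatchNorm2d, Conv2d, Module, PReLU, Sequential

class ResLayerSketch(Module):
    """Hypothetical res branch with BN between first conv and PReLU."""
    def __init__(self, in_channel, depth, stride):
        super().__init__()
        self.res_layer = Sequential(
            BatchNorm2d(in_channel),
            Conv2d(in_channel, depth, (3, 3), 1, 1, bias=False),
            BatchNorm2d(depth),   # the BN missing in the PyTorch port
            PReLU(depth),
            Conv2d(depth, depth, (3, 3), stride, 1, bias=False),
            BatchNorm2d(depth))

    def forward(self, x):
        return self.res_layer(x)

out = ResLayerSketch(64, 64, 2)(torch.randn(2, 64, 56, 56))
print(out.shape)  # torch.Size([2, 64, 28, 28])
```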

TreB1eN commented 5 years ago

Yes, I noticed the difference a while ago. The drop_ratio is a mistake, and the choice of se_model was just arbitrary.

Please open a PR if you get better performance.

ruiming46zrm commented 5 years ago

With the modified network, a large batch_size = 384 (4 GPUs), lr = 0.1, milestones = [3, 6, 9, 12], and drop_ratio = 0.4, training ir_se50 gives:

   LFW accuracy: 99.78%
   AgeDB-30: 97.58%
   MegaFace rank-1: 96.4%

I think the drop ratio and batch size matter a lot.
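The reported schedule (lr = 0.1 decayed at milestones [3, 6, 9, 12]) can be sketched with PyTorch's MultiStepLR; the model and optimizer below are placeholders, and the decay factor gamma = 0.1 is an assumption since it isn't stated in the thread.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(512, 512)  # stand-in for ir_se50
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# lr is multiplied by gamma each time the epoch count hits a milestone
scheduler = MultiStepLR(optimizer, milestones=[3, 6, 9, 12], gamma=0.1)

for epoch in range(13):
    # ... one training pass over the epoch would go here ...
    optimizer.step()
    scheduler.step()

print(optimizer.param_groups[0]['lr'])  # 0.1 * 0.1**4 after all milestones
```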

bnu-wangxun commented 5 years ago

@ruiming46zrm can you share the modifications you made in detail, especially to the network?

ruiming46zrm commented 5 years ago

@bnulihaixia As above: 1. the shortcut; 2. the added BN.
You may try a large batch size and a large lr to get a better result.

bnu-wangxun commented 5 years ago

@ruiming46zrm thanks for your response.

ghost commented 4 years ago

Hi, do we need to normalize the Facescrub and MegaFace images before feeding them to the model to extract features? Thank you for reading.
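For reference, ArcFace-style pipelines typically map aligned face crops to roughly [-1, 1] (mean 0.5, std 0.5 per channel), and as far as I can tell this repo does the same; the function below is an illustrative sketch, not the repo's actual preprocessing code.

```python
import torch

def normalize_face(img_uint8):
    """HxWx3 uint8 image tensor -> 3xHxW float tensor in [-1, 1]."""
    x = img_uint8.permute(2, 0, 1).float() / 255.0  # CHW, [0, 1]
    return (x - 0.5) / 0.5                          # [-1, 1]

img = torch.randint(0, 256, (112, 112, 3), dtype=torch.uint8)
out = normalize_face(img)
print(out.shape)  # torch.Size([3, 112, 112])
```

The same transform should be applied to Facescrub and MegaFace crops as was used at training time, otherwise the extracted features will not be comparable.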

konioyxgq commented 4 years ago

Did you make these changes on purpose, and if so, why modify it like this? Or was it unintentional? @TreB1eN