biubug6 / Pytorch_Retinaface

RetinaFace gets 80.99% on the WIDER FACE hard val set using mobilenet0.25.

Difference of postprocess between yours and the original mxnet? #34

Closed: OPPOA113 closed this issue 4 years ago

OPPOA113 commented 5 years ago

What's the difference in postprocessing between pytorch and mxnet after getting the output blobs? Is the same method used to get the anchors at each output layer? I converted the pth model trained with your code to a caffe model and ran inference with this C++ implementation: https://github.com/clancylian/retinaface. But I got wrong results. Do you have any idea about this? Thanks.

biubug6 commented 5 years ago

I provide a C++ implementation of retinaface in https://github.com/biubug6/Face-Detector-1MB-with-landmark, whose postprocessing may help you.
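For anyone comparing the postprocessing step by step: the box decoding used by this line of implementations is the standard SSD-style decode. A self-contained sketch (the variances (0.1, 0.2) match this repo's default config; treat them as an assumption if your config differs):

```python
import torch

def decode(loc, priors, variances=(0.1, 0.2)):
    """Decode predicted offsets (from the BboxHead) back to corner boxes.

    loc:    [N, 4] predicted offsets
    priors: [N, 4] anchors as (cx, cy, w, h), normalized to [0, 1]
    """
    boxes = torch.cat((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],  # shift centers
        priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1)   # scale sizes
    boxes[:, :2] -= boxes[:, 2:] / 2  # (cx, cy) -> (x1, y1)
    boxes[:, 2:] += boxes[:, :2]      # (w, h)  -> (x2, y2)
    return boxes
```

A mismatch here (or in the anchor order it assumes) is the usual cause of "right model, wrong boxes" when porting between frameworks.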

OPPOA113 commented 5 years ago

Thanks for the quick reply, I will try it. If I swap the network inference from ncnn to caffe, and everything else is done the right way, should I get the right result with your provided C++ implementation?

biubug6 commented 5 years ago

Yes, all you have to do is adjust the min sizes in `create_anchor_retinaface()` of https://github.com/biubug6/Face-Detector-1MB-with-landmark/blob/master/Face_Detector_ncnn/FaceDetector.cpp: `minsize1 = {16, 32}`, `minsize2 = {64, 128}`, `minsize3 = {256, 512}`.
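For reference, a minimal python sketch of the anchor layout those min sizes correspond to (the steps of {8, 16, 32} and the normalized (cx, cy, w, h) format are assumptions based on the standard RetinaFace configuration; the C++ `create_anchor_retinaface()` should produce the same list):

```python
from itertools import product
from math import ceil

def retinaface_priors(im_h, im_w,
                      min_sizes=((16, 32), (64, 128), (256, 512)),
                      steps=(8, 16, 32)):
    """Normalized (cx, cy, w, h) priors, ordered scale -> row -> col -> anchor."""
    priors = []
    for k, step in enumerate(steps):
        fh, fw = ceil(im_h / step), ceil(im_w / step)
        for i, j in product(range(fh), range(fw)):
            for min_size in min_sizes[k]:
                cx = (j + 0.5) * step / im_w
                cy = (i + 0.5) * step / im_h
                priors.append([cx, cy, min_size / im_w, min_size / im_h])
    return priors

# e.g. a 640x640 input gives (80*80 + 40*40 + 20*20) * 2 = 16800 priors
print(len(retinaface_priors(640, 640)))  # 16800
```

This ordering is what the network's flattened outputs are matched against, so the C++ anchor loop must iterate scales, rows, columns and anchor sizes in the same order.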

OPPOA113 commented 5 years ago

Thank you very much! One more question: how should I deal with the output blobs? I found that you cat all three output heads (class / box / landmark) together, each one individually, so do I need to do the same? In the python code we can see:

```python
class ClassHead(nn.Module):
    def __init__(self, inchannels=512, num_anchors=3):
        super(ClassHead, self).__init__()
        self.num_anchors = num_anchors
        self.conv1x1 = nn.Conv2d(inchannels, self.num_anchors*2, kernel_size=(1,1), stride=1, padding=0)

    def forward(self, x):
        out = self.conv1x1(x)
        out = out.permute(0, 2, 3, 1).contiguous()
        return out.view(out.shape[0], -1, 2)

class BboxHead(nn.Module):
    def __init__(self, inchannels=512, num_anchors=3):
        super(BboxHead, self).__init__()
        self.conv1x1 = nn.Conv2d(inchannels, num_anchors*4, kernel_size=(1,1), stride=1, padding=0)

    def forward(self, x):
        out = self.conv1x1(x)
        out = out.permute(0, 2, 3, 1).contiguous()
        return out.view(out.shape[0], -1, 4)

class LandmarkHead(nn.Module):
    def __init__(self, inchannels=512, num_anchors=3):
        super(LandmarkHead, self).__init__()
        self.conv1x1 = nn.Conv2d(inchannels, num_anchors*10, kernel_size=(1,1), stride=1, padding=0)

    def forward(self, x):
        out = self.conv1x1(x)
        out = out.permute(0, 2, 3, 1).contiguous()
        return out.view(out.shape[0], -1, 10)

# ........
def forward(self, inputs):
    out = self.body(inputs)

    # FPN
    fpn = self.fpn(out)

    # SSH
    feature1 = self.ssh1(fpn[0])
    feature2 = self.ssh2(fpn[1])
    feature3 = self.ssh3(fpn[2])
    features = [feature1, feature2, feature3]

    bbox_regressions = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=1)
    classifications = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=1)
    ldm_regressions = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=1)

    if self.phase == 'train':
        output = (bbox_regressions, classifications, ldm_regressions)
    else:
        output = (bbox_regressions, F.softmax(classifications, dim=-1), ldm_regressions)
    return output
```

1. What's the main purpose of the permute and view ops in LandmarkHead and BboxHead (`out = out.permute(0,2,3,1).contiguous(); return out.view(out.shape[0], -1, 10)`)? Do I need to do this, and what's the corresponding op in caffe?
2. Do I have to cat the outputs together in caffe as well, like `torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=1)`, as in the picture below?

[image]
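A minimal sanity check of what that permute/view pair does (toy shapes below, not from the repo): the permute converts NCHW to NHWC so that, after the view, each row holds one anchor's predictions at one cell, in the same row → col → anchor order the priors are generated in. Without it the flattened values would interleave channels and misalign with the anchors.

```python
import torch

# Toy shapes for illustration: 2 anchors x 4 box values per cell on a 3x4 map.
N, A, H, W = 1, 2, 3, 4
out = torch.arange(N * A * 4 * H * W, dtype=torch.float32).view(N, A * 4, H, W)

# NCHW -> NHWC, then flatten to [N, H*W*A, 4]: each row is one anchor's
# 4 box values, ordered row -> col -> anchor, matching the prior order.
flat = out.permute(0, 2, 3, 1).contiguous().view(N, -1, 4)

assert torch.equal(flat[0, 0], out[0, 0:4, 0, 0])  # anchor 0 at cell (0, 0)
assert torch.equal(flat[0, 1], out[0, 4:8, 0, 0])  # anchor 1 at cell (0, 0)
```

In SSD-style caffe forks this reordering is usually expressed with Permute (order 0,2,3,1) and Flatten layers, and the per-scale outputs are joined with a Concat layer along axis 1; whether you concatenate in caffe or decode each scale separately, the traversal order must match the anchor generation order.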