biubug6 / Pytorch_Retinaface

RetinaFace gets 80.99% on the WIDER FACE hard validation set using MobileNet0.25.
MIT License

change backbone #8

Closed mehrazi closed 5 years ago

mehrazi commented 5 years ago

@biubug6 Hi, thanks for sharing your project. Can you provide a short wiki on changing the backbone? How can I change the backbone and train?

xsacha commented 5 years ago

It's quite easy! See retinaface.py, where the backbone is set.

class RetinaFace(nn.Module):
    def __init__(self, phase = 'train', net = 'mnet0.25', return_layers = {'stage1': 1, 'stage2': 2, 'stage3': 3}):
        super(RetinaFace, self).__init__()
        self.phase = phase
        self.backbone = None
        if net == 'mnet0.25':
            self.backbone = MobileNetV1()
            if True:  # always load the pretrained backbone weights
                checkpoint = torch.load("model_best.pth.tar", map_location=torch.device('cpu'))
                from collections import OrderedDict
                new_state_dict = OrderedDict()
                for k, v in checkpoint['state_dict'].items():
                    name = k[7:]  # strip the 'module.' prefix added by DataParallel
                    new_state_dict[name] = v
                # load params
                self.backbone.load_state_dict(new_state_dict)

Change that to a different torch model. Example:

        elif net == 'detnas':
            self.backbone = ShuffleNetV2DetNAS(model_size='VOC_RetinaNet_300M')
            checkpoint = torch.load("VOC_RetinaNet_300M.pkl", map_location=torch.device('cpu'))
            self.backbone.load_state_dict(checkpoint)
            return_layers = {'6': 1, '9': 2, '16': 3}

The return layers just need to be consecutive layers from the model where the channel count doubles (e.g. 32 -> 64 -> 128). Then set in_channels_stage2 to half of the first return layer's output channels.
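As a concrete sketch of that rule (the numbers are illustrative, assuming in_channels_list is built by doubling in_channels_stage2 for each successive return layer, as in this repo): if your three return layers output 64, 128 and 256 channels, then in_channels_stage2 should be 32.

# Illustrative only: return layers output 64 -> 128 -> 256 channels,
# so in_channels_stage2 is half of the first one.
in_channels_stage2 = 32
in_channels_list = [
    in_channels_stage2 * 2,   # 64,  channels of the first return layer
    in_channels_stage2 * 4,   # 128, channels of the second return layer
    in_channels_stage2 * 8,   # 256, channels of the third return layer
]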

If the model requires a different number of input channels than 3 (RGB), you will need to convert to that number of channels before running your backbone. Add this to your RetinaFace class: self.lin = nn.Conv2d(3, 16, kernel_size=(3,3), stride=(4,4), padding=1). Then call it before your backbone like this: out = self.body(self.lin(inputs)).
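Pulling those two lines into one place (a sketch only; the 16 output channels and stride of 4 come straight from the snippet above and should be adjusted to whatever your backbone actually expects):

# In __init__: adapter that maps 3-channel RGB input to the backbone's expected input channels.
self.lin = nn.Conv2d(3, 16, kernel_size=(3, 3), stride=(4, 4), padding=1)

# In forward(): run the adapter before the backbone body.
out = self.body(self.lin(inputs))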

mehrazi commented 4 years ago

@xsacha Hi, I want to use EfficientNet as a backbone but I'm confused about what I should do. The EfficientNet repo: https://github.com/lukemelas/EfficientNet-PyTorch. Do you have any pointers?

xsacha commented 4 years ago

I tried this, but there is a big issue with all the EfficientNet backbones I attempted. One of the operations (Swish?) isn't very efficient in PyTorch: it used a lot of memory and wasn't any faster. The one you linked is the one I tested.

You'll notice the author has a version that works better but it cannot be used in the JIT (which is the only way to get a fast PyTorch model).
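For reference, the memory-friendly variant being alluded to is, as far as I understand it, a custom autograd Function that recomputes the sigmoid in the backward pass instead of keeping intermediate activations alive. A rough sketch of that idea (not the exact code from either repo) looks like this; custom Functions of this kind are exactly what TorchScript cannot script or trace.

import torch
import torch.nn as nn

class SwishFunction(torch.autograd.Function):
    """Swish (x * sigmoid(x)) that only stores the input for backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)          # keep just the input, not the sigmoid output
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        sig = torch.sigmoid(x)            # recompute instead of storing
        # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
        return grad_output * (sig * (1 + x * (1 - sig)))

class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        # Custom autograd Functions like this are what the TorchScript JIT chokes on.
        return SwishFunction.apply(x)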

mehrazi commented 4 years ago

@xsacha thanks for the quick reply. For some reason I have to try this anyway, and I'm running into some issues. Can you guide me through training with it?

quocnhat commented 4 years ago

Hi @xsacha, for MobileNetV3 the channel sizes of consecutive layers are 16, 24, 40, 48, ..., which are not double the previous layer. So how should we pick the return layers in that case? If I modify the net it works, but with low accuracy.
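One possible workaround, sketched under the assumption that the FPN in this repo is constructed from an explicit in_channels_list: when the widths don't double, skip the in_channels_stage2 multiplication and list the real output widths of the chosen return layers directly. The layer names and channel counts below are illustrative only, not real MobileNetV3 values.

# Illustrative only: hypothetical block indices and their actual output widths.
return_layers = {'4': 1, '9': 2, '15': 3}
in_channels_list = [40, 112, 160]   # real channels of those blocks, not a *2/*4/*8 pattern

# Assumes the repo's existing imports (torchvision.models._utils as _utils) and FPN class.
self.body = _utils.IntermediateLayerGetter(self.backbone, return_layers)
self.fpn = FPN(in_channels_list, out_channels=64)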