facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

How to use DETR with another backbone (MobileNet, ShuffleNet)? #154

Open anhtt20172948 opened 4 years ago

lessw2020 commented 4 years ago

You would need to modify backbone.py (under detr/models) and insert the backbone you are after. Beyond just loading the desired backbone, you'd need to freeze the backbone's BatchNorm (BN) layers. I tried with ResNeSt and didn't get great results, but it's not clear that I froze the BN properly.

Here's some starter code to give an idea:

```python
# Goes in detr/models/backbone.py and relies on the names already imported or
# defined there (torch, nn, torchvision, BackboneBase, FrozenBatchNorm2d, is_main_process).
class Backbone(BackboneBase):
    """ResNet backbone with frozen BatchNorm."""
    def __init__(self, name: str,
                 train_backbone: bool,
                 return_interm_layers: bool,
                 dilation: bool):

        freeze_bn_affine = True

        # Load ResNeSt-50 from torch.hub instead of a torchvision ResNet.
        backbone = torch.hub.load('zhanghang1989/ResNeSt', 'resnest50', pretrained=True)

        # freeze bn (disabled as posted: the loop sits inside a string literal)
        '''for m in backbone.modules():
            if isinstance(m, nn.BatchNorm2d):
                m.eval()
                if freeze_bn_affine:
                    m.weight.requires_grad = False
                    m.bias.requires_grad = False
        '''

        # print(f"--> ** Nest50 Backbone loaded.  Todo - make this a param...")
        # backbone = getattr(torchvision.models, name)(
        #     replace_stride_with_dilation=[False, False, dilation],
        #     pretrained=is_main_process(), norm_layer=FrozenBatchNorm2d)
        #156
        num_channels = 512 if name in ('resnet18', 'resnet34') else 2048
        super().__init__(backbone, train_backbone, num_channels, return_interm_layers)
```

anhtt20172948 commented 4 years ago

Thanks for your reply. I'm training DETR with a mobilenet_v2 backbone. This is the code I added to backbone.py. Do you have any other ideas?

```python
class Mobilenet_BackboneBase(nn.Module):
    def __init__(self, backbone: nn.Module, train_backbone: bool,
                 num_channels: int, return_interm_layers: bool):
        super().__init__()
        # Return the output of the model's `features` child under the key "18".
        return_layers = {'features': "18"}
        self.body = IntermediateLayerGetter(backbone, return_layers=return_layers)
        self.num_channels = num_channels

    def forward(self, tensor_list: NestedTensor):
        xs = self.body(tensor_list.tensors)
        out: Dict[str, NestedTensor] = {}
        for name, x in xs.items():
            m = tensor_list.mask
            assert m is not None
            # Downsample the padding mask to the feature-map resolution.
            mask = F.interpolate(m[None].float(), size=x.shape[-2:]).to(torch.bool)[0]
            out[name] = NestedTensor(x, mask)
        return out


class Mobilenet_backbone(Mobilenet_BackboneBase):
    def __init__(self, name: str, train_backbone: bool,
                 return_interm_layers: bool, dilation: bool):
        backbone = getattr(torchvision.models, name)(pretrained=is_main_process())
        num_channels = 1280  # channels of mobilenet_v2's last feature map
        super().__init__(backbone, train_backbone, num_channels, return_interm_layers)
```

zhiqwang commented 4 years ago

I wrote a similar method to get the intermediate layer of torchvision's mobilenet_v2. According to the IntermediateLayerGetter implementation, it's important to set `backbone = mobilenet_v2(pretrained=True).features` here; otherwise `model.features[18]` cannot be returned. Then set `return_layers = {"18": "0"}`.

And my implementation is here.

alcinos commented 4 years ago

Note that getting intermediate layers is not required if you don't care about panoptic segmentation. If you only want to do detection, you can return just the final output of the backbone (in a dictionary).
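
A rough sketch of what that could look like, under stated assumptions: the class name `SingleOutputBackbone` is hypothetical, and to stay self-contained it returns plain `(features, mask)` tuples rather than DETR's `NestedTensor`. The mask resizing mirrors the `forward()` posted earlier in this thread.

```python
from typing import Dict, Tuple

import torch
import torch.nn.functional as F
from torch import nn


class SingleOutputBackbone(nn.Module):
    """Hypothetical wrapper: run any feature extractor and return only its final map."""

    def __init__(self, body: nn.Module, num_channels: int):
        super().__init__()
        self.body = body
        self.num_channels = num_channels  # channel count of the body's output

    def forward(self, tensors: torch.Tensor,
                mask: torch.Tensor) -> Dict[str, Tuple[torch.Tensor, torch.Tensor]]:
        x = self.body(tensors)
        # Resize the (batch, H, W) padding mask to the feature-map resolution.
        small_mask = F.interpolate(mask[None].float(), size=x.shape[-2:]).to(torch.bool)[0]
        # A single entry in the dictionary: no intermediate layers are collected.
        return {"0": (x, small_mask)}


# Tiny stand-in body; a real setup would pass e.g. a MobileNet feature extractor.
toy_body = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1),
)
backbone = SingleOutputBackbone(toy_body, num_channels=16)
images = torch.randn(2, 3, 32, 32)
padding_mask = torch.zeros(2, 32, 32, dtype=torch.bool)
result = backbone(images, padding_mask)
```

In DETR itself the tuple would be replaced by `NestedTensor(x, mask)`, and `num_channels` is the value the rest of the model reads to size its input projection.
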

Hezey commented 1 year ago

Have you successfully trained with MobileNet-v2? I'm having some problems...

XcloudFance commented 1 year ago

I am also trying to use DETR with CenterNet as the backbone. Although it runs successfully, I still have some problems with IntermediateLayerGetter. Do I need to map all of the layers or just a few of them, such as the decoder part of the net?

Anyway, training is not converging as I expected. It still needs more adjustment.

AAArpan commented 1 year ago

@alcinos @zhiqwang Can you please tell me what the input size to the transformer should be after the backbone? Since I am changing the backbone, do I have to make sure it matches what the ResNet backbone produces?

XLBL2333 commented 8 months ago

A freshman is trying to train DETR with his own backbone, but he doesn't know how to modify the backbone of DETR. Could anyone help him, please?