Different implementation of PredictionModule

anshkumar commented 2 years ago

Hi, I was going through your code and found that you've used len(cfg.aspect_ratios) as a multiplication factor for bbox_layer, conf_layer and coef_layer:

        self.bbox_layer = nn.Conv2d(256, len(cfg.aspect_ratios) * 4, kernel_size=3, padding=1)
        self.conf_layer = nn.Conv2d(256, len(cfg.aspect_ratios) * self.num_classes, kernel_size=3, padding=1)
        self.coef_layer = nn.Sequential(nn.Conv2d(256, len(cfg.aspect_ratios) * self.coef_dim,
                                                  kernel_size=3, padding=1),
                                        nn.Tanh())

But in the official implementation they are using self.num_priors as a multiplication factor:

            self.bbox_layer = nn.Conv2d(out_channels, self.num_priors * 4,                **cfg.head_layer_params)
            self.conf_layer = nn.Conv2d(out_channels, self.num_priors * self.num_classes, **cfg.head_layer_params)
            self.mask_layer = nn.Conv2d(out_channels, self.num_priors * self.mask_dim,    **cfg.head_layer_params)

Why the difference?

feiyuhuahuo commented 2 years ago

len(cfg.aspect_ratios) == 3. At each output scale there are 3 anchors per point, one for each aspect ratio, so the factor is the number of anchors per point. This is the original Yolact configuration.
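A minimal sketch of the channel arithmetic behind this answer, in plain Python. It assumes the number of priors per feature-map point is the number of aspect ratios times the number of scales per layer, with one scale per layer. The helper names, the aspect-ratio values, the scale value, `num_classes=81` and `coef_dim=32` are illustrative assumptions, not values taken from either repository.

```python
def priors_per_point(aspect_ratios, scales):
    """Anchors predicted at each feature-map point (assumed: ratios x scales)."""
    return len(aspect_ratios) * len(scales)


def head_out_channels(num_priors, num_classes, coef_dim):
    """Output channels of the three prediction heads for one feature map."""
    return {
        "bbox": num_priors * 4,            # 4 box offsets per anchor
        "conf": num_priors * num_classes,  # one score per class per anchor
        "coef": num_priors * coef_dim,     # mask coefficients per anchor
    }


# Hypothetical configuration: 3 aspect ratios, a single scale for this layer.
aspect_ratios = [1, 0.5, 2]
scales = [24]

num_priors = priors_per_point(aspect_ratios, scales)

# With one scale per layer, both multiplication factors give the same channels.
channels_from_priors = head_out_channels(num_priors, num_classes=81, coef_dim=32)
channels_from_ratios = head_out_channels(len(aspect_ratios), num_classes=81, coef_dim=32)

print(num_priors)            # 3
print(channels_from_priors)  # {'bbox': 12, 'conf': 243, 'coef': 96}
```

Under these assumptions the two factors coincide. They would differ only if a layer used more than one scale, in which case `len(aspect_ratios)` alone would undercount the anchors.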

anshkumar commented 2 years ago

@feiyuhuahuo Thanks for the reply. I am also writing a minimal Yolact implementation, but in TensorFlow. I tried to match the original code as closely as possible, but I still can't reach the mAP of the official implementation. With ResNet50 I'm getting 25.6 mAP. Would you be able to give some feedback or review my code? Link: https://github.com/anshkumar/yolact

feiyuhuahuo / Yolact_minimal

Different implementation of PredictionModule #69