dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

Backbone for yolact++ #335

Open Rm1n90 opened 4 years ago

Rm1n90 commented 4 years ago

Hey @dbolya, great work! I would like to implement a new backbone for yolact++. I have looked at your ResNetBackbone (and some issues too) to get an idea of how to implement a new backbone, and I have some questions about it. I would appreciate it if you could answer them.

Thanks!

dbolya commented 4 years ago
  1. Your new backbone doesn't need DCN, just remember to change the backbone in the yolact++ config.
  2. init_backbone takes in the ImageNet-pretrained weights file and loads it into the backbone. If you're training from scratch (though I don't recommend it), you can just leave the function blank.

To write this function, print out the keys for your backbone implementation:

for k in self.state_dict():
    print(k)

and cross-reference that with your pretrained weight file:

import torch

state_dict = torch.load(path)  # path points to the pretrained weight file
for k in state_dict:
    print(k)

If the two lists of keys match perfectly, you don't need to worry about anything and you can just call self.load_state_dict(state_dict) normally.

If there's some difference in keys, you need to have your loaded state_dict match the keys in self.state_dict(), so that's what transform_keys and the other init_backbone functions are doing. So just make sure they match and you're good.
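The key comparison described above can be sketched without any framework code. This is only an illustration, not the repository's transform_keys: compare_keys, rename_keys, and the "module." prefix are hypothetical names chosen for the example, and plain dicts stand in for the mappings returned by self.state_dict() and torch.load(path).

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable printing."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model expects these, file lacks them
    unexpected = sorted(checkpoint_keys - model_keys)   # file has these, model does not
    return missing, unexpected


def rename_keys(state_dict, rename):
    """Return a copy of state_dict with every key passed through rename()."""
    return {rename(k): v for k, v in state_dict.items()}


def strip_prefix(key, prefix="module."):
    """Hypothetical renaming rule: drop a leading prefix if it is present."""
    return key[len(prefix):] if key.startswith(prefix) else key


# Hypothetical data: the checkpoint prefixes every key with "module.".
model_keys = ["conv1.weight", "bn1.weight"]
checkpoint = {"module.conv1.weight": 0, "module.bn1.weight": 1}

missing, unexpected = compare_keys(model_keys, checkpoint)
print("missing:", missing)
print("unexpected:", unexpected)

fixed = rename_keys(checkpoint, strip_prefix)
print(compare_keys(model_keys, fixed))  # ([], []) once the keys line up
```

Once both lists come back empty, the renamed dict should be safe to pass to self.load_state_dict.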

Auth0rM0rgan commented 4 years ago

@dbolya and @Rm1n90, sorry for jumping into your conversation! I'm implementing a new backbone for yolact++ too and have some questions about the hyperparameters that need to be tuned for a new backbone. What values should selected_layers, pred_scales, and pred_aspect_ratios have in the new backbone to get the best performance, and would you please explain the intuition behind them? Also, are there any other values that need to be set for a new backbone?

Thanks a lot!

dbolya commented 4 years ago

@Auth0rM0rgan The most important parameter would be selected_layers. The rest, you should probably keep the same (unless you want to tune these yourself).

selected_layers determines which backbone layers we should add prediction heads to. Its indices refer to what you output during the forward pass. So for instance, if your forward pass looks like:

def forward(self, x):
    a = self.conv1(x)
    b = self.conv2(a)
    c = self.conv3(b)
    return (a, b, c)

A selected_layers of [1, 2] would add prediction heads to b and c in the forward function.

When not using FPN (i.e., SSD mode), if you specify an index larger than 2 in this case, the backbone will add more layers to compensate using the add_layer function. If you're using FPN, which I assume you are, then you can just ignore this function.

With FPN, you add extra layers not by selecting them directly but by setting the fpn.num_downsample parameter to the number of layers you want to add.

As for what parameters you should use, both YOLACT and YOLACT++ use 5 prediction heads, so if you don't want to do any extra tuning you should select 5 layers. Both versions select 3 backbone layers and add 2 downsample layers. If you want to be similar to that, you can just select the last 3 "blocks" of your backbone (if there's any stride 2 convolution, output the activations right before that stride), and add 2 FPN layers like in the current configs.
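As a rough illustration of how selected_layers and fpn.num_downsample combine into the list of prediction-head sources, here is a small sketch. It is not the repository's code: prediction_sources, the extra_N labels, and the four-output backbone with outputs named conv2 to conv5 are all assumptions made for the example.

```python
def prediction_sources(backbone_outputs, selected_layers, num_downsample=0):
    """List the feature maps that would receive a prediction head.

    backbone_outputs: tuple returned by the backbone's forward pass
    selected_layers:  indices into that tuple
    num_downsample:   number of extra layers added after the last selected one
    """
    selected = [backbone_outputs[i] for i in selected_layers]
    extra = ["extra_%d" % (i + 1) for i in range(num_downsample)]  # placeholder labels
    return selected + extra


# The forward pass from the thread returns (a, b, c); [1, 2] picks b and c.
print(prediction_sources(("a", "b", "c"), [1, 2]))

# Hypothetical backbone with four outputs: select the last three blocks
# and add two downsample layers, giving five prediction heads in total.
sources = prediction_sources(("conv2", "conv3", "conv4", "conv5"), [1, 2, 3], num_downsample=2)
print(sources)
print(len(sources))
```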

Alternatively (and preferably), you can try to match the scales of your layers with the layers of ResNet. For reference, for a 550x550 input image, the layers we select have these resolutions:

conv3 (P3): 1/  8 the image size (69x69)
conv4 (P4): 1/ 16 the image size (35x35)
conv5 (P5): 1/ 32 the image size (18x18)
  .   (P6): 1/ 64 the image size ( 9x 9)
  .   (P7): 1/128 the image size ( 5x 5)

Just match your selected layers with those dimensions and if you need extra layers like with P6 and P7, add them with fpn.num_downsample.
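To check your own backbone against those dimensions, the sizes can be reproduced with a little arithmetic. The sketch below assumes every stride-2 stage rounds the spatial size up (which is what a padded stride-2 convolution such as kernel 3, padding 1 does); feature_sizes is a hypothetical helper, not part of the repository.

```python
def feature_sizes(image_size, num_halvings):
    """Spatial size after each successive stride-2 stage, rounding up."""
    sizes = []
    size = image_size
    for _ in range(num_halvings):
        size = (size + 1) // 2  # integer ceil(size / 2)
        sizes.append(size)
    return sizes


sizes = feature_sizes(550, 7)  # strides 2, 4, 8, 16, 32, 64, 128
for name, exponent in zip(("P3", "P4", "P5", "P6", "P7"), range(3, 8)):
    side = sizes[exponent - 1]
    print("%s: 1/%d of the image size (%dx%d)" % (name, 2 ** exponent, side, side))
```

If your backbone's selected outputs for a 550x550 input have the same side lengths as P3 to P5, the remaining two sizes are the ones the extra downsample layers would produce under the same rounding assumption.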

By default, we also put protonet on the first of these selected layers (in this case the 69x69 P3), but you can change that too with mask_proto_src.


As an aside, I was about to say you can look at the documentation in config.py for the rest of the parameters, but it looks like I forgot to write documentation for backbone parameters...

Auth0rM0rgan commented 4 years ago

@dbolya Thanks for all the information!

ahkarami commented 4 years ago

Dear @dbolya, would you please add the MobileNet backbone to the pre-trained models?

dbolya commented 4 years ago

@ahkarami It's a good thing to have, so I'll add it to my TODO list (along with efficientnet).

Auth0rM0rgan commented 4 years ago

@dbolya, would you please add the Vovnet and Vovnet2 backbones? It seems these backbones are better than MobileNet, EfficientNet or even ResNet.

elfpattern commented 4 years ago

@dbolya, I have replaced resnet with hrnet18 in yolact, but it does not improve the results, emmmm

dbolya commented 4 years ago

@Auth0rM0rgan I'll add it to the list but this is starting to be too much now.

@elfpattern ¯\_(ツ)_/¯

elfpattern commented 4 years ago

@dbolya, on my dataset I ran comparison experiments between mmdetection and yolact. With mmdetection, the bbox mAP of hrnet and resnet50 is 78 and 73; with yolact, the bbox mAP of hrnet and resnet50 is 74 and 75, emmm...

abhigoku10 commented 4 years ago

@elfpattern hey, did you check with yolact++ hrnet? Theoretically you should get a better mAP compared to yolact hrnet. Good that you were able to integrate hrnet. If possible, can you share it?

elfpattern commented 4 years ago

@abhigoku10 ok, I will check some small errors and share it

abhigoku10 commented 4 years ago

@elfpattern thanks for the response and thanks for sharing

elfpattern commented 4 years ago

@abhigoku10, please give me your email and I will send it to you. Right now the mAP of hrnet18 is lower than resnet50, but you can try it.

Auth0rM0rgan commented 4 years ago

Hey @elfpattern, would you please share it here so everyone can use it? Thanks!

elfpattern commented 4 years ago

@Auth0rM0rgan ok.

abhigoku10 commented 4 years ago

@elfpattern abhigoku10@gmail.com is the mail id, thanks for sharing!!

Rm1n90 commented 4 years ago

What about the FPS compared to ResNet50? Thanks a lot!

elfpattern commented 4 years ago

url: https://github.com/elfpattern/yolact
run: python3 train.py --config=yolact_resnet18_config
pretrained model: search repo: mmdetection

abd-gang commented 1 year ago

Hi @elfpattern ,

Have you also tried with resnet18? In the command above you mention --config=yolact_resnet18_config, but I can see that your config.py does not contain a resnet18 configuration. Also, please let me know what mAP and FPS you were able to get.

Thanks