Rm1n90 opened 4 years ago
`init_backbone` takes in the ImageNet-pretrained weights file and loads it into the backbone. If you're training from scratch [though I don't recommend it], you can just leave the function blank. To write this function, print out the keys for your backbone implementation:

```python
for k in self.state_dict():
    print(k)
```
and cross-reference that with your pretrained weight file:

```python
state_dict = torch.load(path)
for k in state_dict:
    print(k)
```
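A quick way to cross-reference the two key lists is a set difference. This is a minimal sketch; the key names below are hypothetical stand-ins for whatever your two loops print:

```python
def compare_keys(model_keys, file_keys):
    """Report keys that differ between self.state_dict() and the loaded file."""
    missing = sorted(set(model_keys) - set(file_keys))     # expected by the model, absent from the file
    unexpected = sorted(set(file_keys) - set(model_keys))  # in the file, unknown to the model
    return missing, unexpected

# Hypothetical key names standing in for the printed lists:
model_keys = ["conv1.weight", "bn1.weight", "layers.0.conv.weight"]
file_keys = ["conv1.weight", "bn1.weight", "features.0.weight"]

missing, unexpected = compare_keys(model_keys, file_keys)
print(missing)     # ['layers.0.conv.weight']
print(unexpected)  # ['features.0.weight']
```

If both lists come back empty, the keys already match.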
If the two lists of keys match perfectly, you don't need to worry about anything and you can just call `self.load_state_dict(state_dict)` normally.

If there's some difference in keys, you need to have your loaded `state_dict` match the keys in `self.state_dict()`, so that's what `transform_keys` and the other `init_backbone` functions are doing. So just make sure they match and you're good.
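As a sketch of what that key transformation can look like (the prefix mapping below is hypothetical; yolact's own `transform_keys` does its own renaming, so use the two key lists you printed to decide the mapping for your backbone):

```python
def remap_keys(state_dict, prefix_map):
    """Rename keys in a loaded state_dict so they match self.state_dict().

    prefix_map maps a prefix used in the weight file to the prefix the
    backbone uses; keys with no matching prefix pass through unchanged.
    """
    out = {}
    for k, v in state_dict.items():
        for old, new in prefix_map.items():
            if k.startswith(old):
                k = new + k[len(old):]
                break
        out[k] = v
    return out

# Hypothetical example: the file uses 'features.*', the backbone uses 'layers.*'.
renamed = remap_keys(
    {"features.0.weight": 1, "classifier.weight": 2},
    {"features.": "layers."},
)
print(renamed)  # {'layers.0.weight': 1, 'classifier.weight': 2}
```

After remapping, `self.load_state_dict(renamed)` should line up (PyTorch's `strict=False` flag helps if you intentionally drop keys such as a classifier head).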
@dbolya and @Rm1n90, Sorry for jumping into your conversation!
I'm implementing a new backbone for yolact++ too and have some questions about the hyperparameters that need to be tuned for a new backbone. What should the values of `selected_layers`, `pred_scales`, and `pred_aspect_ratios` be in a new backbone to get the best performance, and would you please explain the intuition behind them? Also, are there any other values that need to be set for a new backbone?
Thanks a lot!
@Auth0rM0rgan The most important parameter would be `selected_layers`. The rest you should probably keep the same (unless you want to tune those yourself).

`selected_layers` determines which backbone layers we should add prediction heads to. Its indices are determined by what you output during the forward pass. So for instance, if your forward pass looks like:
```python
def forward(self, x):
    a = self.conv1(x)
    b = self.conv2(a)
    c = self.conv3(b)
    return (a, b, c)
```
a `selected_layers` of `[1, 2]` would add prediction heads to `b` and `c` in the forward function.
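Concretely, `selected_layers` just indexes into that returned tuple; a minimal sketch with placeholder values:

```python
# Placeholder strings standing in for the feature maps (a, b, c) above:
outs = ("a", "b", "c")
selected_layers = [1, 2]

# The outputs that would get prediction heads:
selected = [outs[i] for i in selected_layers]
print(selected)  # ['b', 'c']
```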
When not using FPN (i.e., SSD mode), if you specify an index larger than 2 in this case, the backbone will add more layers to compensate using the `add_layer` function. If you're using FPN, which I assume you are, then you can just ignore this function.
Then the way you add extra layers with FPN is, instead of directly selecting them, you set the `fpn.num_downsample` parameter to the number of layers you want to add.
As for what parameters you should use, both YOLACT and YOLACT++ use 5 prediction heads, so if you don't want to do any extra tuning you should select 5 layers. Both versions select 3 backbone layers and add 2 downsample layers. If you want to be similar to that, you can just select the last 3 "blocks" of your backbone (if there's any stride 2 convolution, output the activations right before that stride), and add 2 FPN layers like in the current configs.
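The default setup described above (3 selected backbone layers plus 2 FPN downsample layers, for 5 prediction heads) can be sketched as config fields. This is a plain dict for illustration only; the real `config.py` wraps these in its own `Config` objects:

```python
# Sketch of the backbone-related fields discussed above (plain dict for
# illustration; yolact's config.py uses Config objects instead).
my_backbone_cfg = {
    # Indices into the tuple the backbone's forward() returns.
    "selected_layers": [1, 2, 3],   # 3 backbone layers get prediction heads
    "fpn": {
        "num_downsample": 2,        # 2 extra layers added by the FPN
    },
}

num_heads = (len(my_backbone_cfg["selected_layers"])
             + my_backbone_cfg["fpn"]["num_downsample"])
print(num_heads)  # 5 prediction heads, like YOLACT and YOLACT++
```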
Alternatively (and preferably), you can try to match the scales of your layers with the layers of ResNet. For reference, the layers we select have the following resolutions for images of size 550x550:

```
conv3 (P3): 1/8   the image size (69x69)
conv4 (P4): 1/16  the image size (35x35)
conv5 (P5): 1/32  the image size (18x18)
      (P6): 1/64  the image size ( 9x 9)
      (P7): 1/128 the image size ( 5x 5)
```
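Those sizes are just repeated halving of the 550x550 input, i.e. ceiling division by the stride (a stride-2 conv with padding rounds up). A quick check:

```python
import math

# Stride of each pyramid level relative to the 550x550 input image:
strides = {"P3": 8, "P4": 16, "P5": 32, "P6": 64, "P7": 128}
sizes = {name: math.ceil(550 / s) for name, s in strides.items()}
print(sizes)  # {'P3': 69, 'P4': 35, 'P5': 18, 'P6': 9, 'P7': 5}
```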
Just match your selected layers with those dimensions, and if you need extra layers like with P6 and P7, add them with `fpn.num_downsample`.
Also by default, we put protonet on the first of these selected layers (in this case the 69x69 P3), but you can change that too with `mask_proto_src`.
As an aside, I was about to say you can look at the documentation in `config.py` for the rest of the parameters, but it looks like I forgot to write documentation for the backbone parameters...
@dbolya Thanks for all the information!
Dear @dbolya, would you please add the MobileNet backbone to the pre-trained models?
@ahkarami It's a good thing to have, so I'll add it to my TODO list (along with efficientnet).
@dbolya, Would you please add the VoVNet and VoVNetV2 backbones? It seems these backbones are better than MobileNet, EfficientNet, or even ResNet.
@dbolya, I have swapped hrnet18 in for resnet in yolact, but it does not improve, emmmm
@Auth0rM0rgan I'll add it to the list but this is starting to be too much now.
@elfpattern ¯\_(ツ)_/¯
@dbolya, On my dataset, I have done comparison experiments between mmdetection and yolact. With mmdetection, the bbox mAPs of hrnet and resnet50 are 78 and 73; with yolact, they are 74 and 75, emmm...
@elfpattern Hey, did you check yolact++ with hrnet? Theoretically you should get better mAP compared to yolact with hrnet. Good that you were able to integrate hrnet; if possible, can you share it?
@abhigoku10 OK, I will check for some small errors and share it.
@elfpattern Thanks for the response and thanks for sharing.
@abhigoku10, please give me your email and I will send it to you. Right now the mAP of hrnet18 is lower than resnet50's; you can try it.
Hey @elfpattern, would you please share it here so everyone can use it? Thanks!
@Auth0rM0rgan ok.
@elfpattern abhigoku10@gmail.com is the mail id, thanks for sharing!!
What about the FPS compared to ResNet50? Thanks a lot!
URL: https://github.com/elfpattern/yolact
Run: `python3 train.py --config=yolact_resnet18_config`
Pretrained model: search the mmdetection repo
Hi @elfpattern ,
Have you also tried resnet18? In the above command you mentioned `--config=yolact_resnet18_config`, but I can see that your config.py does not contain a resnet18 configuration. Also, please let me know what mAP and FPS you were able to get.
Thanks
Hey @dbolya, Great work! I would like to implement a new backbone for yolact++. I have looked at your ResNetBackbone (and some issues too) to get an idea of how to implement a new backbone. I have some questions regarding implementing a backbone for yolact++. I would appreciate it if you could answer them.
Thanks!