feiyuhuahuo / Yolact_minimal

Minimal PyTorch implementation of YOLACT.

Converting Yolact to anchor free detector #54

Open abhigoku10 opened 2 years ago

abhigoku10 commented 2 years ago

@feiyuhuahuo hi, I am trying to convert YOLACT to an anchor-free detector, i.e. FCOS. I have made the required code modifications and am able to train the model, but I have the following issues with the current implementation:

  1. In train_aug() you are calling to_01_box(img.shape[:2], boxes). Is this function still required for an anchor-free approach?

  2. When I train a model on a small dataset I get the following loss curve. Is the curve shaped like this because of the limited data, or is there an error in the implementation? (see attached image)

  3. When I run evaluation on the validation set, i.e. evaluate(net.module, cfg, step), I get the following error: AttributeError: 'Yolact' object has no attribute 'module'

Please share your thoughts. Thanks in advance.

feiyuhuahuo commented 2 years ago
  1. I can't tell if it is needed. This function scales the box coordinates to the 0~1 range. It depends on how you organize the box prediction.
  2. I also can't tell whether the loss curve indicates any error.
  3. This error usually comes from the DDP wrapping. If a model is wrapped by DistributedDataParallel, the keys of its state_dict are prefixed with 'module.'.
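On point 3, a common way to sidestep the 'Yolact' object has no attribute 'module' error is to unwrap the model only when it is actually wrapped. A minimal sketch (unwrap_model is a hypothetical helper, not something that exists in this repo):

```python
import torch.nn as nn

def unwrap_model(net):
    # DataParallel / DistributedDataParallel store the real model under .module;
    # a plain Yolact instance has no such attribute.
    if isinstance(net, (nn.DataParallel, nn.parallel.DistributedDataParallel)):
        return net.module
    return net

# evaluate(unwrap_model(net), cfg, step)
```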
abhigoku10 commented 2 years ago

@feiyuhuahuo thanks for the response. I double-checked the theory of the anchor-free approach (FCOS) and found the following differences:

Yolact -- backbone ResNet-50:
x.shape torch.Size([1, 3, 544, 544])
outs[0].shape torch.Size([1, 256, 136, 136])
outs[1].shape torch.Size([1, 512, 68, 68])
outs[2].shape torch.Size([1, 1024, 34, 34])
outs[3].shape torch.Size([1, 2048, 17, 17])

FCOS -- backbone ResNet-50:
x.shape torch.Size([1, 3, 1088, 832])
C3.shape torch.Size([1, 512, 136, 104])
C4.shape torch.Size([1, 1024, 68, 52])
C5.shape torch.Size([1, 2048, 34, 26])

Yolact -- FPN:
outs[0].shape torch.Size([1, 256, 68, 68])
outs[1].shape torch.Size([1, 256, 34, 34])
outs[2].shape torch.Size([1, 256, 17, 17])
outs[3].shape torch.Size([1, 256, 9, 9])
outs[4].shape torch.Size([1, 256, 5, 5])

FCOS -- FPN:
P3.shape torch.Size([1, 256, 136, 104])
P4.shape torch.Size([1, 256, 68, 52])
P5.shape torch.Size([1, 256, 34, 26])
P6.shape torch.Size([1, 256, 17, 13])
P7.shape torch.Size([1, 256, 9, 7])

Is this why I am getting different results, i.e. because of the different input image size? Please share your thoughts on this.

feiyuhuahuo commented 2 years ago

FCOS and Yolact downsample the original image by the same strides, so the feature map shapes certainly differ when the input image sizes differ.
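To make that concrete, a quick sketch (not code from this repo) showing that the same strides applied to the two input sizes above reproduce the feature map shapes listed earlier:

```python
import math

strides = [8, 16, 32, 64, 128]

def fpn_shapes(h, w):
    # each FPN level downsamples the input by its stride (ceil covers the border pixels)
    return [(math.ceil(h / s), math.ceil(w / s)) for s in strides]

print(fpn_shapes(544, 544))   # [(68, 68), (34, 34), (17, 17), (9, 9), (5, 5)]      -> Yolact FPN
print(fpn_shapes(1088, 832))  # [(136, 104), (68, 52), (34, 26), (17, 13), (9, 7)]  -> FCOS P3-P7
```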

abhigoku10 commented 2 years ago

@feiyuhuahuo thanks for the response. The input image size is also different. 1. How will Yolact behave when a similar input size and backbone feature sizes are used? 2. Why does Yolact use an image size of 544, i.e. a square one and not a rectangular one?

feiyuhuahuo commented 2 years ago
  1. The feature map shapes will be close to those of FCOS, but not equal, due to some padding operations.
  2. This project supports rectangular images too.
abhigoku10 commented 2 years ago

@feiyuhuahuo thanks for the response. I debugged to find the differences between the implementations, and below are my observations:

  1. The minimum input image size in FCOS is 800, but in Yolact it is set to 544.
  2. The feature map w×h in FCOS starts at 136x104 and ends at 9x7, but in Yolact it goes from 68x68 to 5x5.
  3. The strides used in FCOS are strides = [8, 16, 32, 64, 128], which matches what Yolact uses: "fpn_fm_shape = [math.ceil(cfg.img_size / stride) for stride in (8, 16, 32, 64, 128)]"
  4. In FCOS only flip and padding augmentations are done, but in Yolact many augmentations are applied. I also commented out to_01_box(), i.e. the normalization of the bounding boxes (see the sketch after this list). Can you share your thoughts on these points?
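For context, to_01_box just rescales pixel box coordinates into the 0~1 range using the image height/width passed via img.shape[:2]. A minimal sketch of that normalization, assuming (N, 4) boxes in [x1, y1, x2, y2] pixel coordinates (the exact signature and details in this repo may differ):

```python
import numpy as np

# Sketch of the 0~1 box normalization; img_hw is (height, width), i.e. img.shape[:2],
# and boxes is an (N, 4) array of [x1, y1, x2, y2] in pixel coordinates (assumption).
def to_01_box(img_hw, boxes):
    h, w = img_hw
    boxes = boxes.astype('float32').copy()
    boxes[:, [0, 2]] /= w  # normalize x coordinates by image width
    boxes[:, [1, 3]] /= h  # normalize y coordinates by image height
    return boxes
```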
feiyuhuahuo commented 2 years ago

Yes, you've got it right.

abhigoku10 commented 2 years ago

@feiyuhuahuo in which part of the code can we change the input image size to 800? Currently the code resizes to 544 resolution.

feiyuhuahuo commented 2 years ago

If you want to do resizing the way FCOS does, you have to implement it yourself. It is somewhat complicated. If you just want to use a size of 800*800, use --img_size=800.

abhigoku10 commented 2 years ago

@feiyuhuahuo in your implementation the scales and FPN shapes are derived from the default image size of 544: "fpn_fm_shape = [math.ceil(cfg.img_size / stride) for stride in [8, 16, 32, 64, 128]]" and "self.scales = [int(cfg.img_size / 544 * aa) for aa in (24, 48, 96, 192, 384)]". So if the input image size is changed it would not work, right? Can you let me know how to make this generic so that the scales are adjusted regardless of the image size? When I try the anchor-free setup with an input resolution of 800x1440 I still end up with the same feature map shapes.
Can you please share your thoughts?

feiyuhuahuo commented 2 years ago

You can see I have included cfg.img_size in those formulas, so the model adapts to different image sizes. You just need to modify img_size in config.py.
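To illustrate, here is a small sketch of how both expressions quoted above react when img_size goes from 544 to 800 (square sizes only, as the current formulas assume):

```python
import math

def fpn_and_scales(img_size):
    # same expressions as quoted from the repo, with cfg.img_size passed in directly
    fpn_fm_shape = [math.ceil(img_size / stride) for stride in (8, 16, 32, 64, 128)]
    scales = [int(img_size / 544 * aa) for aa in (24, 48, 96, 192, 384)]
    return fpn_fm_shape, scales

print(fpn_and_scales(544))  # ([68, 34, 17, 9, 5], [24, 48, 96, 192, 384])
print(fpn_and_scales(800))  # ([100, 50, 25, 13, 7], [35, 70, 141, 282, 564])
```

So a single --img_size value already rescales both; what it does not cover is a non-square [h, w] pair like the 800x1440 case mentioned above.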

abhigoku10 commented 2 years ago

@feiyuhuahuo so generally, should img_size be given in the config as a list like [800, 1440]? Then the self.scales formula, cfg.img_size / 544, should also change, right? What value should be used there, and how should the aa values be adapted?