qfgaohao / pytorch-ssd

MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for retraining on Open Images dataset. ONNX and Caffe2 support. Experiment Ideas like CoordConv.
https://medium.com/@smallfishbigsea/understand-ssd-and-implement-your-own-caa3232cd6ad
MIT License
1.39k stars, 530 forks

Facing issue on detecting Fenders in automobile through SSD + MobilenetV2 #151

Closed jaidee-coder007 closed 3 years ago

jaidee-coder007 commented 3 years ago

Hi Team,

I am working with an SSD + MobileNetV2 architecture to detect vehicle sub-parts such as the bumper, fender, and quarter panel.

I am using almost the same architecture as in your repo, https://github.com/qfgaohao/pytorch-ssd. I can detect some parts, such as the front bumper (about 50% AP) and the rear bumper (about 90% AP). But other parts, especially the front and rear fenders (which have fewer distinctive features than the other sub-parts), are proving challenging.

Best average validation loss: 1.95 (regression loss: 0.45, classification loss: ~1.5)

Class-wise AP:

- door: 23.66%
- frontBumper: 41.98%
- frontFender: 0.48%
- hood: 38.65%
- rearBumper: 94.74%
- rearFender: 0.74%

mAP: 33.38%
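As a quick sanity check, the reported mAP is the unweighted mean of the six class-wise APs. A minimal sketch (the names `class_ap`, `mean_ap`, and `weak_classes` are only for illustration):

```python
# Class-wise AP values (percent) as reported above.
class_ap = {
    "door": 23.66,
    "frontBumper": 41.98,
    "frontFender": 0.48,
    "hood": 38.65,
    "rearBumper": 94.74,
    "rearFender": 0.74,
}

# mAP is the unweighted mean of the per-class APs.
mean_ap = sum(class_ap.values()) / len(class_ap)

# Classes with AP below 1% are effectively not detected at all.
weak_classes = sorted(name for name, ap in class_ap.items() if ap < 1.0)

print(f"mAP: {mean_ap:.2f}%")
print("near-zero classes:", weak_classes)
```

The two fender classes contribute almost nothing to the mean, so they account for most of the gap to the target.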

Architecture changes: I commented out the first two layers in the classification and regression headers:

    classification_headers = ModuleList([
        # SeperableConv2d(in_channels=round(576 * width_mult), out_channels=6 * num_classes, kernel_size=3, padding=1),
        # SeperableConv2d(in_channels=1280, out_channels=6 * num_classes, kernel_size=3, padding=1),
        SeperableConv2d(in_channels=512, out_channels=prior_count * num_classes, kernel_size=3, padding=1),
        SeperableConv2d(in_channels=256, out_channels=prior_count * num_classes, kernel_size=3, padding=1),
        SeperableConv2d(in_channels=256, out_channels=prior_count * num_classes, kernel_size=3, padding=1),
        Conv2d(in_channels=64, out_channels=prior_count * num_classes, kernel_size=1),
    ])

    regression_headers = ModuleList([
        # SeperableConv2d(in_channels=round(576 * width_mult), out_channels=6 * 4,
        #                 kernel_size=3, padding=1, onnx_compatible=False),
        # SeperableConv2d(in_channels=1280, out_channels=6 * 4, kernel_size=3, padding=1, onnx_compatible=False),
        SeperableConv2d(in_channels=512, out_channels=prior_count * 4, kernel_size=3, padding=1, onnx_compatible=False),
        SeperableConv2d(in_channels=256, out_channels=prior_count * 4, kernel_size=3, padding=1, onnx_compatible=False),
        SeperableConv2d(in_channels=256, out_channels=prior_count * 4, kernel_size=3, padding=1, onnx_compatible=False),
        Conv2d(in_channels=64, out_channels=prior_count * 4, kernel_size=1),
    ])

I can share more details on any of the above if required.

Took 4 prior specs in generate_ssd_priors (the first two are commented out):

    specs = [
        # SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
        # SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
        SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
        SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
        SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
        SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3]),
    ]
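To show what dropping the first two specs does to the number of default boxes, here is a rough count. It assumes that the repo's `generate_ssd_priors` places two square boxes (sizes `min` and `sqrt(min * max)`) plus two boxes per aspect ratio (the ratio and its inverse) on every feature-map cell. The tuples and the `priors_per_cell` / `count_priors` helpers below are stand-ins written for this sketch, not imports from the repo.

```python
from collections import namedtuple

# Stand-ins for the repo's config tuples, redefined so the sketch is self-contained.
SSDBoxSizes = namedtuple("SSDBoxSizes", ["min", "max"])
SSDSpec = namedtuple(
    "SSDSpec", ["feature_map_size", "shrinkage", "box_sizes", "aspect_ratios"]
)


def priors_per_cell(spec):
    # Assumption: 2 square boxes + 2 boxes (ratio and 1/ratio) per aspect ratio.
    return 2 + 2 * len(spec.aspect_ratios)


def count_priors(specs):
    # Every cell of a square feature map carries the same number of priors.
    return sum(s.feature_map_size ** 2 * priors_per_cell(s) for s in specs)


full_specs = [
    SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
    SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
    SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
    SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
    SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
    SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3]),
]
kept_specs = full_specs[2:]  # the four specs left uncommented above

print("priors per cell:", priors_per_cell(kept_specs[0]))
print("total priors, all six specs:", count_priors(full_specs))
print("total priors, four kept specs:", count_priors(kept_specs))
print("smallest kept box size:", min(s.box_sizes.min for s in kept_specs))
```

Under that assumption each cell carries 6 priors, which is the value `prior_count` in the headers would need to match, and the total falls from 3000 priors to 234. The smallest remaining base box size is 150, in the same units as `SSDBoxSizes`.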

Hyperparameters:

- LR schedule: multi-step, milestones at 80, 100, 120, 150
- Batch size: 128
- Validation check: every 5 epochs
- Base net: MobileNetV2
- Optimizer: SGD, momentum values used: 0.9 and 0.5
- Not freezing any layer, hence fine-tuning the whole network from scratch
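For reference, a multi-step schedule multiplies the learning rate by a fixed factor each time a milestone is reached. The sketch below evaluates that closed form for the milestones above. The base learning rate (0.01) and decay factor `gamma` (0.1) are assumed values, since neither is given in this post, and `multistep_lr` is a hypothetical helper, not a function from the repo.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at `epoch`: base_lr decayed by gamma once per milestone reached."""
    passed = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** passed


milestones = [80, 100, 120, 150]
base_lr = 0.01  # assumed, not stated in the post
gamma = 0.1     # assumed, not stated in the post

for epoch in (0, 79, 80, 100, 120, 150):
    print(epoch, multistep_lr(base_lr, milestones, gamma, epoch))
```

With four milestones and `gamma = 0.1`, the rate from epoch 150 onward is the base rate times 10^-4, so very little learning happens after the last milestone under these assumed values.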

Data augmentation: geometric expansion and random image crop. I avoided photometric changes because some sub-parts do not have enough distinguishing features. I am also not using RandomMirror.

Our objective is to keep the model size under 12-13 MB while reaching at least 75% AP for every class, so that we can launch the model on our auto platform. Your suggestions are very welcome!

Kindly help as soon as possible; we need to push it to a production-ready platform.