qfgaohao / pytorch-ssd

MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for retraining on Open Images dataset. ONNX and Caffe2 support. Experiment Ideas like CoordConv.
https://medium.com/@smallfishbigsea/understand-ssd-and-implement-your-own-caa3232cd6ad
MIT License

How to modify the image_size in the ssd/config directory #22

Closed: wangnamu closed this issue 5 years ago

wangnamu commented 5 years ago

Hello @qfgaohao, I want to reduce the image_size in the mobilenetv1_ssd_config.py file, for example from 300 to 100. If I change it, do I also need to modify specs?

    specs = [
        SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
        SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
        SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
        SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
        SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
        SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3])
    ]

Thank you.

wangnamu commented 5 years ago

:)

qfgaohao commented 5 years ago

Hi @wangnamu , you can modify mobilenetv1_ssd_config.py and mobilenetv1_ssd.py to use a new size. It's not very straightforward at the moment, as you have to coordinate the generated anchors/priors and the classification head. Understanding the process of anchor/prior generation can also help. Btw, if you aim to increase the speed, using a lighter base net might be easier and better than using smaller input. For example, using a mobile net with width_multi 0.5 will significantly increase the speed.
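To make the coordination between image_size, specs, and the prediction heads more concrete, here is a minimal sketch of SSD-style prior generation in plain Python. It is a simplified illustration and not necessarily identical to the repository's code: the namedtuple field names (`feature_map_size`, `shrinkage`, `box_sizes`, `aspect_ratios`) and the helper names `generate_priors` and `boxes_per_location` are assumptions made for this example. The spec values are the ones quoted in the question above.

```python
import collections
import itertools
import math

# Field names are assumptions for this sketch; the values come from the
# specs list quoted earlier in this thread.
SSDBoxSizes = collections.namedtuple("SSDBoxSizes", ["min", "max"])
SSDSpec = collections.namedtuple(
    "SSDSpec", ["feature_map_size", "shrinkage", "box_sizes", "aspect_ratios"]
)


def boxes_per_location(spec):
    """Two square boxes plus two boxes per aspect ratio at every cell."""
    return 2 + 2 * len(spec.aspect_ratios)


def generate_priors(specs, image_size):
    """Return [cx, cy, w, h] priors in relative (0..1) image coordinates."""
    priors = []
    for spec in specs:
        # How many cells of this stride span the image (may be fractional).
        scale = image_size / spec.shrinkage
        for j, i in itertools.product(range(spec.feature_map_size), repeat=2):
            cx = (i + 0.5) / scale
            cy = (j + 0.5) / scale
            small = spec.box_sizes.min / image_size
            big = math.sqrt(spec.box_sizes.min * spec.box_sizes.max) / image_size
            priors.append([cx, cy, small, small])  # small square box
            priors.append([cx, cy, big, big])      # larger square box
            for ratio in spec.aspect_ratios:
                r = math.sqrt(ratio)
                priors.append([cx, cy, small * r, small / r])  # wide box
                priors.append([cx, cy, small / r, small * r])  # tall box
    # Clamp every coordinate into [0, 1].
    return [[min(max(v, 0.0), 1.0) for v in p] for p in priors]


specs_300 = [
    SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
    SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
    SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
    SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
    SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
    SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3]),
]

priors = generate_priors(specs_300, image_size=300)
print(len(priors))
```

In this sketch, the first field of each spec has to equal the spatial size of the feature map the network actually produces at the chosen input size, and the number of boxes per location has to equal what each classification/regression head predicts per cell. If image_size changes, the feature map sizes change too, so the feature map sizes, shrinkage values, and box sizes in specs all need to be revisited together.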

wangnamu commented 5 years ago

I will try a lighter base network, thank you very much.

jperezrua commented 5 years ago

@qfgaohao Hello there! I am interested in trying input images of size 512x512. Can you please point me in the right direction to achieve this?

I am currently using mobile_net_v2 by the way

Thanks a lot!

HongChow commented 5 years ago

> Hi @wangnamu , you can modify mobilenetv1_ssd_config.py and mobilenetv1_ssd.py to use a new size. It's not very straightforward at the moment, as you have to coordinate the generated anchors/priors and the classification head. Understanding the process of anchor/prior generation can also help. Btw, if you aim to increase the speed, using a lighter base net might be easier and better than using smaller input. For example, using a mobile net with width_multi 0.5 will significantly increase the speed.

@qfgaohao Hi, may I ask a possibly stupid question: could we still use the pretrained model if we use a totally different image_size, such as 900*900?

qfgaohao commented 5 years ago

@HongChow It may still help, though not as much as using pretrained models trained on similar data. As far as I know, transfer learning in computer vision does not work as well as in Natural Language Processing. Pretrained models mainly help with weight initialization rather than final representation learning.

phamkhactu commented 4 years ago

Hi @qfgaohao, I want to change the config for car detection. I know the config is set up for 21 classes, but I retrained the model on the UA-DETRAC car dataset. The model performs badly: for example, one car gets 2 or 3 anchor predictions, and cars are not detected well. Thanks in advance.