default boxes' specs for different input size

qfgaohao / pytorch-ssd

MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for retraining on Open Images dataset. ONNX and Caffe2 support. Experiment Ideas like CoordConv.

https://medium.com/@smallfishbigsea/understand-ssd-and-implement-your-own-caa3232cd6ad

MIT License


default boxes' specs for different input size #156

Closed by xmba15 2 years ago

xmba15 commented 3 years ago

@qfgaohao Thank you for the project. I came here from the jetson_inference repo (which uses your project to train SSD models to run on Jetson boards). I think some people (including me) are interested in training models on a larger input size for better accuracy, which the current repo does not support.

I created scripts that automatically generate the default boxes' specs in this branch: https://github.com/xmba15/pytorch-ssd/commit/bc0d3cccf49a7342010b3e42831bb97ac8ad6ff1

I am wondering if you are okay with a PR. My current approach needs to introduce a parameter get_feature_map_size into the forward function of the SSD(nn.Module) class, because I need to get the sizes of the feature maps automatically, as can be seen here: https://github.com/xmba15/pytorch-ssd/blob/arbitary_image_size/vision/ssd/ssd.py#L40

Please reply if you have time to take a look and are okay with this approach. I will then create a PR.
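For readers following along, here is a minimal sketch of how per-layer specs could be derived for an arbitrary input size. It assumes the linear scale rule from the SSD paper (s_min = 0.2, s_max = 0.9). The `Spec` tuple, the `generate_specs` name, and the feature-map sizes in the usage line are illustrative assumptions and not the code in the linked branch; in practice the feature-map sizes would be read from the network itself.

```python
from collections import namedtuple

# Hypothetical container for one feature map's settings (not the repo's own type).
Spec = namedtuple(
    "Spec", ["feature_map_size", "shrinkage", "min_size", "max_size", "aspect_ratios"]
)


def generate_specs(image_size, feature_map_sizes, s_min=0.2, s_max=0.9,
                   aspect_ratios=(2, 3)):
    """Build one spec per feature map using linearly spaced box scales."""
    m = len(feature_map_sizes)
    step = (s_max - s_min) / (m - 1) if m > 1 else 0.0
    specs = []
    for k, fm in enumerate(feature_map_sizes):
        # Box sizes in pixels: this layer's scale and the next layer's scale.
        min_size = (s_min + step * k) * image_size
        max_size = (s_min + step * (k + 1)) * image_size
        # Number of input pixels covered by one feature-map cell.
        shrinkage = image_size / fm
        specs.append(
            Spec(fm, shrinkage, round(min_size), round(max_size), list(aspect_ratios))
        )
    return specs


# Assumed feature-map sizes for a 512x512 input; real values depend on the backbone.
specs = generate_specs(512, [32, 16, 8, 4, 2, 1])
for spec in specs:
    print(spec)
```

Each layer's `max_size` equals the next layer's `min_size`, so the box scales tile the range between `s_min` and `s_max` without gaps.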

Neltherion commented 2 years ago

@xmba15 Hi! I have a question regarding the specs, which I already asked Dusty back at the jetson_inference repo: how can I change the SSD anchors to better detect specific objects?

Since you wrote a script to generate the specs, is it possible for you to answer my question regarding the specs and how they work? My main goal is to understand them so that I can change the specs to better work with License Plates.

Thanks.

xmba15 commented 2 years ago

Hi @Neltherion, I don't have much time right now to write everything in detail, so all I can do is recommend some literature to read.

Here is the original paper on which this repo is based: https://arxiv.org/abs/1512.02325

Here is a thorough explanation of the algorithm: https://jonathan-hui.medium.com/ssd-object-detection-single-shot-multibox-detector-for-real-time-processing-9bd8deac0e06

Your original image will be resized (distorted) to a smaller size (300x300 is the default resolution in the paper), so the aspect ratio of the real-world objects (plates in this case) will not be preserved anyway. So I don't think it is a good idea to hand-craft the aspect ratios yourself. But you can change the anchors. To understand the anchors, you need to read the related material (some of which I posted above).
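To make the distortion concrete, here is a small sketch of the arithmetic. The function name and the plate and frame dimensions are hypothetical, chosen only for illustration, and it assumes the image is squashed to a square without letterboxing.

```python
def aspect_ratio_after_resize(box_w, box_h, image_w, image_h, target=300):
    """Width/height ratio of a box once the image is squashed to target x target."""
    new_w = box_w * target / image_w
    new_h = box_h * target / image_h
    return new_w / new_h


# Hypothetical numbers: a 200x50 px plate (4:1) in a 1920x1080 frame.
ratio = aspect_ratio_after_resize(200, 50, 1920, 1080)
print(round(ratio, 2))  # 2.25
```

The target size cancels out, so the ratio the network sees is the original ratio multiplied by image_h / image_w (here 4 * 1080 / 1920 = 2.25).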

However, you can always use the code in this branch of mine to generate the anchors: https://github.com/xmba15/pytorch-ssd/commit/bc0d3cccf49a7342010b3e42831bb97ac8ad6ff1
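As a rough illustration of what the anchors are, the sketch below lays out default boxes for a single feature map following the formulation in the SSD paper. It is not taken from the repo or from the linked branch; the `default_boxes` name and the numbers in the usage line are illustrative assumptions.

```python
import math
from itertools import product


def default_boxes(image_size, feature_map_size, min_size, max_size, aspect_ratios):
    """Return (cx, cy, w, h) boxes, normalized to [0, 1], for one feature map."""
    boxes = []
    small = min_size / image_size
    big = math.sqrt(min_size * max_size) / image_size
    for row, col in product(range(feature_map_size), repeat=2):
        # Box centers sit in the middle of each feature-map cell.
        cx = (col + 0.5) / feature_map_size
        cy = (row + 0.5) / feature_map_size
        boxes.append((cx, cy, small, small))  # small square
        boxes.append((cx, cy, big, big))      # larger square
        for ratio in aspect_ratios:
            r = math.sqrt(ratio)
            boxes.append((cx, cy, small * r, small / r))  # wide box
            boxes.append((cx, cy, small / r, small * r))  # tall box
    return boxes


# Illustrative values: 300x300 input, 19x19 feature map, box sizes 60 to 105 px.
boxes = default_boxes(300, 19, 60, 105, [2, 3])
print(len(boxes))  # 19 * 19 * 6 = 2166
```

Each cell gets two square boxes plus one wide and one tall box per aspect ratio, so the number of boxes per cell is 2 + 2 * len(aspect_ratios).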

xmba15 commented 2 years ago

Closing due to no response from the repo's author.