ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Object sizing and Image size #700

Closed. marvision-ai closed this issue 4 years ago.

marvision-ai commented 4 years ago

I am sure this question has been beaten to death, but nothing in the past questions really answered it, so here we go.

❔ Question

Say I have an image of size 3200x1632 (both dimensions divisible by 32, but unusual). When I train with these images, I use --img-size 3200 to set the largest image axis.

How can I specify that it trains on the exact image dimensions (the same way darknet does)? Would this not be the best idea? I need to make sure objects in the image all retain their original size and aspect ratio.

I want to know that I am passing the exact image into the network to train, because those will be exactly the same dimensions when I use the model for inference in the field. (I can't do any resizing, since all images are pixel-calibrated for accurate sizing of bboxes.)

Thank you in advance. I know your responses always help me understand your logic in a great way.
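
To make the calibration constraint concrete, here is a minimal sketch, assuming a hypothetical fixed mm-per-pixel factor for the camera setup; predicted box extents convert to physical size only if the network saw the native pixel grid:

```python
# Illustration of the pixel-calibration constraint described above.
# MM_PER_PX is a hypothetical calibration factor for a fixed camera setup.
MM_PER_PX = 0.125  # assumed: 1 px == 0.125 mm at the working distance

def box_size_mm(x1, y1, x2, y2):
    """Convert a predicted box (pixels, native 3200x1632 coords) to mm."""
    return (x2 - x1) * MM_PER_PX, (y2 - y1) * MM_PER_PX

# Only valid if inference ran at the native resolution; any resize or
# letterbox scaling would require rescaling the coordinates first.
print(box_size_mm(100, 200, 420, 520))  # -> (40.0, 40.0)
```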

glenn-jocher commented 4 years ago

@marvision-ai you can use train.py --rect to omit mosaic, and you can set any image augmentation hyps you want in the data/hyps files.
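
For reference, a sketch of editing the augmentation hyps mentioned above; the path data/hyp.scratch.yaml and the keys below match the hyp files YOLOv5 shipped around this time, but both may differ in your version:

```python
import yaml  # pip install pyyaml

# Load the default training hyperparameters (path may vary by YOLOv5 version).
with open("data/hyp.scratch.yaml") as f:
    hyp = yaml.safe_load(f)

# Zero out the geometric augmentations that would rescale objects,
# since the boxes here are pixel-calibrated.
for k in ("degrees", "translate", "scale", "shear", "perspective"):
    hyp[k] = 0.0

with open("data/hyp.custom.yaml", "w") as f:
    yaml.safe_dump(hyp, f)

# Then: python train.py --img-size 3200 --rect --hyp data/hyp.custom.yaml ...
```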

marvision-ai commented 4 years ago

@glenn-jocher but I would like to do mosaic. I just want to be able to train and detect at specific image dimensions.

Even if I can't use mosaic and have to set --rect, I still don't see how I can set my exact image dims.

glenn-jocher commented 4 years ago

@marvision-ai your options are mosaic or --rect. In addition, you are free to specify any augmentation policy you want in the hyps yamls.

marvision-ai commented 4 years ago

@glenn-jocher got it.

So I will set --img-size to 3200 and --rect and it will keep the height untouched and leave the image as is?

glenn-jocher commented 4 years ago

@marvision-ai yes, --rect is just what it sounds like: rectangular training, with 3200 on the long side and the short side at the minimum viable letterboxed size that meets stride-multiple constraints.

As I said, you can set augmentation policy to whatever you want in the hyps files.
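
To make "minimum viable letterboxed size" concrete, a simplified single-image sketch of the rounding, assuming the usual maximum stride of 32 (the repo computes rect-training shapes per batch, so this is only the core idea):

```python
import math

def rect_shape(w, h, long_side=3200, stride=32):
    """Scale so the long side equals long_side, then round the short
    side up to the nearest stride multiple (the letterboxed shape)."""
    r = long_side / max(w, h)
    return (math.ceil(w * r / stride) * stride,
            math.ceil(h * r / stride) * stride)

# 3200x1632 is already stride-aligned, so the shape is unchanged:
print(rect_shape(3200, 1632))  # -> (3200, 1632)
# A 4000x1900 source would scale to 3200x1520, then pad up to 1536:
print(rect_shape(4000, 1900))  # -> (3200, 1536)
```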

marvision-ai commented 4 years ago

@glenn-jocher I understand that.

Detect does not allow you to specify --rect, only --img-size. So I know it's resizing the height to the minimum viable letterbox, which I assume is a multiple of 32. Is there a way to stop it from doing that, or should I just modify it myself?

Sorry if these are silly questions. I am just used to specifying WxH in other applications, but for this one application I'm at a loss as to how calibration will work on the final inference image if I leave the resizing up to the script.

Maybe I am just looking at this wrong, or maybe it's a case of the Mondays...

glenn-jocher commented 4 years ago

@marvision-ai I don't know what you're asking. I recommend you detect at the same size you trained on.
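
For anyone tracing what detect.py does here: inference resizing happens in a letterbox step that scales the long side to --img-size and pads the short side up to a stride multiple. A simplified sketch (not the repo's exact letterbox function) showing why a 3200x1632 input at --img-size 3200 passes through untouched:

```python
def letterbox_shape(w, h, img_size=3200, stride=32):
    """Simplified model of YOLOv5's inference-time letterbox:
    scale so the long side fits img_size, then pad each axis
    up to the next stride multiple."""
    r = min(img_size / w, img_size / h)          # scale ratio
    new_w, new_h = round(w * r), round(h * r)    # resized (unpadded) shape
    pad_w = (-new_w) % stride                    # padding to stride multiple
    pad_h = (-new_h) % stride
    return new_w + pad_w, new_h + pad_h

# Dimensions already divisible by 32 at the trained size: no resize, no pad.
print(letterbox_shape(3200, 1632))  # -> (3200, 1632)
```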

surprisedong commented 4 years ago

@glenn-jocher I have images of size 256x320. If I want to set the model input size to 256x320 when training, should I just use --img-size 320 and --rect?

glenn-jocher commented 4 years ago

@surprisedong yes, that's correct. If you just use --img 320 --rect, the rectangles will be 320x256.
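
The same arithmetic checks out for this case, assuming stride 32: the long side is already 320 and 256 is a stride multiple, so nothing is resized or padded:

```python
w, h = 320, 256               # the 256x320 images, long side first
r = 320 / max(w, h)           # --img 320 scale ratio -> 1.0
new_w, new_h = round(w * r), round(h * r)
assert new_w % 32 == 0 and new_h % 32 == 0  # already stride-aligned
print(new_w, new_h)           # -> 320 256, i.e. 320x256 rectangles
```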