ultralytics / yolov5

YOLOv5 πŸš€ in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Question Regarding img-size Argument #4191

Closed: joneswilliam1 closed this issue 3 years ago

joneswilliam1 commented 3 years ago

❔ Question

Hi everyone, I was wondering: when I set the img-size argument during training as well as during inference to, say, '640', does that mean 640x640 or 480x640? Similarly, when img-size is set to '416', does that mean 416x416?

I would really appreciate an answer. The reason I ask is that I trained a YOLOv5 model at image size 416 and would like to run inference at the same size; however, I am using an external CSI camera and have to set its width and height manually. Thanks!

Edit: When running detect.py with --img 640 I receive the following log in my terminal: 384x640 Done. (0.521s)
When running it with --img 416 I receive: 256x416 Done. (0.200s)

So does that mean I should set the height to 256 and the width to 416 on my CSI camera when running inference to get optimal performance?

glenn-jocher commented 3 years ago

@joneswilliam1 you are setting the long side of the image with the --img argument. The short side is handled automatically.

Mosaics will train as img x img squares.
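If it helps, here is a rough sketch of how the final inference shape is computed. This is a simplification of the letterbox logic in utils/augmentations.py, assuming the default max stride of 32 and a 16:9 source such as 1280x720, which matches the shapes in your logs:

```python
import math

def rect_shape(h, w, img_size=640, stride=32):
    # Scale so the long side equals img_size, then pad the short
    # side up to the nearest multiple of the model stride.
    # Simplified sketch; the real rounding lives in letterbox().
    r = img_size / max(h, w)
    h, w = round(h * r), round(w * r)
    h = math.ceil(h / stride) * stride
    w = math.ceil(w / stride) * stride
    return h, w

print(rect_shape(720, 1280))                 # (384, 640)
print(rect_shape(720, 1280, img_size=416))   # (256, 416)
```

So with --img 640 a 16:9 frame comes out as 384x640, and with --img 416 as 256x416, exactly as your terminal shows.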

joneswilliam1 commented 3 years ago

@glenn-jocher I see, thank you very much for the clarification. I'm also wondering: if it trains on img x img, then when running detect.py why isn't it also set to img x img, given that "best inference results are obtained at the same img size that the model was trained on"? Thanks again!

glenn-jocher commented 3 years ago

@joneswilliam1 it's very simple: train at --img x and detect at --img x. That's it.
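If you are scripting it rather than using the CLI, one way to keep the sizes matched is the torch.hub interface, where size sets the same long-side value at inference. The checkpoint and image paths below are just placeholders:

```python
import torch

# Load custom-trained weights via torch.hub (path is a placeholder)
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='runs/train/exp/weights/best.pt')

# Run inference with the same long-side size the model was trained at
results = model('image.jpg', size=416)
results.print()
```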

joneswilliam1 commented 3 years ago

When running detect.py with --img 640 I receive the following log in my terminal: 384x640 Done. (0.521s)
When running it with --img 416 I receive: 256x416 Done. (0.200s)

@glenn-jocher that makes sense, but is that not the case here, since it is running inference at the same --img (long side) while the short side is different? My apologies if this question is not very sophisticated, but I am very curious. Thanks.

glenn-jocher commented 3 years ago

@joneswilliam1 I don't understand what you are asking. As I said before, for best results train and detect at the same --img.

joneswilliam1 commented 3 years ago

@glenn-jocher I was referring to the terminal output during inference: it says it is running at an image size of 384x640 instead of 640x640.

glenn-jocher commented 3 years ago

@joneswilliam1 yes, that's rectangular inference. Why would you want to run inference on more pixels than you need?
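For a rough sense of the savings, compare the pixel count of the rectangular shape from your log against a full 640x640 square:

```python
# Pixel counts: rectangular inference (from the log) vs. a naive square
rect, square = 384 * 640, 640 * 640
print(f'{rect} vs {square} pixels ({rect / square:.0%} of the square cost)')
# -> 245760 vs 409600 pixels (60% of the square cost)
```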

joneswilliam1 commented 3 years ago

@glenn-jocher I see, so what's important is that the long side remains the same as what the model was trained on, and inference can then run at a rectangular shape while still obtaining optimal performance. Thank you very much for the clarification, and correct me if I'm wrong.

github-actions[bot] commented 3 years ago

πŸ‘‹ Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 πŸš€ and Vision AI ⭐!