Megvii-BaseDetection / YOLOX

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Apache License 2.0

Doubt regarding training data image size #636

Open debwhat opened 3 years ago

debwhat commented 3 years ago

I am creating some custom training data. The image size varies: some images might be 1800(w)x1200(h) and others 1800(w)x2400(h), etc. Storing the annotations in Pascal VOC format keeps the bounding box information in absolute pixel values based on the input image size. So I have a few doubts.

A) During inference, do all images have to be the same size, or can I use input images of varying dimensions and let YOLOX handle it on its own?

B) During training, does this repository support varying input image sizes out of the box, or do I have to resize the images first to a constant size that matches the final inference input size?

C) How large is the impact on accuracy if the labelled objects from training get squished or stretched somewhat during inference because of the varying image size?

Abigale-Xin commented 2 years ago

I have the same question.

FateScript commented 2 years ago

A) For inference, your input size can vary.

B) For training, your input size should be a multiple of 32.

C) We cannot tell you how large the impact on accuracy will be, but generally speaking, if you train and run inference at different scales/sizes, the accuracy will drop.
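One common way to avoid squishing or stretching objects is to resize each image with its aspect ratio preserved and pad the remainder (letterboxing), scaling the Pascal VOC boxes by the same ratio. The sketch below illustrates only the arithmetic; the helper names (`round_up_to_multiple`, `letterbox_params`, `scale_voc_box`) and the 640x640 target are assumptions chosen for illustration, not functions or defaults taken from this repository.

```python
import math


def round_up_to_multiple(value, base=32):
    """Round a side length up to the nearest multiple of `base`."""
    return int(math.ceil(value / base) * base)


def letterbox_params(img_w, img_h, target_w, target_h):
    """Return (ratio, new_w, new_h) for an aspect-preserving resize.

    The resized image fits inside target_w x target_h; the leftover
    area would be filled with padding rather than stretching the image.
    """
    ratio = min(target_w / img_w, target_h / img_h)
    new_w = round(img_w * ratio)
    new_h = round(img_h * ratio)
    return ratio, new_w, new_h


def scale_voc_box(box, ratio):
    """Scale an absolute-pixel (xmin, ymin, xmax, ymax) box by `ratio`.

    This assumes the padding is added on the right and bottom only,
    so no offset needs to be applied to the coordinates.
    """
    xmin, ymin, xmax, ymax = box
    return (xmin * ratio, ymin * ratio, xmax * ratio, ymax * ratio)


# Hypothetical target size, already a multiple of 32.
target = round_up_to_multiple(640)

# The two image sizes mentioned in the question.
for w, h in [(1800, 1200), (1800, 2400)]:
    ratio, new_w, new_h = letterbox_params(w, h, target, target)
    box = scale_voc_box((900, 600, 1800, 1200), ratio)
    print((w, h), "->", (new_w, new_h), "ratio", round(ratio, 4), "box", box)
```

Because both axes are scaled by the same ratio, a labelled object keeps its proportions in both the 1800x1200 and the 1800x2400 image; only its absolute size changes.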