PythonImageDeveloper opened this issue 5 years ago
@PythonImageDeveloper Hi,
In your case it is better to keep the aspect ratio, so you should train using the original repo https://github.com/pjreddie/darknet
I will implement a flag for training with letter_box later.
In my experience, I usually use models for detection on real-time video cameras, where all frames have the same size and aspect ratio. Also, the training dataset is usually collected from the same video camera, so it isn't required to keep the aspect ratio.
@AlexeyAB This is right when the training and inference images have the same size and aspect ratio, right? If all training images have aspect ratio x and all testing images have aspect ratio y, is it necessary to keep the aspect ratio in both the training and the testing phase?
Then you can just compensate for this by using a different network resolution for training and detection. For example, if the training images are 640x480 (ar_x = ~1.33) and the detection images are 1920x1080 (ar_y = ~1.78), then you can train using width=640 height=480 (aspect ratio ~1.33), and then change to width=448 height=256 (aspect ratio = 1.75) for detection.

The main rules:

ar_x / ar_y of the images should be the same as ar_x / ar_y of the network resolutions.
width= and height= in the cfg-file must be multiples of 32.

@AlexeyAB Hi! If I train at 640×480, will the network still get the 13×13, 26×26, 52×52 three scales of feature maps (FM), or other sizes of FM?
@SHADOWMOOON Hi, it will be 640/32 x 480/32 (and correspondingly at the other strides). Just set 640 x 480, run training and look at it )
@AlexeyAB Thanks!!!
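The two rules above (the ratio of image aspect ratios should match the ratio of network aspect ratios, and cfg resolutions must be multiples of 32) plus the feature-map question can be sketched in a few lines of Python. The function names here are my own illustration, not part of darknet's API:

```python
def yolo_feature_map_sizes(width, height, strides=(32, 16, 8)):
    """Spatial sizes of the three YOLOv3 output scales for a given
    network resolution. width= and height= from the cfg-file must be
    multiples of 32 (the largest stride)."""
    assert width % 32 == 0 and height % 32 == 0, "cfg width/height must be multiples of 32"
    return [(width // s, height // s) for s in strides]


def ratio_rule_holds(train_img, detect_img, train_net, detect_net, tol=0.05):
    """Check AlexeyAB's rule: ar_x / ar_y of the images should match
    ar_x / ar_y of the network resolutions (within a small tolerance,
    since resolutions are constrained to multiples of 32)."""
    ar = lambda wh: wh[0] / wh[1]          # aspect ratio of a (w, h) pair
    img_ratio = ar(train_img) / ar(detect_img)
    net_ratio = ar(train_net) / ar(detect_net)
    return abs(img_ratio - net_ratio) <= tol


# The example from this thread: train on 640x480 video, detect on 1920x1080,
# compensating with a 448x256 detection resolution.
print(yolo_feature_map_sizes(640, 480))   # feature maps when training at 640x480
print(ratio_rule_holds((640, 480), (1920, 1080), (640, 480), (448, 256)))
```

For 640x480 this gives 20x15, 40x30 and 80x60 feature maps instead of the 13x13, 26x26, 52x52 maps you get at 416x416, which matches the answer above.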
Hi @AlexeyAB, I have a question, and the right answer is critical for me. I'm working on an important project with two parts: one is classification and the other is OCR, and I want to solve both with deep learning methods.

According to your experience, which achieves better results: resizing the input images without keeping the aspect ratio (in both training and inference), or adding zero padding to keep the aspect ratio? If the answer is resizing, then in the OCR problem, should the aspect ratio not be kept because of the skewing of the characters?

Suppose the original images have aspect ratio 5 and my model input has aspect ratio 2. To keep the aspect ratio, I have to add zero padding to a big part of each image. Does this method force the CNN to learn that the black part of the image is not relevant? BTW, according to your experience, which method do you prefer to achieve the best result?
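A minimal sketch of the geometry behind the letterbox option being asked about (the function name is hypothetical, not darknet's): scale the image so it fits inside the network input while keeping its aspect ratio, then pad the rest with zeros.

```python
def letterbox_geometry(img_w, img_h, net_w, net_h):
    """Compute the resized size and zero-padding needed to fit an image
    into the network input while keeping its aspect ratio (letterboxing).
    Returns (new_w, new_h, pad_x, pad_y): the scaled image size and the
    padding added on the left/right and top/bottom."""
    scale = min(net_w / img_w, net_h / img_h)  # shrink to fit the tighter dimension
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x = (net_w - new_w) // 2
    pad_y = (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# The scenario from the question: image aspect ratio 5, network input aspect ratio 2.
print(letterbox_geometry(1000, 200, 416, 208))
```

For a 1000x200 image fed into a 416x208 input, the image shrinks to 416x83 and 62 rows of zeros are padded above and below it, so more than half of the input height is black padding. That illustrates the concern in the question: with very different aspect ratios, letterboxing wastes a large part of the network input.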