Rotation angle ground truth and image resizing

mattroos commented 6 years ago

I'm concerned that resizing images during training may produce ground truth angles that do not match the actual angles of the boxes in the resized images. E.g., if I compute the ground truth on 480x320 images and then resize them to 320x320, the GT angle will no longer match the angle in the image. If the model were trained consistently in this manner, that'd be okay. But if I have input images of various sizes and resize them to a fixed size (or a set of fixed sizes, as was done for the arXiv paper), then the discrepancy between the GT angle and the angle in the resized image will not be consistent across the training set.

As a more concrete example, consider two images, one 480x320 and one 320x320, both containing a box at 45 degrees (pi/4 radians in the GT text files). After resizing the 480x320 image to 320x320, the width is scaled by 2/3 while the height is unchanged, so the pixel-based box angle becomes atan(1.5), or about 56.3 degrees. Yet during training, both images would have a GT of 45 degrees (unless I've missed some code that updates the GT values based on image resizing), confounding the training.
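
To make the arithmetic above easy to check, here is a minimal Python sketch. It is not taken from the DeepTextSpotter code; the function name `angle_after_resize` and the (width, height) size convention are my own assumptions. It only tracks the direction of one box edge under a non-uniform resize, and ignores the fact that such a resize generally turns a rotated rectangle into a parallelogram.

```python
import math


def angle_after_resize(angle_rad, src_size, dst_size):
    """Return the angle (radians) of a direction after a non-uniform resize.

    Hypothetical helper for illustration only. Sizes are (width, height).
    A direction (cos a, sin a) is scaled to (sx * cos a, sy * sin a).
    """
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    return math.atan2(sy * math.sin(angle_rad), sx * math.cos(angle_rad))


gt_angle = math.pi / 4  # 45 degrees, as stored in the GT text file

# 480x320 resized to 320x320: the angle changes to about 56.3 degrees.
print(math.degrees(angle_after_resize(gt_angle, (480, 320), (320, 320))))

# 320x320 left at 320x320: the angle stays at about 45 degrees.
print(math.degrees(angle_after_resize(gt_angle, (320, 320), (320, 320))))
```

If the resize preserves the aspect ratio, the two scale factors are equal and the angle is unchanged, so the concern only applies when the aspect ratio changes.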

Am I correct that this is a potential problem?

MichalBusta commented 6 years ago

Hi Matt, the related code is here:

https://github.com/MichalBusta/caffe/blob/darknet/src/caffe/layers/ondisk_data_layer.cpp (lines 335 and 454)

There is always room for mistakes, but I think the data reading and the transforms are correct. Michal

mattroos commented 6 years ago

Thanks.

MichalBusta / DeepTextSpotter
