yijingru / BBAVectors-Oriented-Object-Detection

[WACV2021] Oriented Object Detection in Aerial Images with Box Boundary-Aware Vectors
MIT License

RGB vs BGR images for DOTA #98

Open batic opened 3 years ago

batic commented 3 years ago

Dear @yijingru

According to the torchvision docs, all the ResNet models were trained on normalised RGB images:

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

As far as I can tell, the DOTA dataset loader uses cv2.imread, which returns images in BGR channel order, so the images are fed into the pretrained network as BGR data without the ImageNet normalisation.

Checking the code I did not see any channel re-ordering or normalisation, but I might have missed something. Could you please confirm if the DataLoader for DOTA should be fixed?
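
For reference, here is a minimal sketch of the preprocessing in question. It is not code from this repository: `bgr_to_normalized_rgb` is a hypothetical helper name, and it assumes the input is an H x W x 3 uint8 array in BGR order, which is the layout cv2.imread returns. The mean and std values are the ones quoted from the docs above.

```python
import numpy as np

# ImageNet statistics quoted above, given in RGB channel order.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def bgr_to_normalized_rgb(image_bgr: np.ndarray) -> np.ndarray:
    """Hypothetical helper (not part of the repository).

    Takes an H x W x 3 uint8 image in BGR order and returns a
    3 x H x W float32 array in RGB order, scaled to [0, 1] and then
    normalised with the ImageNet mean and std.
    """
    # Reverse the channel axis: BGR -> RGB.
    image_rgb = image_bgr[:, :, ::-1]
    # Scale pixel values from [0, 255] to [0, 1].
    scaled = image_rgb.astype(np.float32) / 255.0
    # Per-channel normalisation; the stats broadcast over H and W.
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    # Channels-last (H, W, 3) -> channels-first (3, H, W).
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))
```

Something like this would be applied to the array returned by cv2.imread before it is turned into a tensor. Where exactly that belongs in the DOTA loader is left open here.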

yijingru commented 3 years ago

Hi, I didn't do re-ordering or normalization because I think it is fine with Batch Normalization. But your point is a good one: it would make training more stable and may reduce the NaN loss problem. Thanks!