Why do we reverse the final dim of the image in "prep_image"?

ayooshkathuria / YOLO_v3_tutorial_from_scratch

Accompanying code for Paperspace tutorial series "How to Implement YOLO v3 Object Detector from Scratch"


In this tutorial, there is a function that prepares the image as below:

```python
import cv2
import torch

def prep_image(img, inp_dim):
    """
    Prepare image for inputting to the neural network.

    Returns a Variable
    """

    img = cv2.resize(img, (inp_dim, inp_dim))
    img = img[:,:,::-1].transpose((2,0,1)).copy()
    img = torch.from_numpy(img).float().div(255.0).unsqueeze(0)
    return img
```

We use this line to reverse the final dim of `img` and transpose it: `img = img[:,:,::-1].transpose((2,0,1)).copy()`. I know that we transpose it because we want the channels in RGB order. But why do we need to reverse it first?
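For reference, here is a small NumPy-only sketch of what each step of that line does, as far as I can tell. It assumes the image was loaded with OpenCV, which stores channels in BGR order with shape height x width x channels. The 2x2 array `img_bgr` and its pixel values are made up for illustration. Under that assumption, the reversal is what swaps BGR to RGB, while the transpose only moves the channel axis to the front (H x W x C to C x H x W), which is the layout PyTorch layers expect.

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV's layout: H x W x C, channels ordered B, G, R.
img_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
img_bgr[..., 0] = 10  # blue channel
img_bgr[..., 1] = 20  # green channel
img_bgr[..., 2] = 30  # red channel

# Step 1: reverse the last axis, so channel order becomes R, G, B (shape unchanged).
img_rgb = img_bgr[:, :, ::-1]

# Step 2: move the channel axis to the front, H x W x C -> C x H x W.
img_chw = img_rgb.transpose((2, 0, 1))

# Step 3: copy into a fresh contiguous array; the reversed view has a negative
# stride, which torch.from_numpy does not accept.
img_chw = img_chw.copy()

print(img_rgb[0, 0])   # [30 20 10]
print(img_chw.shape)   # (3, 2, 2)
```

So the two operations seem to do different jobs: dropping the `[:,:,::-1]` would still give a C x H x W tensor, just with the red and blue channels swapped.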


Why do we reverse the final dim of the image in "prep_image"? #66