hizhangp / yolo_tensorflow

Tensorflow implementation of YOLO, including training and test phase.
MIT License

Two questions about three lines of code. #55

XiangqianMa closed 5 years ago

XiangqianMa commented 6 years ago

    image = cv2.resize(image, (self.image_size, self.image_size))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
    image = (image / 255.0) * 2.0 - 1.0

I have two questions about these three lines of code.

  1. After the resize operation, why do we need to call cv2.cvtColor?

  2. What does the third line do?

I want to train on my own data, but my samples have very low resolution. After I apply these three operations, the result is terrible, so I would like to just resize my samples and skip the last two operations. After reading your code, though, I can't tell whether removing them would affect anything. (I am not a native English speaker; thanks for your answer.)

Madi200 commented 5 years ago

1- image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32): An image has three channels: R for red, G for green and B for blue. When we read an image with cv2 (imageMatrix = cv2.imread(imagePath)), the channels are stored in BGR order: blue comes first, followed by green and then red. The line above converts the image from BGR to RGB and casts it to float32.
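To make the channel swap concrete, here is a minimal sketch that uses only NumPy. The tiny `bgr` array is made-up example data, and the reversed slice stands in for `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)`, which gives the same channel order for a 3-channel image; it is not the repository's code.

```python
import numpy as np

# Hypothetical 1x2 image in OpenCV's BGR order:
# first pixel is pure blue, second pixel is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels (G stays in the middle).
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # blue pixel in RGB order: [0, 0, 255]
print(rgb[0, 1].tolist())  # red pixel in RGB order: [255, 0, 0]
```

If this step is skipped, red and blue are simply exchanged in what the network sees, so whatever order is chosen should be the same at training time and at test time.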

2- image = (image / 255.0) * 2.0 - 1.0: This line just normalizes the data, rescaling pixel values from [0, 255] to [-1, 1], which helps optimization algorithms such as gradient descent or Adam converge faster.
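The scaling can be checked on a couple of pixel values. The sketch below is only an illustration: `normalize` and `denormalize` are hypothetical helper names, not functions from the repository. The inverse mapping is included because a normalized float image will look wrong if it is displayed or saved directly, which may be one reason a result appears "terrible" when inspected by eye.

```python
import numpy as np

def normalize(image):
    """Map pixel values from [0, 255] to [-1.0, 1.0] (same formula as above)."""
    return (image.astype(np.float32) / 255.0) * 2.0 - 1.0

def denormalize(image):
    """Hypothetical inverse: map [-1.0, 1.0] back to uint8 in [0, 255] for viewing."""
    return np.round((image + 1.0) / 2.0 * 255.0).astype(np.uint8)

pixels = np.array([0, 255], dtype=np.uint8)
scaled = normalize(pixels)
print(scaled.tolist())               # [-1.0, 1.0]
print(denormalize(scaled).tolist())  # [0, 255]
```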

The first step does not hurt training of the model, and the second does not hugely impact the training process.

XiangqianMa commented 5 years ago

@Madi200 Thanks for your answer!