gyqiang opened this issue 6 years ago
As far as I know, the function get_random_data() does the scaling. It first randomly resizes the image (enlarging or shrinking it and changing the aspect ratio), then crops or pads the result to a fixed height and width.
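Roughly, that kind of augmentation looks like the sketch below: pick a random scale and aspect ratio, resize, paste the result onto a fixed-size canvas (which crops or pads as needed), and transform the boxes the same way. This is a minimal sketch with my own names and defaults, not the exact get_random_data() from this repo:

```python
import random
from PIL import Image

def random_resize_and_place(image, boxes, target_w=416, target_h=416,
                            jitter=0.3, scale_min=0.25, scale_max=2.0):
    """Randomly rescale `image` (with aspect-ratio jitter), then paste it onto
    a fixed target_w x target_h gray canvas, cropping or padding as needed.
    `boxes` is a list of [x_min, y_min, x_max, y_max] in the original image;
    they get the same transform so they stay aligned with the pixels."""
    iw, ih = image.size

    # pick a random aspect ratio and a random overall scale
    new_ar = (target_w / target_h) * random.uniform(1 - jitter, 1 + jitter) \
             / random.uniform(1 - jitter, 1 + jitter)
    scale = random.uniform(scale_min, scale_max)
    if new_ar < 1:
        nh = int(scale * target_h)
        nw = int(nh * new_ar)
    else:
        nw = int(scale * target_w)
        nh = int(nw / new_ar)
    image = image.resize((nw, nh), Image.BICUBIC)

    # random placement: a negative offset crops, a positive one pads
    dx = int(random.uniform(0, target_w - nw))
    dy = int(random.uniform(0, target_h - nh))
    canvas = Image.new('RGB', (target_w, target_h), (128, 128, 128))
    canvas.paste(image, (dx, dy))

    # transform the boxes the same way and clip them to the canvas
    out = []
    for x1, y1, x2, y2 in boxes:
        x1 = min(max(x1 * nw / iw + dx, 0), target_w)
        y1 = min(max(y1 * nh / ih + dy, 0), target_h)
        x2 = min(max(x2 * nw / iw + dx, 0), target_w)
        y2 = min(max(y2 * nh / ih + dy, 0), target_h)
        if x2 - x1 > 1 and y2 - y1 > 1:   # drop boxes that were cropped away
            out.append([x1, y1, x2, y2])
    return canvas, out
```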
Thanks for your answer, I understand the reason now. However, I have a new question. The loss function differs between YOLO v1 and YOLO v3, and in this code the position loss uses logistic regression; I don't know whether the official YOLO v3 source code uses the same loss as this code. Do you know the YOLO v3 loss function and its parameters?
I didn't check the loss in the v1 version, but the v3 code does use logistic regression. I think it is because the position target is scaled to [0, 1], so the logistic (sigmoid) keeps the prediction from going out of bounds.
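For reference, this is roughly how the v3 paper decodes a raw prediction: the center offsets go through a sigmoid so they stay inside their grid cell, while width and height scale the anchor exponentially. A small NumPy sketch (the names are mine):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(t_xy, t_wh, cell_xy, anchor_wh, grid_size, input_size):
    """Decode one raw prediction the way the YOLOv3 paper does:
    the center offset goes through a sigmoid, so it stays in [0, 1]
    relative to its grid cell; width/height scale the anchor exponentially."""
    b_xy = (sigmoid(t_xy) + cell_xy) / grid_size   # center in [0, 1] image coords
    b_wh = anchor_wh * np.exp(t_wh) / input_size   # size in [0, 1] image coords
    return b_xy, b_wh

# example: raw offsets in cell (7, 5) of the 13x13 grid, 116x90 anchor, 416 input
b_xy, b_wh = decode_box(np.array([0.2, -0.5]), np.array([0.1, 0.3]),
                        np.array([7.0, 5.0]), np.array([116.0, 90.0]), 13.0, 416.0)
```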
But another thing that confuses me is that get_random_data() applies very aggressive random scaling and aspect-ratio jitter. The default scaling factor is between 0.25 and 2, which means the image can be enlarged up to 2x or shrunk down to 1/4 of its original size. However, the anchors are the k-means result on the original sizes. Why are these anchors still meaningful, then?
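For context, the k-means anchor generation I mean usually looks something like this: cluster the ground-truth widths and heights with 1 - IoU as the distance, as in the YOLO papers. A rough sketch, not this repo's kmeans script:

```python
import numpy as np

def iou_wh(boxes_wh, anchors_wh):
    """IoU between boxes and anchors compared by width/height only
    (both treated as if they shared the same center)."""
    inter = (np.minimum(boxes_wh[:, None, 0], anchors_wh[None, :, 0]) *
             np.minimum(boxes_wh[:, None, 1], anchors_wh[None, :, 1]))
    area_b = boxes_wh[:, 0] * boxes_wh[:, 1]
    area_a = anchors_wh[:, 0] * anchors_wh[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)

def kmeans_anchors(boxes_wh, k=9, iters=100, seed=0):
    """Cluster ground-truth (w, h) pairs using 1 - IoU as the distance,
    which is what the YOLO papers use instead of Euclidean distance."""
    rng = np.random.default_rng(seed)
    anchors = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes_wh, anchors), axis=1)
        new = np.array([boxes_wh[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```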
I hadn't paid attention to the situation you describe. I'll give my point of view, but it could be wrong. I think the closer the anchor size is to the target box, the better the result. The scaling itself shouldn't matter much, because it is only used during training as augmentation to improve the model's ability to generalize.
I think you are absolutely right that the anchor sizes should be close to the target boxes, and that is why k-means is used to generate the anchors. But unless I missed something in the implementation, during scaling the image and the target boxes are resized while the anchors are kept the same. Say a small target box of 12x12 pixels becomes 3x3 pixels after random scaling; there would then be no anchor at a similar scale. This might not hurt the performance, but I still have some doubts.
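To make that concrete, here is a quick check of the best-matching-anchor IoU before and after such a downscale, using the standard COCO anchor set as an example (the actual anchors may differ):

```python
import numpy as np

# default YOLOv3 anchors (w, h) in pixels, used here just as an example set
anchors = np.array([(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                    (59, 119), (116, 90), (156, 198), (373, 326)], dtype=float)

def best_anchor_iou(w, h):
    """Best width/height IoU (shared center) between a box and the anchors."""
    inter = np.minimum(w, anchors[:, 0]) * np.minimum(h, anchors[:, 1])
    union = w * h + anchors[:, 0] * anchors[:, 1] - inter
    return (inter / union).max()

print(best_anchor_iou(12, 12))   # ~0.78: the 12x12 box matches the 10x13 anchor well
print(best_anchor_iou(3, 3))     # ~0.07: after a 4x downscale no anchor is close
```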
I understand what you mean, but my knowledge is limited and I don't know how to explain it well. Possibly, even though the anchors are not changed, it has little effect: the scaling goes both ways around the original size, and our anchors sit in the middle of that range, just as an anchor obtained by clustering is not exactly the same size as any box but lies at the center of a cluster.
However, as you say, the zoom range may be too large. You could test the effect of the scaling range on the test results.
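If you want to try that, a low-effort comparison could look like the sketch below. It assumes the common keras-yolo3 helper get_random_data(annotation_line, input_shape, random=...); note that the 0.25-2 scale range is hard-coded inside that function, so narrowing it means editing the function body. Check your copy:

```python
# Hypothetical quick comparison: train/evaluate with and without the random scaling.
from yolo3.utils import get_random_data   # assumed helper from keras-yolo3

input_shape = (416, 416)
line = 'img.jpg 50,60,120,140,0'   # hypothetical annotation line: path x1,y1,x2,y2,class

# augmented sample: random scale / aspect ratio / placement / color jitter
img_aug, boxes_aug = get_random_data(line, input_shape, random=True)

# plain sample: letterbox resize only, boxes keep their scale relative to the anchors
img_plain, boxes_plain = get_random_data(line, input_shape, random=False)

# train one run with each setting (or with a narrowed scale range)
# and compare validation mAP to see how much the aggressive scaling matters
```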
https://github.com/YunYang1994/tensorflow-yolov3 hope it helps you
Thank you very much for your contribution! But there is one thing I don't quite understand: I see that your code only crops the image randomly, but does not scale the image before training. Is my understanding wrong?