Closed: enchainingrealm closed this issue 5 years ago
@enchainingrealm you can specify your own anchors from k-means results in the 3 YOLO layers of the cfg file: https://github.com/ultralytics/yolov3/blob/9cf5ab0c9d41231148e8f6df23a4797ffa8e6d1a/cfg/yolov3-spp.cfg#L815
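For reference, each of the three `[yolo]` blocks in a darknet cfg carries the full anchor list; only the `mask` indices differ per layer (e.g. `6,7,8`, `3,4,5`, `0,1,2`). A sketch of one block with the stock YOLOv3 anchors — replace the `anchors` values with your own scaled k-means results:

```
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes = 80
num = 9
```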
Remember, though, that these are in units of pixels at the expected inference size, not at the native image size. If you plan on running inference at 416 pixels, for example, then you should multiply your k-means results by 416/1376 ≈ 0.30.
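That rescaling takes a few lines of Python. The anchor values below are made-up k-means results at the 1376 px native width, purely for illustration:

```python
# Scale k-means anchors (computed at native resolution) to the inference size.
native_size = 1376     # longest side of the training images
inference_size = 416   # value you plan to pass as --img-size

# Hypothetical (width, height) cluster means in native pixels
kmeans_anchors = [(120, 80), (300, 210), (640, 400)]

scale = inference_size / native_size  # 416 / 1376 ≈ 0.302
scaled_anchors = [(round(w * scale), round(h * scale)) for w, h in kmeans_anchors]
print(scaled_anchors)
```

The rounded integer pairs are what you paste into the `anchors` line of the cfg file.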
Where do I set the expected inference size?
In the .cfg file I've already set the "width" and "height" hyper-parameters, namely "width=1376" and "height=800". Those are the native dimensions of all my training and test images.
During training, testing, and inference, you always set the image size using the same argparser argument `--img-size`. So if you plan on training at 416, then multiply all your k-means results by 416/1376 as I said before.
https://github.com/ultralytics/yolov3/blob/9cf5ab0c9d41231148e8f6df23a4797ffa8e6d1a/train.py#L310
Thank you for the quick responses. I still have a few questions about the concept of size.
1. If the `--img-size` argument is king, then what is the point of the `width` and `height` hyperparameters in the .cfg files?
2. Suppose I set `--img-size` to 1376. Does the pipeline pad the shorter side of my images to get 1376-by-1376 images?
3. Suppose I set `--img-size` to 800. Does the pipeline scale my images down so that the longer side is 800 pixels, and then pad the shorter side to get 800-by-800 images?

@enchainingrealm the cfg files were created by the darknet authors; they use the `width` and `height` parameters in their own repositories.
I suggest you google letterboxing; it's very simple. You can see examples of this in the README.
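The padding arithmetic behind letterboxing can be sketched in a few lines. This is a simplified illustration of the geometry, not the repo's actual letterbox implementation:

```python
def letterbox_params(w, h, new_size=416):
    """Compute the resize dimensions and padding needed to fit a w x h image
    into a new_size x new_size square without changing its aspect ratio."""
    scale = new_size / max(w, h)              # shrink/grow so the longer side fits
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w, pad_h = new_size - new_w, new_size - new_h  # total padding per axis
    return new_w, new_h, pad_w, pad_h

# A 1376x800 image at --img-size 416: the longer side becomes 416, the shorter
# side scales to ~242, and 174 px of total padding fills out the square.
print(letterbox_params(1376, 800, 416))
```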
Seems like I was understanding the concept of letterboxing correctly (i.e. pad a rectangular image to make it square without changing the aspect ratio of the contents.)
I re-trained on my data after setting the `--img-size` argument to 1376, and the results are now satisfactory.
Thank you for the clarifications. I'm closing this issue now.
Hello, sorry to bother you. I just want to know which code you used to calculate your anchors. I used different k-means implementations and got different results, but neither of them worked well on my data.
@Chida15 I would simply start with the default anchors in yolov3-spp.cfg.
If you want to try kmeans anchors you can use kmeans_targets(): https://github.com/ultralytics/yolov3/blob/58f868a79ad755b68fd75556fb1d946cdd3ab8e5/utils/utils.py#L571-L610
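If the linked helper doesn't run on your PyTorch version, the core idea is plain k-means over (width, height) pairs. Here is a minimal NumPy sketch of that idea — not the repo's `kmeans_targets()`, which layers IoU-based measures on top:

```python
import numpy as np

def kmeans_anchors(wh, k=9, iters=30, seed=0):
    """Naive Lloyd's k-means on (width, height) box dimensions.
    Returns k cluster centers sorted by area, small to large."""
    rng = np.random.default_rng(seed)
    wh = np.asarray(wh, dtype=float)
    centers = wh[rng.choice(len(wh), k, replace=False)]  # random init from the data
    for _ in range(iters):
        # assign every box to its nearest center (Euclidean distance in w-h space)
        dists = ((wh[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        # move each center to the mean of its assigned boxes
        for i in range(k):
            if (labels == i).any():
                centers[i] = wh[labels == i].mean(0)
    # sort by area so anchors run small -> large, as the cfg convention expects
    return centers[centers.prod(1).argsort()]
```

Feed it the (w, h) of every ground-truth box in your training set, then scale the result by your img-size / native-size ratio as discussed earlier in the thread.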
Could I use this code to calculate the yolov3-tiny anchors? With the latest code I encountered the following problem: “AttributeError: 'Tensor' object has no attribute 'T'”.
Thank you very much for your work. I have a question: why do you use the k-means algorithm simply and directly, which is different from the original author's method? @glenn-jocher
@Broad-sky thanks for your question. The choice to use k-means for generating anchors is more of a preference than a rule. It's common practice in the community, and it usually provides good starting points for anchor sizes.
If you're curious about the approach, I suggest looking into the differences in the anchor generation methods and how they might impact the training on different datasets.
Always happy to help!
I'm training and testing YOLOv3 on my own dataset. Every image in my dataset is 1376 width by 800 height.
I run K-Means on my training set to cluster the ground truth bounding box dimensions into 9 clusters. Each cluster mean is a dimension pair (box width, box height). I get 9 cluster means which I use as my anchors.
In my .cfg file, I Ctrl+F "anchors" and I paste in my anchors (I do this for each of the three YOLO layers.)
YOLOv3 performs poorly on both my training and test sets. It detects small objects with high precision but fails to detect large objects. I'm assuming I forgot to scale the anchors in my .cfg file. What's the procedure to define anchors for a custom dataset?