ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

how to use K-means++ instead of K-means for anchor box optimization #10661

Closed gjgjos closed 1 year ago

gjgjos commented 1 year ago


Question

I want to use K-means++ instead of K-means for anchor box optimization. Is there a guide for switching autoanchor from K-means to K-means++?


gjgjos commented 1 year ago

Help me please...!

jl749 commented 1 year ago

You can try kmeans2 instead of kmeans at this line: https://github.com/ultralytics/yolov5/blob/064365d8683fd002e9ad789c1e91fa3d021b44f0/utils/autoanchor.py#L84

Set minit="++" to enable k-means++ initialization. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.vq.kmeans2.html#scipy.cluster.vq.kmeans2
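A minimal sketch of the suggestion above, for anyone who wants to try it outside the repo first. The function name `kmeanspp_anchors` and the synthetic `wh` data are made up for illustration. The whitening by the per-dimension standard deviation is an assumption about how the box sizes are scaled before clustering, not a copy of the YOLOv5 code. Note that `kmeans2` returns `(centroids, labels)`, and that `minit="++"` needs a reasonably recent SciPy.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def kmeanspp_anchors(wh, n=9, iters=30):
    """Cluster (width, height) pairs into n anchors using k-means++ seeding."""
    wh = np.asarray(wh, dtype=float)
    s = wh.std(0)  # per-dimension sigma, used to whiten the data before clustering
    # kmeans2 returns (centroids, labels); minit="++" selects k-means++ initialization
    centroids, _ = kmeans2(wh / s, n, iter=iters, minit="++")
    k = centroids * s  # undo the whitening
    return k[np.argsort(k.prod(1))]  # sort anchors from small to large area


# Synthetic box sizes in pixels (illustrative only, not real label data)
rng = np.random.default_rng(0)
wh = rng.uniform(10, 300, size=(500, 2))
anchors = kmeanspp_anchors(wh)
print(anchors.round(1))
```

Only the seeding changes here. The iterations after initialization are ordinary k-means with Euclidean distance.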

github-actions[bot] commented 1 year ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

nortorious commented 1 year ago

Can you tell me where the minit parameter is? Thanks.

glenn-jocher commented 1 year ago

@nortorious the minit parameter is not explicitly mentioned in the YOLOv5 code, but it is set by default to 'random' in the kmeans2 function from SciPy. This means that K-means++ initialization is already used behind the scenes. Therefore, there is no need to specify minit="++" as it is already taken care of in the code.

Let me know if you have any more questions or need further assistance!

nortorious commented 1 year ago

If I replace kmeans with kmeans2, is K-means++ successfully used? Thank you so much.

jl749 commented 1 year ago

Hello @glenn-jocher,

I have had a few questions about utils/autoanchor.py that I have been keeping to myself, but now that you are here I would like to ask!


YOLOv5 treats a fitness score of 0.98 as indicating good anchors when applying the genetic algorithm. https://github.com/ultralytics/yolov5/blob/064365d8683fd002e9ad789c1e91fa3d021b44f0/utils/autoanchor.py#L49

As far as I understand, YOLOv5 calculates the initial k-means centroids based on distances between wh (box width, height) values.

This differs from traditional YOLO: https://github.com/AlexeyAB/darknet/blob/master/scripts/gen_anchors.py Is there a reason why you used Euclidean distance rather than IoU?

I ask because when I tried IoU-based k-means clustering, it produced a much higher initial fitness score (somewhere around 0.98).


Also, what is the intuition behind the fitness score? I understood it as the ratio between bounding boxes and anchors, but we only consider the lower of the width and height ratios. Shouldn't they be considered jointly? https://github.com/ultralytics/yolov5/blob/064365d8683fd002e9ad789c1e91fa3d021b44f0/utils/autoanchor.py#L39


Finally, I think minit="++" and minit="random" are different initialization methods.

"random" samples the initial centroids from a Gaussian whose mean and variance are estimated from the input data.


I hope my questions were clear. Thank you!

nortorious commented 1 year ago

thank you very much

glenn-jocher commented 1 year ago

@nortorious hi there! Thank you for your questions. I'm happy to provide some answers:

  1. The choice of using Euclidean distance over IOU in YOLOv5 for calculating initial k-means centroids is based on the fact that the goal of anchor optimization is to find clusters of similar-sized objects. Euclidean distance between the widths and heights of bounding boxes is a common method for measuring similarity in this context. While using IOU may generate higher initial fitness scores, it may not necessarily lead to better clustering for object detection.

  2. The fitness score in YOLOv5 is based on per-dimension ratios between each bounding box and each anchor. For a given box and anchor, the width ratio and the height ratio are each folded so that they are at most 1, and the smaller (worse) of the two is kept. A pair therefore scores well only when both dimensions are close, so width and height are considered jointly through the worst case rather than through their product or an IoU.

  3. Regarding the minit parameter, I apologize for the confusion in my previous reply. You are correct that minit="++" and minit="random" are different initialization methods in SciPy's kmeans2 function, and 'random' is the default. The linked YOLOv5 code calls kmeans from scipy.cluster.vq, which has no minit parameter, so K-means++ initialization is not used. If you would like to use it, you can modify the code to call kmeans2 with minit="++".

I hope this helps! If you have any further questions, feel free to ask.
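For the fitness question, a small NumPy sketch may make the metric easier to reason about. It is written from my reading of the linked utils/autoanchor.py and is not a copy of it. The names `anchor_fitness`, `k`, `wh` and the default `thr=4.0` are assumptions for illustration.

```python
import numpy as np


def anchor_fitness(k, wh, thr=4.0):
    """Ratio-based fitness of anchors k (n, 2) for box sizes wh (m, 2)."""
    r = wh[:, None] / k[None]          # (m, n, 2) width and height ratios
    x = np.minimum(r, 1 / r).min(2)    # worse of the two folded ratios per box/anchor pair
    best = x.max(1)                    # best-matching anchor for each box
    ok = best > 1 / thr                # boxes with at least one anchor within the threshold
    fitness = (best * ok).mean()
    bpr = ok.mean()                    # best possible recall
    return fitness, bpr


k = np.array([[10.0, 10.0], [100.0, 100.0]])                # two toy anchors
wh = np.array([[10.0, 10.0], [50.0, 100.0], [400.0, 10.0]])  # three toy boxes
fitness, bpr = anchor_fitness(k, wh)
print(fitness, bpr)
```

In this toy case the third box is long and thin, so no anchor is within a factor of 4 in both dimensions and it contributes 0 to the fitness.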

nortorious commented 1 year ago

(Two images attached.) I want to know which of the two methods is right. Thanks.

nortorious commented 1 year ago

ChatGPT tells me that the second one is right, but I am not sure.

glenn-jocher commented 1 year ago

@nortorious both methods shown in the images can be used for anchor box optimization in YOLOv5, and the choice between them depends on the specific requirements and characteristics of your dataset. The first method uses Euclidean distance between the widths and heights of bounding boxes to measure similarity, while the second method uses IOU.

The goal of anchor optimization is to find clusters of similar-sized objects, and different similarity metrics may yield different results. It is important to experiment and evaluate both methods with your dataset to determine which one best suits your needs and leads to better object detection performance.
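To make the difference between the two metrics concrete, here is a hedged sketch of the assignment step only. The helper names `wh_iou` and `assign` are hypothetical, and the IoU is computed on width/height pairs with the boxes aligned at a shared corner, which is a common convention for anchor clustering and an assumption here.

```python
import numpy as np


def wh_iou(wh, k):
    """IoU between boxes wh (m, 2) and anchors k (n, 2), aligned at a shared corner."""
    inter = np.minimum(wh[:, None], k[None]).prod(2)       # (m, n) overlap area
    union = wh.prod(1)[:, None] + k.prod(1)[None] - inter  # (m, n) union area
    return inter / union


def assign(wh, k, metric="euclidean"):
    """Index of the nearest anchor for each box under the chosen distance."""
    if metric == "iou":
        d = 1 - wh_iou(wh, k)                               # IoU distance
    else:
        d = np.linalg.norm(wh[:, None] - k[None], axis=2)   # Euclidean distance in wh space
    return d.argmin(1)


wh = np.array([[30.0, 30.0]])                # one toy box
k = np.array([[10.0, 10.0], [60.0, 60.0]])   # two toy anchors
print(assign(wh, k, "euclidean"), assign(wh, k, "iou"))
```

The same 30x30 box goes to the small anchor under Euclidean distance and to the large anchor under IoU distance, which is why the two clusterings can end up with different anchors.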

nortorious commented 1 year ago

(Image attached.) After I applied the modification from the second picture and trained the model, why were a series of metrics such as mAP all 0?

glenn-jocher commented 1 year ago

@nortorious anchor box optimization is an important step in YOLOv5 for improving object detection performance. The choice between using Euclidean distance or IOU as the similarity metric for clustering depends on the characteristics of your dataset. While both methods can be used, it is recommended to experiment and evaluate their performance with your specific dataset.

Regarding the issue you mentioned, if you used the modification method shown in the second picture and encountered issues such as mAP being 0 during model training, there could be various reasons for this. It is important to investigate further to identify the cause.

Possible reasons for mAP being 0 could include issues with the training data, incorrect configuration settings, or issues with the anchor box sizes. I recommend checking the following:

  1. Ensure that your training data is properly labeled and contains objects of different sizes and aspect ratios.
  2. Verify that your configuration settings, such as the number of classes and the input image size, are correctly specified.
  3. Check if the anchor boxes are properly initialized and cover the range of object sizes in your dataset.
  4. Double-check any modifications you made to the code, as they could potentially introduce errors.
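As a starting point for item 3, a quick sanity check on the anchor values can rule out the most obvious problems. This is only a sketch: `check_anchor_values` is a hypothetical helper, and it assumes anchors and box sizes are expressed in the same units (for example pixels at the training image size).

```python
import numpy as np


def check_anchor_values(anchors, wh):
    """Return a list of obvious problems with anchors, given label box sizes wh (m, 2)."""
    anchors = np.asarray(anchors, dtype=float).reshape(-1, 2)
    wh = np.asarray(wh, dtype=float)
    problems = []
    if not np.isfinite(anchors).all():
        problems.append("anchors contain NaN or inf")
    elif (anchors <= 0).any():
        problems.append("anchors contain non-positive sizes")
    elif anchors.max() < wh.min() or anchors.min() > wh.max():
        # Typical symptom of mixing normalized and pixel units
        problems.append("anchor sizes lie entirely outside the range of box sizes")
    return problems


wh = np.array([[12.0, 15.0], [40.0, 50.0]])  # toy box sizes in pixels
print(check_anchor_values([[10, 13], [30, 61]], wh))
print(check_anchor_values([[0.01, 0.02]], wh))
```

An empty list does not mean the anchors are good, only that none of these basic failure modes was found.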

Additionally, monitoring the training logs and any error messages generated during training can provide further insights into the issue. Analyzing the training data, reviewing the loss curves, and experimenting with different parameters can also help in troubleshooting the problem.

I hope this gives you some guidance in addressing the issue. If you have any further questions or need additional assistance, feel free to ask.