Closed: @stephanecharette closed this issue 3 years ago.
@stephanecharette Hi,
I know the anchor-calculating code has a bit of randomness in it, so every time I run it I get slightly different results.
Yes, there is random initialization in the k-means++ approach https://en.wikipedia.org/wiki/K-means%2B%2B
When you run the command
darknet detector calc_anchors DarkPlate.data -num_of_clusters 6 -width 416 -height 416
you see the calculated anchors and their average IoU, so you can run it several times and choose the anchors with the highest IoU.
Try to use this command with the flag -show (you should do it on an OS with a GUI: Windows / Linux + Gnome/KDE/...):
darknet detector calc_anchors DarkPlate.data -num_of_clusters 6 -width 416 -height 416 -show
You will see the point cloud, where each pixel is the relative size of an object in the training dataset (the x,y-coordinates of a pixel in the cloud correspond to the w,h size of an object in the training dataset), and you will see the anchors; they look like bounding boxes with their top-left corner at (0,0).
Check that the anchors cover most of the points evenly. If not, then:
* try to add some additional anchors manually, or
* try to use more anchors, e.g. 9: -num_of_clusters 9
Are the anchors zero-based, or one-based? I assume zero-based
Yes, masks of anchors are 0-based.
So try
[yolo]
mask = 0,1,2
instead of
[yolo]
mask = 1,2,3
It was a mistake in the yolov3-tiny and yolov4-tiny versions, which we shouldn't fix in the default models because we have many pre-trained models with these masks.
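For a custom-trained yolov4-tiny (where the masks may be changed), the corrected pair of [yolo] sections would look roughly like this sketch with the default anchors; in yolov4-tiny the first [yolo] in the file is the low-resolution head:
[yolo]
mask = 3,4,5    # low-resolution head: the big anchors 81,82, 135,169, 344,319
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
...
[yolo]
mask = 0,1,2    # high-resolution head: the small anchors (instead of the default 1,2,3)
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319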
And just as importantly, how do we reconcile this statement:
so for YOLOv4 the 1st-[yolo]-layer has anchors smaller than 30x30, 2nd smaller than 60x60, 3rd remaining, and vice versa for YOLOv3.
From what I can see, the 1st [YOLO] section has anchors 81,82, 135,169, and 344,319, all of which are larger than 30x30, not smaller.
And even in the 2nd [YOLO] section, only the very first anchor of 23,27 would be smaller than 30x30, so I'm very confused.
Actually you should use:
* small anchors for the [yolo] layer with high resolution
* big anchors for the [yolo] layer with low resolution
The order of the [yolo] layers is different in different models.
It is not related to anchors, but it can slightly improve accuracy if you use pre-trained weights for training.
You can try to add the line stopbackward=800
to https://github.com/AlexeyAB/darknet/blob/005513a9db14878579adfbb61083962c99bb0a89/cfg/yolov4-tiny.cfg#L198
It will freeze all layers before this one for the first 800 iterations, so the randomly initialized layers will not produce random gradients and will not destroy the information in the pre-trained weights.
After 800 iterations, these layers will already be trained and will no longer contain random weights.
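A minimal sketch of that edit (only the last line is added; the rest of the section is whatever the linked [convolutional] layer already contains):
[convolutional]
...                 # existing parameters of this layer, unchanged
stopbackward=800    # freeze all preceding layers for the first 800 iterations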
Or for very large models you can try to train yolov4-p5-frozen.cfg
with stopbackward=1
https://github.com/AlexeyAB/darknet/blob/005513a9db14878579adfbb61083962c99bb0a89/cfg/yolov4-p5-frozen.cfg#L1698
with pre-trained weights https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-p5.conv.232
In this case stopbackward=1
will freeze all previous layers throughout the whole training, so training will be 2x-3x faster and will consume less memory, so you can fit a larger mini-batch (lower subdivisions= in cfg) or a higher resolution on the GPU.
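Training would then be started as usual; a sketch of the command, where obj.data stands in for your own data file:
darknet detector train obj.data yolov4-p5-frozen.cfg yolov4-p5.conv.232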
More details about large models: https://github.com/AlexeyAB/darknet/issues/7838#issue-930834186
@AlexeyAB - thanks for posting these details. I too am trying to customize anchors for a custom yolov4-tiny model.
I'm still a little confused... what should @stephanecharette set the masks to for lines 227 and 278 above?
@AlexeyAB Sorry, but you have not answered the most critical (and mysterious) part of Stephane's question:
And just as importantly, how do we reconcile this statement:
so for YOLOv4 the 1st-[yolo]-layer has anchors smaller than 30x30, 2nd smaller than 60x60, 3rd remaining, and vice versa for YOLOv3.
From what I can see, the 1st [YOLO] section has anchors 81,82, 135,169, and 344,319, all of which are larger than 30x30, not smaller. And even in the 2nd [YOLO] section, only the very first anchor of 23,27 would be smaller than 30x30, so I'm very confused.
Yolo doesn't respect its own rules?
Another question, does changing the network size affect the "theoretical" 30x30 and 60x60 limits?
@arnaud-nt2i
Yolo doesn't respect its own rules?
There is a detailed answer to this question: https://github.com/AlexeyAB/darknet/issues/7856#issuecomment-874147909
Actually you should use:
* small anchors for the [yolo] layer with high resolution
* big anchors for the [yolo] layer with low resolution
Another question, does changing the network size affect the "theoretical" 30x30 and 60x60 limits?
In general yes.
But you should also pay attention to the rewritten_box values during training: if it is higher than 5%, then try to move more anchors (actually, move masks) from the [yolo] layer with low resolution to the [yolo] layer with high resolution.
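A hedged sketch of such a move, assuming a model whose low-resolution head carries mask = 3,4,5 and whose high-resolution head carries mask = 0,1,2; remember that filters= in the [convolutional] layer just before each [yolo] must equal (classes + 5) * the number of masks in that head:
# low-resolution [yolo] head gives up anchor index 3:
mask = 4,5        # was 3,4,5
# high-resolution [yolo] head takes anchor index 3:
mask = 0,1,2,3    # was 0,1,2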
Another question, does changing the network size affect the "theoretical" 30x30 and 60x60 limits?
Try to keep this rule. If you change the network size, then recalculate the anchors for the new network size.
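For example, if you retrain at 512x512 instead of 416x416 (sizes here are illustrative), re-run:
darknet detector calc_anchors DarkPlate.data -num_of_clusters 6 -width 512 -height 512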
Ok, thank you for those explanations. It's the first time I have read about the rewritten_box values in relation to anchors; I have read all the anchor issues since 2018 but never seen as clear an explanation. I will try that and change the mask numbers and filters if needed. One more thing: in all my tries, with small and big datasets (up to 300,000 pics):
1) SGDR works (way) better than steps,
2) batch_normalize=2 is (a little bit) better than 1 (for minibatch from 2 to 5),
3) I never had problems with dynamic_minibatch=2 and a 0.9 factor instead of 0.8.
@arnaud-nt2i
So is this combination the best for your dataset: [convolutional] batch_normalize=2 + [net] dynamic_minibatch=1 policy=sgdr? Is your dataset indoor/outdoor/..., urban/agronomic/biology...?
Did you try [net] letter_box=1 and/or [net] ema_alpha=0.9998?
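Both are [net]-section options; a minimal sketch of where they would go, with the values from the question above:
[net]
letter_box=1        # keep the image aspect ratio when resizing to the network size
ema_alpha=0.9998    # exponential moving average over the weights (EMA, explained below)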
And did you try the new cfg files / pre-trained weights: yolov4-csp-x-swish.cfg, yolov4-p5.cfg, yolov4-p6.cfg?
https://github.com/AlexeyAB/darknet#pre-trained-models
I never had problems with dynamic_minibatch=2 and 0.9 factor instead of 0.8.
Does it solve Out of memory issue, or does it increase accuracy?
That is, int new_dim_b = (int)(dim_b * 0.9);
instead of the 0.8 factor at https://github.com/AlexeyAB/darknet/blob/d669680879f72e58a5bc4d8de98c2e3c0aab0b62/src/detector.c#L216 ?
@AlexeyAB
So is this combination the best [convolutional] batch_normalize=2 + [net] dynamic_minibatch=1 policy=sgdr for your dataset?
Yes, for small agro/bio and big outdoor datasets, with swish and sgdr_cycle = the number of iterations in 1 epoch, with a cycle factor of 2.
Haven't tried letter_box, because I compute the mean aspect ratio of the pics and set the network size to the same ratio (e.g. 704x544).
Don't know about ema_alpha=0.9998 ... what is that?
int new_dim_b = (int)(dim_b * 0.9)
allows a higher minibatch and faster/more accurate learning. I never had Out of memory with this (and I am always maximizing VRAM usage, playing with the resolution and the random coefficient) on my 3090 and 1660 Ti.
(Haven't tried the new optimized_memory either, because I am afraid of the memory usage peak when launching the network, as used to be the case with optimized_memory=1.)
I tried yolov4 CSP/scaled in December/January but it was not yet OK... Now I am desperately waiting for OpenCV DNN to support it... But I am more interested in AP50 (the number of detected objects) than AP (bbox coordinates), and I need good accuracy for small objects, so the new yolov4 (CSP, scaled) might be worse rather than better for me...
@arnaud-nt2i
Don't know about ema_alpha=0.9998 ... what is that ?
EMA is a custom version of SWA https://pytorch.org/blog/pytorch-1.6-now-includes-stochastic-weight-averaging/
Regardless of the procedure you use to train your neural network, you can likely achieve significantly better generalization at virtually no additional cost with a simple new technique now natively supported in PyTorch 1.6, Stochastic Weight Averaging (SWA) ... Averaged SGD is often used in conjunction with a decaying learning rate, and an exponential moving average (EMA), typically for convex optimization. In convex optimization, the focus has been on improved rates of convergence.
You can try to train this model: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4-csp-x-swish.cfg with this pre-trained weights https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-csp-x-swish.conv.192 from https://github.com/AlexeyAB/darknet#pre-trained-models
Ok, will try ema_alpha=0.9998 in some of my next trainings...
But as for the optimizer, RAdam + Lookahead = Ranger seems to give a higher gain (and, as importantly, less sensitivity to the initial lr): https://lessw.medium.com/new-deep-learning-optimizer-ranger-synergistic-combination-of-radam-lookahead-for-the-best-of-2dc83f79a48d
But it is already on the todo list ^^
@arnaud-nt2i For most of my experiments and other papers:
Ok, that seems fair; nothing replaces good old long trainings!
Hi @arnaud-nt2i
- batch_normalize=2 better (little bit) than 1 (for minibatch from 2 to 5)
Shouldn't batch_normalize be either 0 or 1?
@cpsu00 batch_normalize=1 is the default param for yolov4; batch_normalize=0 means no normalization (not really good); batch_normalize=2 is cross batch norm, which allows the use of a lower minibatch_size (an upgrade of batch_normalize=1).
see batchnorm_layer.c
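As a sketch, the flag goes into each [convolutional] section; the other parameters here are illustrative:
[convolutional]
batch_normalize=2   # 0 = no normalization, 1 = standard batch norm, 2 = cross batch norm
filters=32
size=3
stride=1
pad=1
activation=leaky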
Oh, I didn't notice that. Thanks!
So try [yolo] mask = 0,1,2 instead of [yolo] mask = 1,2,3 ... The order of the [yolo] layers is different in different models.
Does that mean the masks won't change if I use a pre-trained model? If I change them to [3,4,5], [0,1,2], I can't use the pre-trained model, right?
Of course, do not change the masks or anchors on pre-trained models! This only works if you are training your own custom network.
ok thank you !
How should I change the .cfg file to increase the number of anchor clusters? For YOLOv4-tiny the default value is 6, and I want to try a higher number to test mAP. Any guidance will be appreciated.
@stephanecharette @AlexeyAB I have trained two YOLOv4 models, one using resolution 416x416 and the other 512x512. However, the model with 512x512 has a lower mAP than the 416x416 one. This confuses me; shouldn't it be the opposite? The input images were all the same size, 1008x1008. Any help will be appreciated.
Anchors generated at 416: 10, 10, 18, 8, 8, 18, 12, 12, 14, 14, 16, 15, 18, 18, 21, 21, 25, 25 ... 90.52% IoU
How should I change the .cfg file to increase the number of anchor clusters? For YOLOv4-tiny the default value is 6, and I want to try a higher number to test mAP.
Hi @MrGolden1, if you want to increase the number of anchor boxes you should also increase the number of YOLO detection heads; see yolov4-tiny-3l.cfg.
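As a sketch, assuming 9 anchors recalculated with -num_of_clusters 9: yolov4-tiny-3l.cfg has three [yolo] heads, so the masks could be split as below, small anchors to the high-resolution head (the file order of the [yolo] layers depends on the model):
mask = 6,7,8    # low-resolution head: biggest anchors
mask = 3,4,5    # middle head
mask = 0,1,2    # high-resolution head: smallest anchors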
@stephanecharette @AlexeyAB The other question I have: what if I forget the resolution I used for training but still have the weights file? Is there any way to recover the training resolution from the weights file?
The image size is irrelevant and ignored by Darknet. The only size that matters is the width and height in the cfg. Your images could be 999999x999999 and Darknet will still resize the images to match the network dimensions.
Thanks for your reply @stephanecharette. OK, let's forget about image size. Whatever image size I provide, shouldn't my model trained with network resolution (width and height in the cfg) 512x512 give better results than 416x416? In my case the 512x512 .cfg gives much lower mAP than the 416x416. I am very confused.
I think you're right. Recalculating the anchors resulted in a decrease in the overall accuracy of the model. My experiments are consistent with yours: the default anchors work best. Although I do not know why this is the case, my analysis suggests that the anchors from k-means++ clustering do not cover all scales, most likely because the target boxes in the dataset are concentrated at one scale.
@AlexeyAB I know about this line in the readme:
I've avoided the topic of re-calculating anchors for the past few years. But people ask on the Discord server, and truth is, I'd like to know how to do it as well! :) Every time I try to do it, the results are worse than the default anchors, so I assume that I'm doing it wrong and I'm not enough of an expert.
Say we use this license plate project as an example: https://github.com/stephanecharette/DarkPlate
The default anchors in YOLOv4-tiny are:
I know the anchor-calculating code has a bit of randomness in it, so every time I run it I get slightly different results. For my 416x416 YOLOv4-tiny config file, I run this command:
The results I get look like one of these lines:
First thing I do is pick one of the anchor lines I list above. (They're all very similar, off by just a few pixels.)
Let's say we use this line for our example:
anchors = 12, 19, 24, 37, 41, 75, 95, 38, 148, 85, 245, 151
Then I look for each [yolo] section in the .cfg file and replace the anchors = ... line with the one we selected above.
Lastly come these instructions:
Considering the default anchors are anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319, the default yolov4-tiny.cfg has 2 YOLO sections with these masks and anchors. Lines 226-228:
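[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319    # reconstructed here from the default yolov4-tiny.cfg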
And lines 277-279:
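[yolo]
mask = 1,2,3
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319    # likewise reconstructed from the default yolov4-tiny.cfg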
Are the anchors zero-based, or one-based? I assume zero-based, so line 227 (mask = 3,4,5) refers to the anchors 81,82, 135,169, and 344,319.
And line 278 (mask = 1,2,3) refers to 23,27, 37,58, and 81,82.
Is it intentional that mask index #3 (81,82) is referenced in both YOLO sections, or is that a typo? Should the masks be 1,2,3 and 4,5,6, or 0,1,2 and 3,4,5?
And just as importantly, how do we reconcile this statement:
so for YOLOv4 the 1st-[yolo]-layer has anchors smaller than 30x30, 2nd smaller than 60x60, 3rd remaining, and vice versa for YOLOv3.
From what I can see, the 1st [YOLO] section has anchors 81,82, 135,169, and 344,319, all of which are larger than 30x30, not smaller.
And even in the 2nd [YOLO] section, only the very first anchor of 23,27 would be smaller than 30x30, so I'm very confused.
But even without understanding all of this, I went ahead and trained 2 networks, one with the default anchors and the other with some new custom anchors. I did not change the mask, only the anchors = ... line. (See attached .cfg file.)
This is the chart.png when I train with the default YOLOv4-tiny anchors:
And this is the chart.png file when I use the custom anchors:
Can you help clear up the confusion and various questions?
DarkPlate.cfg.txt