AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Gaussian YOLOv3 (+3.1% mAP@0.5...0.95 on COCO) , (+3.0% mAP@0.7 on KITTI) , (+3.5% mAP@0.75 on BDD) #4147

Closed phucnhs closed 3 years ago

phucnhs commented 4 years ago

Have you tried the Gaussian object detection method?

AlexeyAB commented 4 years ago

What do you mean? Can you provide URL to article/paper?

phucnhs commented 4 years ago

This is a new object detection method. Here is the paper: https://arxiv.org/pdf/1904.04620.pdf And here is the code; you can refer to it, though I am not confident about this code: https://github.com/jwchoi384/Gaussian_YOLOv3

lsd1994 commented 4 years ago

Furthermore, on the COCO dataset [14], the AP of Gaussian YOLOv3 is 36.1, which is 3.1 higher than YOLOv3. In particular, the AP75 (i.e., strict metric) of Gaussian YOLOv3 is 39.0, which is 4.6 higher than that of YOLOv3.

It seems Gaussian YOLOv3 is better than YOLOv3 on COCO dataset, especially in strict metric.

AlexeyAB commented 4 years ago

So, related:


https://arxiv.org/pdf/1902.09630v2.pdf

GIoU: https://github.com/AlexeyAB/darknet/issues/3249

sctrueew commented 4 years ago

@AlexeyAB Hi,

Have you tried to combine Gaussian and GIoU? If yes, Could you share the result?

Thanks

tuteming commented 4 years ago

Where can I get the Gaussian YOLOv3 config file?

AlexeyAB commented 4 years ago

@tuteming

cfg files: https://github.com/jwchoi384/Gaussian_YOLOv3/tree/master/cfg BDD100k weights file: https://drive.google.com/open?id=1Eutnens-3z6o4LYe0PZXJ1VYNwcZ6-2Y

AlexeyAB commented 4 years ago

I added yolo_v3_tiny_pan3 matrix_gaussian aa_ae_mixup.cfg.txt model.

More: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968


| Model (cfg & weights), network size = 544x544 | Training chart | Validation video | BFlops | Inference time (RTX 2070) | mAP, % |
|---|---|---|---|---|---|
| yolo_v3_tiny_pan3 matrix_gaussian aa_ae_mixup.cfg.txt and weights-file | chart | video | 13 | 19.0 ms | 57.2 |
Kyuuki93 commented 4 years ago

@AlexeyAB Hi, I trained yolov3-spp.cfg and yolov3-spp-gs.cfg on my custom data; yolov3-spp-gs.cfg just replaces [yolo] with [Gaussian_yolo]. My data is a one-class dataset with 130k images, but for some reason there is no validation data, so I can't report the mAP value. Some observations:

  1. Gaussian yolo produces a higher avg loss during training,
  2. the network with the Gaussian yolo layer gives lower object confidences,
  3. the network with the Gaussian yolo layer reduces wrong detections but aggravates the blinking issue in video.

I think that at test time the * (1.0 - uc_aver) factor in line 454 of gaussian_yolo_layer.c may not be necessary.

Here are some video results (extraction code: xqxu), all at -thresh 0.5: spp.mp4 is the result of yolov3-spp.cfg, spp-gs.mp4 is the result of yolov3-spp-gs.cfg, and spp-gs2.mp4 is the result of yolov3-spp-gs.cfg with * (1.0 - uc_aver) removed from gaussian_yolo_layer.c.

AlexeyAB commented 4 years ago

My data is a one-class dataset with 130k images, but for some reason there is no validation data, so I can't report the mAP value.

Just set valid=train.txt in obj.data file
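For example, a minimal obj.data along those lines might look like this (a hypothetical one-class setup; all paths are placeholders):

```
classes = 1
train = data/train.txt
valid = data/train.txt
names = data/obj.names
backup = backup/
```

With valid pointing at train.txt, `darknet detector map` evaluates on the training images, which at least gives a rough mAP number.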

Gaussian yolo got high avgloss in the train, Network with Gaussian yolo layer get lower confidence of object, Network with Gaussian yolo layer decrease the wrong detected boxes but aggravated blinking issue in video

You should use a lower threshold, -thresh 0.2 or 0.15, for [Gaussian_yolo] instead of the default 0.24.

I think that at test time the * (1.0 - uc_aver) factor in line 454 of gaussian_yolo_layer.c may not be necessary.

It is necessary if you want to have high AP@0.5...0.95
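The factor under discussion can be illustrated with a small sketch (illustrative names, not darknet's actual symbols): Gaussian YOLOv3 predicts a variance for each of x, y, w, h, and the averaged uncertainty down-weights the detection score.

```python
# Sketch of how Gaussian YOLOv3 folds localization uncertainty into the
# detection score (the idea behind line 454 of gaussian_yolo_layer.c;
# function and variable names here are illustrative).

def detection_score(objectness, class_prob, sigmas):
    """sigmas: predicted std-devs for (x, y, w, h), each in [0, 1]."""
    uc_aver = sum(sigmas) / len(sigmas)      # average localization uncertainty
    return objectness * class_prob * (1.0 - uc_aver)

# A confident, well-localized box keeps most of its score...
print(detection_score(0.9, 0.8, [0.05, 0.05, 0.1, 0.1]))  # 0.9*0.8*0.925 = 0.666
# ...while an uncertain box is strongly penalized, which also explains the
# lower confidences observed above.
print(detection_score(0.9, 0.8, [0.5, 0.5, 0.6, 0.6]))    # 0.9*0.8*0.45 = 0.324
```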

AlexeyAB commented 4 years ago

@zpmmehrdad

I added GIoU to the [Gaussian_yolo] layer.

So now you can use for training:

[Gaussian_yolo]
mask = 0,1,2
anchors = 7,10, 14,24, 27,43, 32,97, 57,64, 92,109, 73,175, 141,178, 144,291
classes=10
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
iou_thresh=0.213
uc_normalizer=1.0
cls_normalizer=1.0
iou_normalizer=0.5
iou_loss=giou
scale_x_y=1.1
random=1
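As a side note, scale_x_y stretches the sigmoid-decoded box center around the cell middle so centers can actually reach the cell borders (the grid-sensitivity fix). A hedged sketch of the decoding, with illustrative names:

```python
import math

# Illustrative sketch of what scale_x_y does in a yolo layer: the raw sigmoid
# output is stretched around 0.5 so box centers can reach the cell edges.

def decode_center(t, scale_x_y):
    s = 1.0 / (1.0 + math.exp(-t))                  # plain sigmoid
    return scale_x_y * s - 0.5 * (scale_x_y - 1.0)  # stretched around 0.5

# With scale_x_y=1.0 even a huge logit cannot quite reach the cell edge (1.0):
print(decode_center(10.0, 1.0))   # ~0.99995
# With scale_x_y=1.1 the same logit slightly overshoots, so edges are reachable:
print(decode_center(10.0, 1.1))   # ~1.0499
```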
yrc08 commented 4 years ago

@AlexeyAB Hi! I want to train with Gaussian_yolov3_BDD.cfg: should I use darknet53.conv.74 or Gaussian_yolov3_BDD.weights?

AlexeyAB commented 4 years ago

@yrc08 Use darknet53.conv.74 for training and Gaussian_yolov3_BDD.weights for detection.

yrc08 commented 4 years ago

@AlexeyAB Hi! In other words, when I train my own dataset I should use darknet53.conv.74. Is my understanding correct?

LukeAI commented 4 years ago

Just to share, I found the gaussian yolo gave me a small boost in mAP, around that claimed in the paper (which is unusual in my experience!)

Gaussian YOLO gauss

Yolov3_spp_swish_scale yolo_v3_spp_scale_swish

I'm currently trying out Gaussian_yolo_spp_swish...

AlexeyAB commented 4 years ago

I'm currently trying out Gaussian_yolo_spp_swish...

Also try adding these params to each [Gaussian_yolo] layer and training. It requires the latest version of Darknet:

iou_thresh=0.3
scale_x_y=1.1

So compare Gaussian_yolo_spp_swish... vs Gaussian_yolo_spp_swish_scale_iou_thresh...

LukeAI commented 4 years ago

I'm currently trying out Gaussian_yolo_spp_swish...

Also try adding these params to each [Gaussian_yolo] layer and training. It requires the latest version of Darknet:

iou_thresh=0.3
scale_x_y=1.1

So compare Gaussian_yolo_spp_swish... vs Gaussian_yolo_spp_swish_scale_iou_thresh...

So I should add the same values to every [Gaussian_yolo] layer? As opposed to scale_x_y=1.05, 1.1, 1.2 in the different layers?

LukeAI commented 4 years ago

what is the iou_thresh?

AlexeyAB commented 4 years ago

As opposed to scale_x_y=1.05 , 1.1, 1.2 in the different layers?

Or you can add different values for different yolo-layers: scale_x_y=1.05 (for 17x17) , 1.1 (34x34) 1.2 (68x68).

what is the iou_thresh?

Read about iou_thresh=0.213:

Our loss initially only paired the best anchor (out of all 9) to each target. We changed this to pair a target to any anchor above iou_thres = 0.213 (all paired anchors generate losses). So for example all 9 anchors might be paired or none at all. This change boosted mAP by about 0.07 to 0.47.

https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-553159098
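The anchor-matching change quoted above can be sketched roughly like this (illustrative helper names; darknet's real implementation differs):

```python
# Sketch of the iou_thresh idea: instead of pairing each target only with its
# single best anchor, pair it with every anchor whose IoU with the target box
# exceeds the threshold (0.213 in the quote above).

def iou_wh(a, b):
    """IoU of two boxes given as (w, h), compared as if co-centered."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union

def matched_anchors(target_wh, anchors, iou_thresh=0.213):
    best = max(range(len(anchors)), key=lambda i: iou_wh(target_wh, anchors[i]))
    extra = [i for i in range(len(anchors))
             if iou_wh(target_wh, anchors[i]) > iou_thresh]
    return sorted({best} | set(extra))  # the best anchor is always included

anchors = [(7, 10), (14, 24), (27, 43), (32, 97), (57, 64),
           (92, 109), (73, 175), (141, 178), (144, 291)]
print(matched_anchors((30, 45), anchors))  # several mid-size anchors match
```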

LukeAI commented 4 years ago

Or you can add different values for different yolo-layers: scale_x_y=1.05 (for 17x17) , 1.1 (34x34) 1.2 (68x68).

Not sure what you mean by 17x17 etc. yolo layers - is that the scale on which they detect? Is 17x17 the first one in the cfg?

Read about iou_thresh=0.213: So should I set iou_thresh=0.213 or 0.3? Which one is more promising IYO?

AlexeyAB commented 4 years ago

Not sure what you mean by 17x17

Yolo layer size.

Use scale_x_y=1.05 , 1.1, 1.2 in different layers as usual.

So should I set iou_thresh=0.213 or 0.3? Which one is more promising IYO?

I don't know.

LukeAI commented 4 years ago

OK - should I be using GIoU loss for iou_thresh to be relevant, or is it independent of the loss function?

AlexeyAB commented 4 years ago

independent

LukeAI commented 4 years ago

OK, yolo_gauss_spp_swish is looking strong. I've ended the training early to add iou_thresh and scale_x_y and start again. Will report back shortly... gauss

LukeAI commented 4 years ago

Here is yolo_gaussian_spp_swish_scale_iou.cfg. I went for iou_thresh=0.25.

The addition of the IOU seemed to help it converge faster but not really clear if it ultimately led to higher mAP. Need to finish the above experiment.

I notice that the mAP doesn't stabilise until the learning rate steps down, around iterations 40000 and 45000. I'm not really sure how to interpret this; do you think it might be better to step the learning rate down much earlier, maybe at 10000 and 15000?

yolo_gaussian_spp_swish_iou_scale.cfg.txt gauss

AlexeyAB commented 4 years ago

I notice that the mAP doesn't stabilise until the learning rate steps down - around 40000 and 45000 - I'm not really sure how to interpret this - do you think that it might be better to step down the learning rate much earlier - maybe at 10,000 and 15000 or something?

How many images do you have in training dataset? If you have only 20 000 images then set max_batches=20000 and steps at 15000, 18000

The addition of the IOU seemed to help it converge faster but not really clear if it ultimately led to higher mAP. Need to finish the above experiment.

Yes, maybe it only helps fast convergence rather than final mAP. Try Gaussian_yolo + swish without iou_thresh, training for as long as the mAP keeps increasing.

Also try Gaussian_yolo + CIOU + swish

iou_normalizer=0.25
cls_normalizer=1.0
iou_loss=ciou
AlexeyAB commented 4 years ago

It is interesting whether Gaussian_yolo + CIoU is better than Gaussian_yolo.

LukeAI commented 4 years ago

My training set is only 14000 images. I've edited the .cfg for steps at 18000, 19000 and max_batches=22000 and am going to see how that works out. I spliced the two graphs of the training without iou_thresh together above. So yes, in this case iou_thresh=0.25 led to faster convergence but not higher final mAP, though my steps and max_batches weren't that sensibly configured. I'll try CIoU shortly and report back. [EDIT] OK, I can confirm that sensible batches and steps params gave me convergence to a higher mAP of 74%.
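The max_batches/steps rule of thumb from the comments above can be sketched as follows (the 75%/90% split matches the 15000/18000-of-20000 example; the 6000-iteration floor is an assumption based on common darknet guidance, not a value from this thread):

```python
# Hedged sketch: scale max_batches roughly to the dataset size (with a floor)
# and put the two learning-rate steps late in training.

def lr_schedule(num_images, min_batches=6000):
    max_batches = max(num_images, min_batches)
    steps = (int(max_batches * 0.75), int(max_batches * 0.9))
    return max_batches, steps

print(lr_schedule(20000))  # (20000, (15000, 18000))
```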

LukeAI commented 4 years ago

Setting CIOU hurt mAP substantially in my case:

iou_loss=ciou gauss_ciou

Without CIOU gauss_psa

But I didn't set iou_normalizer=0.25 and cls_normalizer=1.0. Is that needed for CIoU?

AlexeyAB commented 4 years ago

@LukeAI

Did you test [Gaussian_yolo] + CIoU or [yolo]+CIoU ?

Yes, they use https://github.com/Zzh-tju/DIoU-darknet/blob/master/cfg/coco-ciou.cfg#L811-L812

iou_normalizer=0.5
cls_normalizer=1.0

[EDIT] OK yeah I can confirm that using sensible batches and steps params gave me convergence to a higher mAP of 74%

What do you mean?

glenn-jocher commented 4 years ago

@LukeAI normalizers should not have much effect. As long as you use the same values between both comparisons then it should be a valid (apples to apples) comparison.

In my study I found CIoU and DIoU produced about the same results as GIoU. Actually I think I saw that CIoU/DIoU may produce slightly better results in the very early epochs (or on smaller datasets for more epochs), but on larger datasets with lots of iterations (COCO) it did not help at all.

LukeAI commented 4 years ago

This is the result for CIoU with iou_normalizer=0.25 and cls_normalizer=1.0: a loss of about 1 mAP compared to without CIoU or the normalizers. gauss_ciou

AlexeyAB commented 4 years ago

@LukeAI @glenn-jocher

It seems that it is necessary to reduce iou_normalizer= from 1.0 to 0.5 - 0.1 when GIoU, DIoU or CIoU is used.

LukeAI commented 4 years ago

OK, using iou_normalizer=0.25 and cls_normalizer=1.0 with gaussian_spp_swish (without iou_thresh) and with the default loss function has given me the best result so far; the normalizers added +1 mAP.

yolo_gaussian_spp_swish_normalisers.cfg.txt

gauss_spp_swish_normalisers

AlexeyAB commented 4 years ago

@LukeAI

iou_normalizer= will affect only for losses C/D/GIoU: https://github.com/AlexeyAB/darknet/blob/31d483a2950a4d747b31f033e5725bb8de50ff1e/src/gaussian_yolo_layer.c#L243-L286
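In other words, at that time the normalizer only took effect on the C/D/GIoU code path. A schematic sketch of that control flow (purely illustrative, not the actual darknet C code):

```python
# Sketch: iou_normalizer scales the box loss only when a C/D/GIoU loss is
# selected; the default MSE path ignores it.

def box_loss(iou_loss_value, mse_loss_value, iou_loss="mse", iou_normalizer=1.0):
    if iou_loss in ("giou", "diou", "ciou"):
        return iou_normalizer * iou_loss_value   # normalizer applied
    return mse_loss_value                        # MSE path: normalizer unused

print(box_loss(2.0, 2.0, iou_loss="giou", iou_normalizer=0.25))  # 0.5
print(box_loss(2.0, 2.0, iou_loss="mse",  iou_normalizer=0.25))  # 2.0
```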

LukeAI commented 4 years ago

oh! so my improved mAP was just from cls_normalizer=1.0 ? Or... just random?

AlexeyAB commented 4 years ago

@LukeAI Just random.

cls_normalizer=1.0 is by default.

anamozov commented 4 years ago

I found that original yolov3 and Gaussian yolo gave me almost the same mAP after around 35000 iterations on a 54-class problem. I couldn't wait for training to finish due to time constraints. (training charts attached: Gaussian_yolo, yolov3)

nyj-ocean commented 4 years ago

@AlexeyAB I used my own dataset to train several models from this repo for 70k steps. My dataset: 1500 images for the training set and another 1500 images for the val set, which I use to calculate mAP.

| Model | best mAP | Recall |
|---|---|---|
| yolov3 | 85.63 | 77 |
| yolov3+Gaussian | 85.97 | 80 |
| yolov3+CIoU | 85.29 | 75 |
| yolov3+Gaussian+CIoU | 86.4 | 81 |

yolov3+Gaussian+CIoU gets a 0.77% improvement in mAP and a 4% improvement in Recall compared to yolov3. I want to know whether these improvements come from the added Gaussian+CIoU modules or are just random.

AlexeyAB commented 4 years ago

@nyj-ocean

I want to know whether these improvements in mAP and Recall of yolov3+Gaussian+CIoU are just random or come from the added Gaussian+CIoU modules?

Just train yolov3 and yolov3+Gaussian+CIoU again.

Also show chart.png of both models.

AlexeyAB commented 4 years ago

@LukeAI

iou_normalizer= will affect only for losses C/D/GIoU:

I fixed it.

Now iou_normalizer=0.1 and uc_normalizer=0.1 in [Gaussian_yolo] affect even without C/D/GIoU.

AlexeyAB commented 4 years ago

@WongKinYiu

It seems that we should use:

  1. iou_normalizer=1 for [yolo]
  2. iou_normalizer=0.01 for [yolo] + C/D/GIoU (maybe 0.07)
  3. iou_normalizer=0.1 and uc_normalizer=0.1 for [Gaussian_yolo] (maybe 0.3)
  4. iou_normalizer=0.01 and uc_normalizer=0.01 for [Gaussian_yolo] + C/D/GIoU (maybe 0.07)

I won't be surprised if the main effect of improving AP75 (and decreasing AP50) comes not from the GIoU algorithm itself, but simply from higher values of the iou loss, i.e. the same effect we could achieve using the default [yolo] with the default MSE loss but with iou_normalizer=10 or 100.


  1. [yolo] has class_loss and iou_loss ~= 10

(loss chart screenshot)

  2. [yolo]+GIoU has class_loss and iou_loss ~= 1000

(loss chart screenshot)

  3. [Gaussian_yolo] has high iou_loss and uc_loss ~= 100

(loss chart screenshot)

  4. [Gaussian_yolo] + GIoU has much higher iou_loss and uc_loss ~= 1000

(loss chart screenshot)
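The effect of these magnitudes can be illustrated with simple arithmetic (illustrative numbers only): multiplying the box-loss term by a larger normalizer shifts the loss balance toward localization, which is the effect attributed to C/D/GIoU here.

```python
# Sketch: a larger iou normalizer makes the box term dominate the total loss,
# mimicking the larger raw magnitudes of the GIoU-based losses listed above.

def total_loss(class_loss, box_loss, iou_normalizer=1.0):
    return class_loss + iou_normalizer * box_loss

# Default [yolo] with MSE: box term comparable to the class term.
print(total_loss(class_loss=10.0, box_loss=10.0))                      # 20.0
# The same MSE loss with iou_normalizer=100 dominates, like [yolo]+GIoU's ~1000.
print(total_loss(class_loss=10.0, box_loss=10.0, iou_normalizer=100))  # 1010.0
```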


HagegeR commented 4 years ago

So just multiplying the error is what makes the network learn faster, but I think the way the loss is calculated results in better AP75.

AlexeyAB commented 4 years ago

@HagegeR
A higher iou_loss shifts the priorities to more accurate IoU (coordinates and sizes) of the object (increases AP75), but to a less accurate prediction of the class_id (decreases AP50).

C/D/GIoU increases AP75, but decreases AP50.

glenn-jocher commented 4 years ago

Yes, normalizers have a tremendous effect on performance. In our repo we've spent thousands of GPU hours evolving these normalizers on COCO along with all of our other hyperparameters. See https://github.com/ultralytics/yolov3/issues/392

Our 3 main balancing hyperparameters are here, though they are changing every few weeks as new evolution results come in. They must change depending on image size, as well as class count, and occupancy (how many objects per image).

hyp = {'giou': 3.31,  # giou loss gain
       'cls': 42.4,   # cls loss gain
       'obj': 52.0}   # obj loss gain (*= img_size/416 if img_size != 416)
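A sketch of how such gains are applied (the img_size scaling rule for the obj gain is quoted from the comment above; the combination itself is illustrative, not ultralytics' exact code):

```python
# Sketch: hyperparameter gains multiply the raw loss components before summing.
hyp = {'giou': 3.31, 'cls': 42.4, 'obj': 52.0}

def combined_loss(giou_l, cls_l, obj_l, img_size=416):
    obj_gain = hyp['obj'] * (img_size / 416)  # obj gain scales with network size
    return hyp['giou'] * giou_l + hyp['cls'] * cls_l + obj_gain * obj_l

print(round(combined_loss(1.0, 1.0, 1.0), 2))                # 97.71
print(round(combined_loss(1.0, 1.0, 1.0, img_size=320), 2))  # 3.31+42.4+40.0 = 85.71
```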
glenn-jocher commented 4 years ago

I have now updated the mAPs at https://github.com/ultralytics/yolov3#map using the default hyperparameters and default training settings on COCO2014, training yolov3-spp.cfg from scratch.

| Model | Size | COCO mAP @0.5...0.95 | COCO mAP @0.5 |
|---|---|---|---|
| YOLOv3-tiny | 320 | 14.0 | 29.1 |
| YOLOv3 | 320 | 28.7 | 51.8 |
| YOLOv3-SPP | 320 | 30.5 | 52.3 |
| YOLOv3-SPP ultralytics | 320 | 35.4 | 54.3 |
| YOLOv3-tiny | 416 | 16.0 | 33.0 |
| YOLOv3 | 416 | 31.2 | 55.4 |
| YOLOv3-SPP | 416 | 33.9 | 56.9 |
| YOLOv3-SPP ultralytics | 416 | 39.0 | 59.2 |
| YOLOv3-tiny | 512 | 16.6 | 34.9 |
| YOLOv3 | 512 | 32.7 | 57.7 |
| YOLOv3-SPP | 512 | 35.6 | 59.5 |
| YOLOv3-SPP ultralytics | 512 | 40.3 | 60.6 |
| YOLOv3-tiny | 608 | 16.6 | 35.4 |
| YOLOv3 | 608 | 33.1 | 58.2 |
| YOLOv3-SPP | 608 | 37.0 | 60.7 |
| YOLOv3-SPP ultralytics | 608 | 40.9 | 60.9 |
AlexeyAB commented 4 years ago

@glenn-jocher

Is YOLOv3-SPP trained with the default hyperparameters giou=1, cls=1, obj=1, while YOLOv3-SPP ultralytics uses these hyperparameters?

hyp = {'giou': 3.31, # giou loss gain 'cls': 42.4, # cls loss gain 'obj': 52.0, # obj loss gain (*=img_size/416 if img_size != 416)}

So good hyperparameters for GIoU:

[yolo]
iou_normalizer=0.07    # giou hyperparameter: 3.31 / ((52+42)/2) = 0.07
cls_normalizer=1.0     # obj hyperparameter
iou_loss=giou

Is img_size the image size or the network size? Since every image will be resized to the network size.

glenn-jocher commented 4 years ago

@AlexeyAB 416 is the network size. So for example if the network is trained at 320, the objectness hyperparameter would be multiplied by 320/416. I found this helped keep the balance when network size changed.

Yes, roughly speaking we found an optimal GIoU gain is about 10 times smaller than cls and obj, and that obj and cls seem to optimally be of similar magnitude. I average the elements in each yolo layer, and sum the 3 values for the 3 yolo layers. So for example obj loss = mean(obj_loss_yolo_layer1) + mean(obj_loss_yolo_layer2) + mean(obj_loss_yolo_layer3). The chart below shows the loss after multiplying by the hyps:
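The aggregation described here (mean within each yolo layer, then sum across the three layers) can be sketched as:

```python
# Sketch of the per-layer loss aggregation: average the loss elements within
# each yolo layer, then sum the three layer means (numbers are illustrative).

def layer_loss(per_layer_elements):
    return sum(sum(layer) / len(layer) for layer in per_layer_elements)

obj_losses = [
    [0.2, 0.4, 0.6],   # yolo layer 1 -> mean 0.4
    [0.1, 0.3],        # yolo layer 2 -> mean 0.2
    [0.5, 0.5, 0.5],   # yolo layer 3 -> mean 0.5
]
print(round(layer_loss(obj_losses), 6))  # 0.4 + 0.2 + 0.5 = 1.1
```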

results

glenn-jocher commented 4 years ago

I think the fundamental concept is that if each of the 3 loss components are equally important to the solution (and I believe they are), then they should roughly each be represented equally in the total loss (i.e. 1/3, 1/3, 1/3). This is probably the best place to start when in doubt, and then fine tune from there if you can.
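A sketch of that starting point: choose gains so each raw component contributes equally to the total loss (the numbers below are made up for illustration):

```python
# Sketch of the "equal thirds" heuristic: compute gains that bring each raw
# loss component to the same magnitude, so each contributes ~1/3 of the total.

def balancing_gains(raw_components):
    target = sum(raw_components) / len(raw_components)
    return [target / c for c in raw_components]

raw = [0.03, 0.9, 1.2]        # raw giou, cls, obj magnitudes (made up)
gains = balancing_gains(raw)
balanced = [g * c for g, c in zip(gains, raw)]
print([round(b, 3) for b in balanced])  # all equal -> each is 1/3 of the total
```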

AlexeyAB commented 4 years ago

@glenn-jocher

hyp = {'giou': 3.31, # giou loss gain 'cls': 42.4, # cls loss gain 'obj': 52.0, # obj loss gain (*=img_size/416 if img_size != 416)}

Are these hyperparameters good for C/D/GIoU only, or also the same values for the default MSE-loss?