MobileNet V2 on ImageNet: top-1 0.670320, top-5 0.876620, iterations 600,000
- cfg: https://github.com/AlexeyAB/darknet/files/3310574/mobilenet_v2.cfg.txt
- data: imagenet1k.data
- weights: mobilenetv2_last.weights https://drive.google.com/file/d/1lW87XQtZIYKIqu8DHvJwBUupJBh4Zpdl/view?usp=sharing
@ntd94 Thanks. What mAP (accuracy) did you get on Visdrone2019 val/test?
- cfg: yolo-obj.cfg.txt
- weights: https://bit.ly/2oa9Rt3
- names
- results: https://bit.ly/2nxWL8h
Any yolo_v3_tiny_pan3 aa_ae_mixup scale_giou weights? @AlexeyAB
Small custom dataset with 6 classes (crops and their stems); images are approx. 2 MPix with a 4:3 aspect ratio.
For each class, both the whole plant and its stem are tagged. The stem annotation is a square box of approximately the same relative size for each class and image.
There is large overlap between some bounding boxes; some plants are very small while others almost fill half the image.
Class | Images | Annotations |
---|---|---|
Bean | 221 | 1118 |
Bean Stem | 221 | 1176 |
Maize | 489 | 889 |
Maize Stem | 489 | 927 |
Leek | 197 | 855 |
Leek Stem | 197 | 927 |
Models are trained for 10,000 steps using the original parameters, except for subdivisions, which is adjusted to fit the GPU (Nvidia GTX 1060 6 GB). For older models (Tiny, Tiny 3L, ...), width and height are changed from 416 to 544.
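A minimal sketch of the corresponding [net] changes, assuming a standard darknet cfg; batch and the exact subdivisions value are illustrative and depend on the model and available GPU memory:

```
[net]
# batch and subdivisions are illustrative; raise subdivisions until the
# model fits into the 6 GB of the GTX 1060
batch=64
subdivisions=32
# 416 -> 544 for the older Tiny / Tiny 3L models
width=544
height=544
# 10,000 training steps
max_batches=10000
```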
I will update this post as soon as I have new results. I will also try to compare the AlexeyAB Darknet implementation with other real-time frameworks (CenterNet, Ultralytics, RetinaNet, ...).
Model | Config Files | Training Chart | mAP@0.5 (scratch / pre-trained)* | mAP@[0.5...0.95] (scratch / pre-trained)* | FPS |
---|---|---|---|---|---|
Yolo v4 (random=0, resize=1.5) | cfg | — | — / 90.00% | — / 50.73% | 19fps |
Yolo V3 CSR Spp Panet Optimal | cfg | — | — / 88.30% | — / 48.42% | 18fps |
Yolo v4 Tiny | cfg | — | — / 88.76% | — / 46.74% | 146fps |
Yolo V3 CSR Spp Panet | cfg | — | 83.38% / 88.34% | 42.21% / 45.88% | 18fps |
Tiny Yolo V3 Pan 3 | cfg | — | 87.16% / 87.46% | 42.18% / 42.41% | 65fps |
Tiny Yolo V3 Pan Mixup | cfg | — | 88.06% | 40.00% | 68fps |
Tiny Yolo V3 Gaussian Matrix | cfg | — | 86.84% | 40.09% | — |
Tiny Yolo V3 3L | cfg | — | 83.93% / 84.41% | 38.12% / 38.15% | 120fps |
Tiny Yolo V3 Prn | cfg | — | 82.93% | 33.82% | 180fps |
Tiny Yolo V3 | cfg | — | 82.39% | 33.28% | 140fps |
Yolo V3 Spp Pan Scale | cfg | — | 75.95% | 31.76% | 13fps |
* Using pre-trained weights for deep networks such as Yolo v3 and CSR significantly improves training stability, speed and accuracy. This effect is less significant for shallower networks such as Tiny Yolo v3 Pan3.
Model | Config Files | Training Chart | mAP@0.5 | mAP@[0.5...0.95] | FPS |
---|---|---|---|---|---|
CenterNet dla-34 512x512* | — | — | 75.7% | 41.6% | 25fps |
Ultralytics Yolov3-SPP 512x512 | — | — | — | — | — |

* *Using pre-trained weights from ctdet_coco_dla_2x.pth*
The dataset was one-class custom data with 33,216 images for training and 4,324 images for validation.
Experimental results will be updated continuously.
Model | CFG | AP@.5 | AP@.75 |
---|---|---|---|
spp,mse | cfg | 89.52% | 51.72% |
spp,mse,it=0.213 | - | 92.01% | 60.49% |
spp,giou(in=0.07) | - | 89.55% | 51.25% |
spp,giou(in=0.5) | - | 90.08% | 59.55% |
spp,ciou(in=0.5) | - | 89.49% | 58.39% |
spp,giou,gs(in,un=0.5) | cfg | 91.39% | 58.01% |
spp,ciou,gs(in,un=0.5) | - | 89.81% | 59.01% |
spp,giou,gs,mixup(in,un=0.5) | - | 89.51% | 58.73% |
spp,giou,gs,mosaic(in,un=.5) | - | 90.48% | 60.02% |
spp,giou,gs(in,un=0.5,it=0.213) | - | 91.89% | 63.53% |
spp,giou,gs,swish(in,un=0.5,it=0.213) | - | 91.54% | 60.70% |
spp,giou,gs,mosaic(in,un=0.5,it=0.213) | - | 91.82% | 63.89% |
spp,mse,it=0.213,asff(in,un=0.5,it=0.213) | cfg | 92.45% | 61.83% |
spp,giou,it=0.213,asff(in,un=0.5,it=0.213) | - | 92.33% | 63.74% |
spp,giou,it=0.213,asff(softmax),rfb(bn=0) | - | 91.57% | 64.95% |
spp,giou,it=0.213,asff(softmax),rfb(bn=1) | - | 92.32% | 60.05% |
spp,giou,it=0.213,asff(softmax),dropblock(size=7),rfb(bn=0) | - | 91.65% | 61.35% |
spp,giou,it=0.213,asff(softmax),dropblock(size=7),rfb(bn=1) | - | 92.12% | 63.88% |
A summary:
method | AP@.5 | AP@.75 |
---|---|---|
iou_thresh | + | + |
mse -> giou | - | + |
giou -> ciou | - | + |
gaussian_yolo | + | - |
mixup | - | - |
mosaic | = | + |
swish | - | - |
asff | + | + |
rfb | ? | ? |
dropblock | ? | ? |
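For readers unfamiliar with the shorthand above, a hedged sketch of how these options map onto darknet cfg keys, inferred from the parameter names mentioned in this thread (values are illustrative):

```
# [net] section: data-augmentation options
mosaic=1
mixup=1

# [yolo] or [Gaussian_yolo] ("gs") section: loss and matching options
# "mse" / "giou" / "ciou" in the table select iou_loss
iou_loss=giou
# "it" in the table
iou_thresh=0.213
# "in" and "un" in the table (uc_normalizer applies to [Gaussian_yolo] only)
iou_normalizer=0.5
uc_normalizer=0.5
```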
@Kyuuki93 Hi, Can you share the spp,giou,gs,mosaic(in,un=0.5,it=0.213) cfg file?
Thanks!
Just download the spp,giou,gs(in,un=0.07) cfg, then set iou_normalizer and uc_normalizer to 0.5 (they are currently 0.07), set mosaic = 1 in the [net] section, and set iou_thresh = 0.213 in all [Gaussian_yolo] sections.
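A sketch of these edits in cfg form (surrounding keys and section order omitted):

```
# in the [net] section
mosaic=1

# in every [Gaussian_yolo] section (iou_normalizer and uc_normalizer were 0.07)
iou_normalizer=0.5
uc_normalizer=0.5
iou_thresh=0.213
```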
In my application, precision, recall, and AP@0.5 at a high thresh (e.g. thresh = 0.7 or 0.85) were more important, so based on https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-561064425 I created a new table here:
Model | AP@.5 | AP@.75 | precision (th=0.85) | recall (th=0.85) | precision (th=0.7) | recall (th=0.7) |
---|---|---|---|---|---|---|
spp,mse | 89.50% | 51.75% | 0.98 | 0.20 | 0.97 | 0.36 |
spp,giou | 90.09% | 59.55% | 0.98 | 0.25 | 0.97 | 0.40 |
spp,ciou | 89.88% | 58.39% | 0.99 | 0.22 | 0.97 | 0.38 |
spp,giou,gs | 91.39% | 58.01% | 0.99 | 0.05 | 0.97 | 0.47 |
spp,ciou,gs | 89.82% | 59.01% | 0.99 | 0.02 | 0.99 | 0.24 |
spp,giou,gs,mixup | 89.51% | 58.73% | 0.98 | 0.01 | 0.98 | 0.20 |
spp,giou,gs,mosaic | 90.47% | 63.89% | 1.00 | 0.01 | 0.99 | 0.22 |
spp,mse,it | 92.01% | 60.49% | 0.97 | 0.60 | 0.95 | 0.72 |
spp,giou,it | 91.79% | 63.09% | 0.96 | 0.57 | 0.95 | 0.71 |
spp,giou,it,igt=0.85 | 92.01% | 62.84% | 0.98 | 0.37 | 0.97 | 0.55 |
spp,giou,it,igt=1.0 | 91.43% | 61.82% | 0.98 | 0.21 | 0.97 | 0.43 |
spp,giou,it,tt=0.7 | nan | - | - | - | - | - |
spp,giou,it,tt=0.85 | 78.01% | 28.27% | 0.97 | 0.06 | 0.95 | 0.21 |
spp,giou,gs,it | 91.87% | 63.53% | 0.99 | 0.16 | 0.97 | 0.52 |
spp,mse,it,asff | 92.49% | 61.83% | 0.97 | 0.57 | 0.95 | 0.70 |
spp,giou,it,asff | ||||||
spp,giou,it,asff,rfb | 92.32% | 60.05% | 0.98 | 0.43 | 0.96 | 0.62 |
csresnext50-panet,spp,giou,it | 92.80% | 64.16% | 0.97 | 0.51 | 0.96 | 0.67 |
Dataset: Visdrone2019
- cfg: yolov3-tiny_3l.cfg.txt
- weights: https://drive.google.com/file/d/1fqWdpYpTJkKLnVhEiQQqm_SNvMp_qSE0/view?usp=sharing
- names: visdrone2019.names.txt

Hope to see comments on these.
I tried it on my own drone video and the result is really good. Please advise how to get similar performance. For my purposes, I need to separate the truck category into several truck types, plus person, car, bus and motorcycle. Is there any dataset that could be used, especially for aerial-view images?
@laclouis5: What is the difference between mAP@0.5 (scratch / pre-trained)* and mAP@[0.5...0.95] (scratch / pre-trained)*? From how I read it, both mAPs are calculated at an IoU of 50%.
@titanbender,
No, mAP@[0.5...0.95] is the average AP over ten different IoU thresholds ranging from 0.5 to 0.95 in 0.05 increments.
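In formula form, over the ten COCO-style IoU thresholds:

```latex
\mathrm{mAP}@[0.5{:}0.95] = \frac{1}{10} \sum_{t \in \{0.50,\,0.55,\,\ldots,\,0.95\}} \mathrm{mAP}@t
```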
Here you can post your trained models on different Datasets - 3 files:
- cfg
- weights
- names
- Optional: accuracy mAP@0.5 and/or mAP@0.5...0.95
- Optional: BFLOPS and/or Inference time (milliseconds)
COCO test-dev
Model | Size | BFLOPS | Inference time, ms | AP@.5:.95 | AP@.5 | AP@.75 | URL |
---|---|---|---|---|---|---|---|
yolo_v3_tiny_pan3 aa_ae_mixup scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% | #3708 (comment) |
yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - | URL |
enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - | URL |

ImageNet valid

Model | BFLOPs | Inference Time (ms) | Top1, % | URL |
---|---|---|---|---|
shufflenetv2 and weights | 0.375 | 32 | 52% | URL |
efficient_b0 and weights | 0.915 | 76 | 71% | URL |
MobileNet v2 and weights | 0.85 | - | 67% | URL |
The mobilenetv2 cfg file has an error: the stride-16 stage should be repeated 7 times, but instead the stride-8 stage is repeated 7 times, which increases the BFLOPS.
Pascal VOC 2007+2012: yolov3-tiny-prn with mosaic and CIoU, input size 352x352, 2.45 BFLOPS, mAP@0.5 = 64.26% (backbone from yolov3-tiny-prn.weights for COCO). cfg: yolov3-tiny-prn.cfg.txt, weights: yolov3-tiny-prn_352@64.26.weights.zip
@laclouis5 hello there, is there a list / compilation of all those custom configs you used? i cannot seem to find them on cfg/ folder
Hi @usamahjundia, all cfg files are attached in https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-549470673, 2nd column.
Thanks, but that is unfortunately not what I meant.
What I meant was: where did you discover them in the first place? Or are they custom-made?
Sorry for the confusion.
@usamahjundia They are regular cfg files developed in this repo, I did not customise them. You can find them in the cfg folder of the repo.
@laclouis5 Hello, what is the difference between scratch and pre-trained? Also, Tiny Yolo V3 Pan 3 shows 87.16% / 87.46%, but the training chart shows the best mAP as 89%. Why is that? Thanks.
@ShaneHsieh Error on my side, I updated the post with the correct image.
Pre-trained means the model is trained with initial weights from models trained on MS COCO. See the training tutorial of this repo for more information on how to do that. From scratch means without pre-trained weights.
@laclouis5 Thank.
Network | VOC mAP(0.5) | COCO mAP(0.5) | Resolution | Inference time (NCNN/Kirin 990) | Inference time (MNN arm82/Kirin 990) | FLOPS | Weight size |
---|---|---|---|---|---|---|---|
MobileNetV2-YOLOv3-Lite | 72.61 | 36.57 | 320 | 33 ms | 18 ms | 1.8BFlops | 8.0MB |
MobileNetV2-YOLOv3-Nano | 65.27 | 30.13 | 320 | 13 ms | 5 ms | 0.5BFlops | 3.0MB |
MobileNetV2-YOLOv3-Fastest | 46.55 | - | 320 | 8.2 ms | 2.4 ms | 0.13BFlops | 700kb |