AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Train tiny-yolov2 with VOC dataset #2198

Open barry41409 opened 5 years ago

barry41409 commented 5 years ago

Hi everyone, I want to train tiny-yolov2 on the VOC dataset. The VOC image size is 416*416 and I use ./cfg/tiny-yolo-voc.cfg.

Where can I find pre-trained weights for tiny-yolov2 on the VOC dataset, like the darknet53.conv.74 weights in the command below? (darknet.exe detector train data/voc.data cfg/yolov3-voc.cfg darknet53.conv.74)

I can't find pre-trained weights, and the accuracy of tiny-yolo is lower than in the paper (maybe 4X%). Has anyone trained the tiny-yolo model using only the .cfg file and reached the 57.1% accuracy listed at https://pjreddie.com/darknet/yolov2/ ? How should I modify the cfg file?

AlexeyAB commented 5 years ago

@barry41409 Hi,

How to get the pre-trained weights file for yolov2-tiny: https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-train-tiny-yolo-to-detect-your-custom-objects

> I want to train tiny-yolov2 with VOC dataset. VOC image size is 416*416 and use ./cfg/tiny-yolo-voc.cfg.

VOC images are not 416x416, and you shouldn't resize them to 416x416 manually.
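One reason manual resizing is unnecessary, as far as I understand darknet: label files store box coordinates relative to the image width and height, and darknet scales images to the cfg's width and height while loading them. The following is a minimal Python sketch of that normalization, assuming VOC-style pixel boxes (xmin, ymin, xmax, ymax). The function name and the sample numbers are illustrative and do not come from this thread.

```python
def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert a pixel box to relative (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Hypothetical 500x375 image with a single box. The relative values stay
# valid whatever size the image is scaled to afterwards.
print(voc_box_to_yolo(500, 375, 100, 75, 300, 225))
```

Because every value is a fraction of the image size, the same label file works for any network input size set in the cfg.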

barry41409 commented 5 years ago

@AlexeyAB Thanks for your reply. Is https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-train-tiny-yolo-to-detect-your-custom-objects for tiny-yolo "v2" with the VOC dataset?

The link to the default weights file for tiny-yolo-voc (http://pjreddie.com/media/files/tiny-yolo-voc.weights) can't be accessed. Can you update the link?

The cfg sets the width and height to 416, so I resized the images to 416*416.

    [net]
    //Testing
    batch=1
    subdivisions=1
    //Training
    //batch=64
    //subdivisions=2
    width=416
    height=416
    channels=3
    momentum=0.9
    decay=0.0005
    angle=0
    saturation = 1.5
    exposure = 1.5
    hue=.1

If I want to train tiny-yolo v2 on the VOC dataset, should I leave the dataset unresized and use the default settings? The images in VOC2007 have different sizes, so how should I modify the cfg?

barry41409 commented 5 years ago

hi @AlexeyAB, if I want to train the tiny-yolo model on VOC myself, I need pretrained weights. According to https://github.com/AlexeyAB/darknet/issues/1877, this command creates a weights file yolov2-tiny.conv.13 that is suitable only for Training, not for Detection:

  1. darknet partial cfg/yolov2-tiny-voc.cfg yolov2-tiny-voc.weights yolov2-tiny.conv.13 13
  2. darknet detector train data/my.data tiny-yolo-voc.cfg yolov2-tiny.conv.13

    After training, can I get the same accuracy as the paper and a new weights file? Can I use these commands?

What does the partial function do? What is the difference between yolov2-tiny-voc.weights and darknet19_448.conv.23? darknet19_448.conv.23 is from darknet, but where does yolov2-tiny-voc.weights come from?

AlexeyAB commented 5 years ago

@barry41409 Hi,

partial saves only the first several layers


> The darknet19_448.conv.23 is from darknet and where is the yolov2-tiny-voc.weights from?

yolov2-tiny-voc.weights is from the site: http://pjreddie.com/media/files/yolov2-tiny-voc.weights. It is trained on the Pascal VOC dataset to detect objects, while darknet19_448.weights is trained on ImageNet to classify images.

barry41409 commented 5 years ago

hi @AlexeyAB thank you! Now I use yolov2-tiny.conv.13 with VOC, but the mAP is 50.71%.

    ./darknet partial yolov2-tiny-voc.cfg yolov2-tiny-voc.weights yolov2-tiny.conv.13 13
    ./darknet detector train voc.data yolov2-tiny-voc_train.cfg yolov2-tiny.conv.13
    ./darknet detector map voc.data yolov2-tiny-voc_test.cfg backup/yolov2-tiny-voc_train_final.weights

If I load yolov2-tiny-voc.weights for testing, the mAP is 55.57%.

    ./darknet detector map voc.data yolov2-tiny-voc_test.cfg yolov2-tiny-voc.weights

Why are the mAP values so low? The YOLOv2 website lists 57.1% mAP. Should I change the commands or the steps?

If I don't use yolov2-tiny.conv.13 for VOC, the mAP is too low. Must I use yolov2-tiny.conv.13 for training?

    ./darknet detector train *.data yolov2-tiny-voc_train.cfg yolov2-tiny.conv.13

Do you know how to generate yolov2-tiny-voc.weights?

AlexeyAB commented 5 years ago

@barry41409

barry41409 commented 5 years ago

Hi, @AlexeyAB, thank you!

Must I load yolov2-tiny.conv.13 when training with another dataset?

    ./darknet detector train *.data yolov2-tiny-voc_train.cfg yolov2-tiny.conv.13

Do you know how to generate yolov2-tiny-voc.weights?

AlexeyAB commented 5 years ago

> Do you know how to generate yolov2-tiny-voc.weights?

You can download it: https://pjreddie.com/media/files/yolov2-tiny-voc.weights

Try training with batch=256 subdivisions=16

And

max_batches = 10200
policy=steps
steps=-1,25,5000,6000
scales=.1,10,.1,.1
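
To see what these steps and scales do to the learning rate, here is a minimal Python sketch of a step schedule as I understand darknet's policy=steps: the rate is multiplied by each scale once the iteration reaches the matching step. It assumes learning_rate=0.001 (the value in the cfg posted later in this thread) and no burn_in. The function name is illustrative.

```python
def steps_policy_rate(iteration, learning_rate, steps, scales):
    """Return the learning rate in effect at a given iteration."""
    rate = learning_rate
    for step, scale in zip(steps, scales):
        if step > iteration:
            # This step has not been reached yet, so later ones have not either.
            return rate
        rate *= scale
    return rate


steps = [-1, 25, 5000, 6000]
scales = [0.1, 10, 0.1, 0.1]
for iteration in (0, 25, 5000, 6000, 10199):
    print(iteration, steps_policy_rate(iteration, 0.001, steps, scales))
```

Under these assumptions the rate starts at one tenth of learning_rate, returns to learning_rate at iteration 25, and is cut by a factor of ten at 5000 and again at 6000. That agrees with the "0.000010 rate" printed near iteration 10200 in the training log further down.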
barry41409 commented 5 years ago

hi @AlexeyAB, is the weights file from https://pjreddie.com/media/files/yolov2-tiny-voc.weights different from the one on the website (https://pjreddie.com/darknet/yolov2/)?

Should I use the partial command with https://pjreddie.com/media/files/yolov2-tiny-voc.weights, or use the weights (https://pjreddie.com/media/files/yolov2-tiny-voc.weights) directly for training?

AlexeyAB commented 5 years ago

@barry41409

> the weight from https://pjreddie.com/media/files/yolov2-tiny-voc.weights is different from the website(https://pjreddie.com/darknet/yolov2/)?

These files are the same.

To get yolov2-tiny.conv.13, use this command:

./darknet partial cfg/yolov2-tiny-voc.cfg yolov2-tiny-voc.weights yolov2-tiny.conv.13 13

barry41409 commented 5 years ago

Thank you @AlexeyAB. I use yolov2-tiny.conv.13 and modified the cfg as below.

    batch=256
    subdivisions=16
    max_batches = 10200
    policy=steps
    steps=-1,25,5000,6000
    scales=.1,10,.1,.1

After training, I can get the same mAP as the website, and yolov2-tiny-voc_final.weights the same as yolov2-tiny-voc.weights.

If I want to rotate images or do some image processing myself, I must load yolov2-tiny.conv.13 and the cfg and train the model again. Which cfg should I use, the original or this new one? I know darknet doesn't support rotation.

barry41409 commented 5 years ago

hi @AlexeyAB ,

learning_rate=0.001
max_batches = 40200
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=125
activation=linear

[region]
anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
bias_match=1
classes=20
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=1
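
The filters=125 value in the last [convolutional] layer is tied to the [region] settings: each of the num anchors predicts coords box values, one objectness score and classes class scores. A minimal Python sketch of that arithmetic follows. The function name is illustrative, and the 3-class case is a hypothetical custom dataset, not something from this thread.

```python
def region_filters(num_anchors, classes, coords=4):
    """Output channels needed by the convolutional layer before [region]."""
    return num_anchors * (classes + coords + 1)


print(region_filters(num_anchors=5, classes=20))  # 125, as in the cfg above
print(region_filters(num_anchors=5, classes=3))   # 40 for a hypothetical 3-class dataset
```

If classes is changed for a custom dataset, filters in the layer just before [region] has to change with it.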



- This is training information.
#-----------------------------------------------------------------------
 10198: 0.295018, 0.242992 avg loss, 0.000010 rate, 1.236693 seconds, 2610688 images
Loaded: 0.000024 seconds
Region Avg IOU: 0.617308, Class: 0.965954, Obj: 0.000004, No Obj: 0.000127, Avg Recall: 0.735294,  count: 34
Region Avg IOU: 0.605852, Class: 0.966129, Obj: 0.000001, No Obj: 0.000130, Avg Recall: 0.697674,  count: 43
Region Avg IOU: 0.660676, Class: 0.899676, Obj: 0.000002, No Obj: 0.000117, Avg Recall: 0.766667,  count: 30
Region Avg IOU: 0.768213, Class: 0.985469, Obj: 0.000000, No Obj: 0.000115, Avg Recall: 1.000000,  count: 21
Region Avg IOU: 0.767267, Class: 0.975454, Obj: 0.000000, No Obj: 0.000155, Avg Recall: 0.840000,  count: 25
Region Avg IOU: 0.684546, Class: 0.997322, Obj: 0.000005, No Obj: 0.000142, Avg Recall: 0.857143,  count: 35
Region Avg IOU: 0.682567, Class: 0.967966, Obj: 0.000000, No Obj: 0.000145, Avg Recall: 0.825000,  count: 40
Region Avg IOU: 0.696913, Class: 0.968736, Obj: 0.000010, No Obj: 0.000147, Avg Recall: 0.857143,  count: 42
Region Avg IOU: 0.664944, Class: 0.972854, Obj: 0.000001, No Obj: 0.000231, Avg Recall: 0.795455,  count: 44
Region Avg IOU: 0.699654, Class: 0.964218, Obj: 0.000002, No Obj: 0.000142, Avg Recall: 0.872340,  count: 47
Region Avg IOU: 0.647608, Class: 0.940430, Obj: 0.000002, No Obj: 0.000102, Avg Recall: 0.830508,  count: 59
Region Avg IOU: 0.707712, Class: 0.970402, Obj: 0.000003, No Obj: 0.000107, Avg Recall: 0.941176,  count: 34
Region Avg IOU: 0.717876, Class: 0.995986, Obj: 0.000005, No Obj: 0.000134, Avg Recall: 0.916667,  count: 36
Region Avg IOU: 0.651359, Class: 0.997310, Obj: 0.000004, No Obj: 0.000142, Avg Recall: 0.837209,  count: 43
Region Avg IOU: 0.689263, Class: 0.969713, Obj: 0.000001, No Obj: 0.000195, Avg Recall: 0.805556,  count: 36
Region Avg IOU: 0.641250, Class: 0.933772, Obj: 0.000002, No Obj: 0.000134, Avg Recall: 0.743590,  count: 39

 10199: 0.262016, 0.244894 avg loss, 0.000010 rate, 1.238252 seconds, 2610944 images
Loaded: 0.000023 seconds
Region Avg IOU: 0.613651, Class: 0.975326, Obj: 0.000001, No Obj: 0.000105, Avg Recall: 0.694915,  count: 59
Region Avg IOU: 0.588049, Class: 0.971047, Obj: 0.000002, No Obj: 0.000136, Avg Recall: 0.666667,  count: 57
Region Avg IOU: 0.694998, Class: 0.980136, Obj: 0.000001, No Obj: 0.000148, Avg Recall: 0.837838,  count: 37
Region Avg IOU: 0.672738, Class: 0.971762, Obj: 0.000001, No Obj: 0.000135, Avg Recall: 0.895833,  count: 48
Region Avg IOU: 0.712654, Class: 0.992242, Obj: 0.000003, No Obj: 0.000136, Avg Recall: 0.870968,  count: 31
Region Avg IOU: 0.637567, Class: 0.992207, Obj: 0.000001, No Obj: 0.000135, Avg Recall: 0.727273,  count: 33
Region Avg IOU: 0.702900, Class: 0.982232, Obj: 0.000002, No Obj: 0.000154, Avg Recall: 0.891892,  count: 37
Region Avg IOU: 0.578501, Class: 0.943660, Obj: 0.000003, No Obj: 0.000160, Avg Recall: 0.682540,  count: 63
Region Avg IOU: 0.584042, Class: 0.987625, Obj: 0.000001, No Obj: 0.000110, Avg Recall: 0.741935,  count: 31
Region Avg IOU: 0.665660, Class: 0.963658, Obj: 0.000003, No Obj: 0.000109, Avg Recall: 0.854167,  count: 48
Region Avg IOU: 0.645780, Class: 0.992747, Obj: 0.000000, No Obj: 0.000139, Avg Recall: 0.766667,  count: 30
Region Avg IOU: 0.644958, Class: 0.928764, Obj: 0.000002, No Obj: 0.000147, Avg Recall: 0.760870,  count: 46
Region Avg IOU: 0.632909, Class: 0.938778, Obj: 0.000001, No Obj: 0.000118, Avg Recall: 0.756098,  count: 41
Region Avg IOU: 0.623197, Class: 0.952287, Obj: 0.000001, No Obj: 0.000101, Avg Recall: 0.738095,  count: 42
Region Avg IOU: 0.575555, Class: 0.976537, Obj: 0.000000, No Obj: 0.000155, Avg Recall: 0.648649,  count: 37
Region Avg IOU: 0.664603, Class: 0.996228, Obj: 0.000001, No Obj: 0.000152, Avg Recall: 0.803922,  count: 51

 10200: 0.356095, 0.256014 avg loss, 0.000010 rate, 1.240131 seconds, 2611200 images
#-----------------------------------------------------------------------
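
The iteration lines in this log can be used to check which batch size was actually in effect. Below is a minimal Python sketch that parses one such line and divides the image count by the iteration number. The regular expression is written against the line format shown above and is an assumption about that format, not an official darknet parser.

```python
import re

# Matches lines such as:
#  10198: 0.295018, 0.242992 avg loss, 0.000010 rate, 1.236693 seconds, 2610688 images
LOG_LINE = re.compile(
    r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+) avg loss,\s*([\d.]+) rate,"
    r"\s*([\d.]+) seconds,\s*(\d+) images"
)


def parse_iteration(line):
    """Return the fields of one iteration line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return {
        "iteration": int(match.group(1)),
        "loss": float(match.group(2)),
        "avg_loss": float(match.group(3)),
        "rate": float(match.group(4)),
        "seconds": float(match.group(5)),
        "images": int(match.group(6)),
    }


line = " 10198: 0.295018, 0.242992 avg loss, 0.000010 rate, 1.236693 seconds, 2610688 images"
info = parse_iteration(line)
print(info["images"] // info["iteration"])  # 256 images per iteration
```

The result of 256 images per iteration, together with the 16 "Region Avg IOU" lines printed for each iteration, suggests this run used batch=256 and subdivisions=16.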

- When I test the mAP, the result is 0.
#-----------------------------------------------------------------------
mask_scale: Using default '1.000000'
Total BFLOPS 6.977 
Loading weights from backup_alex/yolov2-tiny-voc_train_new_final.weights...
 seen 64 
Done!

 calculation mAP (mean average precision)...
4952
 detections_count = 0, unique_truth_count = 12032  
class_id = 0, name = aeroplane,      ap = 0.00 % 
class_id = 1, name = bicycle,    ap = 0.00 % 
class_id = 2, name = bird,   ap = 0.00 % 
class_id = 3, name = boat,   ap = 0.00 % 
class_id = 4, name = bottle,     ap = 0.00 % 
class_id = 5, name = bus,    ap = 0.00 % 
class_id = 6, name = car,    ap = 0.00 % 
class_id = 7, name = cat,    ap = 0.00 % 
class_id = 8, name = chair,      ap = 0.00 % 
class_id = 9, name = cow,    ap = 0.00 % 
class_id = 10, name = diningtable,   ap = 0.00 % 
class_id = 11, name = dog,   ap = 0.00 % 
class_id = 12, name = horse,     ap = 0.00 % 
class_id = 13, name = motorbike,     ap = 0.00 % 
class_id = 14, name = person,    ap = 0.00 % 
class_id = 15, name = pottedplant,   ap = 0.00 % 
class_id = 16, name = sheep,     ap = 0.00 % 
class_id = 17, name = sofa,      ap = 0.00 % 
class_id = 18, name = train,     ap = 0.00 % 
class_id = 19, name = tvmonitor,     ap = 0.00 % 
 for thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan 
 for thresh = 0.25, TP = 0, FP = 0, FN = 12032, average IoU = 0.00 % 

 mean average precision (mAP) = 0.000000, or 0.00 % 
#-----------------------------------------------------------------------
(./darknet detector map source_jhan/voc.data source_jhan/yolov2-tiny-voc_test_new.cfg backup_alex/yolov2-tiny-voc_train_new_final.weights)
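
The -nan values in this output follow directly from detections_count = 0: with TP = 0 and FP = 0, precision is 0/0. Below is a minimal Python sketch of the standard precision, recall and F1 definitions applied to the counts printed above. Python raises an error on division by zero, so the undefined cases are mapped to NaN explicitly. The function name is illustrative.

```python
import math


def detection_metrics(tp, fp, fn):
    """Precision, recall and F1 from counts; NaN where a ratio is undefined."""
    nan = float("nan")
    precision = tp / (tp + fp) if (tp + fp) > 0 else nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else nan
    if math.isnan(precision) or math.isnan(recall) or (precision + recall) == 0:
        f1 = nan
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the mAP output above: no detections at all.
print(*detection_metrics(tp=0, fp=0, fn=12032))  # nan 0.0 nan
```

So the NaN entries are a symptom of the network producing no detections above the threshold, not a separate bug in the mAP calculation.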

- For the test cfg, I set batch=1 and subdivisions=1; everything else is the same as the train cfg.
If I load another final weights file trained with the original cfg, the mAP is 50.38%.

Are the 'Obj' and 'No Obj' values in the training information too low?
How can I fix the problem?

I want to train tiny-yolov2 on VOC2007+2012 first (to get the mAP listed on the website),
then do other image processing on the VOC dataset and train again.
barry41409 commented 5 years ago

Hi @AlexeyAB, should I provide any more information about the training? Do you have any suggestions for this problem? Thanks for your help.

AlexeyAB commented 5 years ago

@barry41409 Hi,

> for thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan
> for thresh = 0.25, TP = 0, FP = 0, FN = 12032, average IoU = 0.00 %
>
> mean average precision (mAP) = 0.000000, or 0.00 %

It looks like your Training or Validation dataset is wrong. Try running this command and show the result:

    darknet.exe detector calc_anchors data/voc.data -width 416 -height 416

Then set train=valid.txt, run this command again, and show me the result.

barry41409 commented 5 years ago

hi @AlexeyAB, thank you. I ran the command as below.

    ./darknet detector calc_anchors my/voc.data -width 416 -height 416

num_of_clusters = 5, width = 416, height = 416 
 read labels from 2501 images 
 loaded      image: 2501     box: 6301
 all loaded. 

 calculating k-means++ ...

 avg IoU = 61.74 % 

Saving anchors to the file: anchors.txt 
anchors =  39, 71,  89,147, 139,287, 245,170, 318,336

Then I set train=valid in voc.data and ran it again.

    ./darknet detector calc_anchors my/voc.data -width 416 -height 416

 num_of_clusters = 5, width = 416, height = 416 
 read labels from 4952 images 
 loaded      image: 4952     box: 12032
 all loaded. 

 calculating k-means++ ...

 avg IoU = 62.18 % 

Saving anchors to the file: anchors.txt 
anchors =  39, 71,  91,149, 141,288, 252,175, 330,337
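
A side note on scale: the anchors printed here are in pixels of the 416x416 input, while the anchors in the [region] section of the cfg (1.08,1.19, ...) are, as far as I understand YOLOv2, expressed in cells of the final feature map. The tiny cfg in this thread has five stride-2 maxpool layers, so one cell covers 32 pixels. Below is a minimal Python sketch of the conversion under that assumption. The function name is illustrative, and nothing in this thread says the default anchors should be replaced.

```python
def anchors_to_grid_units(anchors_px, stride=32):
    """Convert (width, height) anchors from input pixels to feature-map cells."""
    return [(w / stride, h / stride) for w, h in anchors_px]


# Anchors reported for the training list above.
train_anchors_px = [(39, 71), (89, 147), (139, 287), (245, 170), (318, 336)]
for w, h in anchors_to_grid_units(train_anchors_px):
    print(f"{w:.2f},{h:.2f}")
```

Here the calc_anchors runs serve only as a sanity check that the label files load and describe plausible boxes.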
AlexeyAB commented 5 years ago

@barry41409 The labels of your Training and Validation datasets are correct.

Try to train from the beginning using this command: ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train_new.cfg source_jhan/yolov2-tiny.conv.13 -map

barry41409 commented 5 years ago

hi @AlexeyAB, thank you. I use the latest version. Was it updated recently? Here is my cfg data, link

I am running the cmd ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train_new.cfg source_jhan/yolov2-tiny.conv.13 -map

AlexeyAB commented 5 years ago

@barry41409 It was updated 4 hours ago: https://github.com/AlexeyAB/darknet/commits/master

> I am running the cmd ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train_new.cfg source_jhan/yolov2-tiny.conv.13 -map

So a chart.png image will be created after every 100 iterations. After 4000 iterations, drag-n-drop it into your message.


Also try to train with default

max_batches = 40200
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1
barry41409 commented 5 years ago

hi @AlexeyAB ,

  1. use yolov2-tiny-voc_train_new.cfg

    batch=256
    subdivisions=16
    max_batches = 10200
    policy=steps
    steps=-1,25,5000,6000
    scales=.1,10,.1,.1

    ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train_new.cfg source_jhan/yolov2-tiny.conv.13 -map result (sorry, the run was interrupted by a memory error, so I reloaded the last weights and continued training.)

  2. use my cfg

    batch=64
    subdivisions=4
    max_batches = 40200
    policy=steps
    steps=-1,100,20000,30000
    scales=.1,10,.1,.1

    ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train.cfg source_jhan/yolov2-tiny.conv.13 -map result

I found that subdivisions=4 in my cfg differs from subdivisions=2 in the original cfg. (With subdivisions=2 in the cfg, I get a 'memory error'.) Does subdivisions affect the result? Have you trained tiny-yolov2? We could cross-check the cfg file.

AlexeyAB commented 5 years ago

@barry41409 Hi,

> Does subdivisions affect the result?

Almost no.

> Have you trained tiny-yolov2?

For my own datasets.
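
On the subdivisions question: as far as I understand darknet, batch images go into each weight update, but they are pushed through the GPU in chunks of batch/subdivisions images. Raising subdivisions therefore lowers memory use without changing how many images contribute to an update. Below is a minimal Python sketch of that arithmetic using the values mentioned in this thread. The function name is illustrative.

```python
def mini_batch(batch, subdivisions):
    """Images processed on the GPU at once for a given batch and subdivisions."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


for batch, subdivisions in [(64, 2), (64, 4), (256, 16)]:
    print(batch, subdivisions, mini_batch(batch, subdivisions))
```

With batch=64, subdivisions=2 loads 32 images at a time, while subdivisions=4 loads 16, which fits the memory error disappearing at subdivisions=4. batch=256 with subdivisions=16 also loads 16 images at a time.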

barry41409 commented 5 years ago

@AlexeyAB Thanks. Are my result and cfg correct? How many classes are in your dataset, and how many images are in valid and train? Do you have any ideas or suggestions about training tiny-yolov2 with VOC2007&2012?

I have a question about partial: if I use the same dataset (VOC2007&2012) and the same classes in the cfg, why do I need to extract the 13-layer pretrained weights with partial for tiny yolov2?

> yolov2-tiny-voc.weights - is Detector (Yolo v2 tiny) that says what object are on the image and where

When training with yolov2-tiny-voc.weights, the result is lower than the website (mAP: 57.1%). Is that normal?

AlexeyAB commented 5 years ago

@barry41409

> Why do I partial 13 layer pretrained weight for tiny yolov2?

Because the remaining layers can depend on the classes, which can differ between the default yolov2-tiny-voc.weights and yours.

> Training with yolov2-tiny-voc.weights, the result is lower than the website(mAP: 57.1%). Is it normal?

What mAP can you get?

barry41409 commented 5 years ago

hi @AlexeyAB ,

This is my training result.

  1. testing with original yolov2-tiny-voc.weights

    ./darknet detector map source_jhan/voc.data source_jhan/yolov2-tiny-voc_test.cfg yolov2-tiny-voc.weights 
    mAP---->55.57%
  2. training with yolov2-tiny-voc.weights, then testing with the final weights from training

    ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train.cfg yolov2-tiny-voc.weights 
    ./darknet detector map source_jhan/voc.data source_jhan/yolov2-tiny-voc_test.cfg backup/yolov2-tiny-voc_train_final.weights 
    mAP---->56.49%
  3. training with partial yolov2-tiny-voc.weights, then testing with the final weights from training

    ./darknet partial source_jhan/yolov2-tiny-voc.cfg yolov2-tiny-voc.weights yolov2-tiny.conv.13 13
    ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train.cfg yolov2-tiny.conv.13 13 
    ./darknet detector map source_jhan/voc.data source_jhan/yolov2-tiny-voc_test.cfg backup/yolov2-tiny-voc_train_final.weights 
    mAP---->50.71%
  4. training without any weights, then testing with the final weights from training
    ./darknet detector train source_jhan/voc.data source_jhan/yolov2-tiny-voc_train.cfg
    ./darknet detector map source_jhan/voc.data source_jhan/yolov2-tiny-voc_test.cfg backup_noreload/yolov2-tiny-voc_train_final.weights
    mAP---->30.8%

If I restart training tiny yolov2 on VOC2007&2012 (same classes as the website), which weights and cfg should I use to reach the mAP listed on the website? (How can I obtain the final weights (yolov2-tiny-voc.weights) by training tiny-yolov2 myself?)

If I want to train on a custom dataset, I should use the partial weights and modify the cfg. Should I also modify the training policy (steps, scales, ...)?

barry41409 commented 5 years ago

@AlexeyAB hi, I also used the COCO pretrained weights and extracted layers 1~13 with partial for training. The mAP is 50.38%.

What do you detect with tiny yolov2 in your dataset?