Open lbrito55 opened 5 years ago
@lbrito55 Hi,
Try using the default anchors and a more suitable network size.
Hi @lbrito55 ,
How did you produce those graphs?
Thanks.
What network size do you use?
Also, can you show the cloud of points?
./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416 -show
@iraadit I'm using TensorBoard!
@AlexeyAB anchors = 16, 44, 37, 49, 25,100, 98, 76, 126,135, 154,205, 175,263, 177,330, 179,399;
These are the yolo-spp.cfg default anchors, I think. Do you mean a cluster-like graph?
I'm at around 12k epochs for 6 classes, and it recently reached a new best mAP. The loss curve is still fluctuating; how long should I keep training without overfitting?
Thanks,
@AlexeyAB
Do you use default network size width=608 height=608 in cfg-file?
Also show calculated anchors:
./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416
@AlexeyAB Yes, I'm using 608x608.
anchors = 16, 44, 37, 49, 25,100, 98, 76, 126,135, 154,205, 175,263, 177,330, 179,399
I'm at around 12k epochs for 6 classes, and it recently reached a new best mAP. The loss curve is still fluctuating; how long should I keep training without overfitting?
Keep training while the mAP on the validation dataset still increases.
Can you show examples of images with your objects?
You can try to train this model: just change width=640 height=640 (maybe increase subdivisions), and change max_batches=, classes= and filters=
filters = (classes + 8 + 1) * <numbers in mask>
yolo_v3_tiny_pan5_matrix_gaussian_GIoU_aa_ae_mixup.cfg.txt
And train at least 7000 iterations.
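As a quick worked example of the filters formula quoted above, here is a minimal Python sketch. The function name and the choice of 3 mask entries per detection layer are assumptions for illustration only; check the mask= lines in your own cfg-file for the real count.

```python
def gaussian_yolo_filters(classes: int, masks_per_layer: int) -> int:
    """filters = (classes + 8 + 1) * <numbers in mask>, as quoted in the thread."""
    return (classes + 8 + 1) * masks_per_layer


# Assumption: 3 entries in each mask= line (the cfg-file itself is not shown here).
print(gaussian_yolo_filters(classes=6, masks_per_layer=3))  # 45
```

With 6 classes and 3 masks, the convolutional layer before each detection layer would need filters=45 under this formula.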
@AlexeyAB Do you have a specific version for yolov3 that I can use without using a tiny yolo?
You can try to train this model: just change width=608 height=608 (maybe increase subdivisions), and change max_batches=, classes= and filters= filters = (classes + 8 + 1) * <numbers in mask>
yolo_v3_tiny_pan5_matrix_gaussian_GIoU_aa_ae_mixup.cfg.txt
What does it mean "without tiny" ?
@AlexeyAB Nevermind, lemme process with the new cfg and ill post the output
Just check the mAP after 7000 iterations
@AlexeyAB I'm getting the following error when using the latest commit of the repo with this cfg
yolo_v3_tiny_pan5_matrix_gaussian_GIoU_aa_ae_mixup.cfg (2).txt
for 6 classes
compute_capability = 610, cudnn_half = 0
batch = 1, time_steps = 1, train = 1
layer filters size/strd(dil) input output
0 conv 24 3 x 3/ 1 608 x 608 x 3 -> 608 x 608 x 24 0.479 BF
1 max 2x 2/ 2 608 x 608 x 24 -> 304 x 304 x 24 0.009 BF
2 conv 8 5 x 5/ 1 304 x 304 x 24 -> 304 x 304 x 8 0.887 BF
3 route 0 -> 608 x 608 x 24
4 max 2x 2/ 2x 1 608 x 608 x 24 -> 304 x 608 x 24 0.018 BF
5 convS 8 5x 5/ 1x 2 304 x 608 x 24 -> 304 x 304 x 8 0.887 BF
6 route 0 -> 608 x 608 x 24
7 max 2x 2/ 1x 2 608 x 608 x 24 -> 608 x 304 x 24 0.018 BF
8 convS 8 5x 5/ 2x 1 608 x 304 x 24 -> 304 x 304 x 8 0.887 BF
9 route 0 -> 608 x 608 x 24
10 convS 8 5 x 5/ 1 608 x 608 x 24 -> 608 x 608 x 8 3.549 BF
11 reorg / 2 608 x 608 x 8 -> 304 x 304 x 32
12 route 11 8 5 2 -> 304 x 304 x 56
13 conv 32 1 x 1/ 1 304 x 304 x 56 -> 304 x 304 x 32 0.331 BF
14 conv 32 3 x 3/ 1 304 x 304 x 32 -> 304 x 304 x 32 1.703 BF
15 max 2x 2/ 2 304 x 304 x 32 -> 152 x 152 x 32 0.003 BF
16 conv 16 5 x 5/ 1 152 x 152 x 32 -> 152 x 152 x 16 0.591 BF
17 route 14 -> 304 x 304 x 32
18 max 2x 2/ 2x 1 304 x 304 x 32 -> 152 x 304 x 32 0.006 BF
19 convS 16 5x 5/ 1x 2 152 x 304 x 32 -> 152 x 152 x 16 0.591 BF
20 route 14 -> 304 x 304 x 32
21 max 2x 2/ 1x 2 304 x 304 x 32 -> 304 x 152 x 32 0.006 BF
22 convS 16 5x 5/ 2x 1 304 x 152 x 32 -> 152 x 152 x 16 0.591 BF
23 route 14 -> 304 x 304 x 32
24 convS 16 5 x 5/ 1 304 x 304 x 32 -> 304 x 304 x 16 2.366 BF
25 reorg / 2 304 x 304 x 16 -> 152 x 152 x 64
26 route 25 22 19 16 -> 152 x 152 x 112
27 conv 32 1 x 1/ 1 152 x 152 x 112 -> 152 x 152 x 32 0.166 BF
28 conv 64 3 x 3/ 1 152 x 152 x 32 -> 152 x 152 x 64 0.852 BF
29 max 2x 2/ 2 152 x 152 x 64 -> 76 x 76 x 64 0.001 BF
30 conv 32 5 x 5/ 1 76 x 76 x 64 -> 76 x 76 x 32 0.591 BF
31 route 28 -> 152 x 152 x 64
32 max 2x 2/ 2x 1 152 x 152 x 64 -> 76 x 152 x 64 0.003 BF
33 convS 32 5x 5/ 1x 2 76 x 152 x 64 -> 76 x 76 x 32 0.591 BF
34 route 28 -> 152 x 152 x 64
35 max 2x 2/ 1x 2 152 x 152 x 64 -> 152 x 76 x 64 0.003 BF
36 convS 32 5x 5/ 2x 1 152 x 76 x 64 -> 76 x 76 x 32 0.591 BF
37 route 28 -> 152 x 152 x 64
38 convS 32 5 x 5/ 1 152 x 152 x 64 -> 152 x 152 x 32 2.366 BF
39 reorg / 2 152 x 152 x 32 -> 76 x 76 x 128
40 route 39 36 33 30 -> 76 x 76 x 224
41 conv 64 1 x 1/ 1 76 x 76 x 224 -> 76 x 76 x 64 0.166 BF
42 conv 128 3 x 3/ 1 76 x 76 x 64 -> 76 x 76 x 128 0.852 BF
43 max 2x 2/ 2 76 x 76 x 128 -> 38 x 38 x 128 0.001 BF
44 conv 64 5 x 5/ 1 38 x 38 x 128 -> 38 x 38 x 64 0.591 BF
45 route 42 -> 76 x 76 x 128
46 max 2x 2/ 2x 1 76 x 76 x 128 -> 38 x 76 x 128 0.001 BF
47 convS 64 5x 5/ 1x 2 38 x 76 x 128 -> 38 x 38 x 64 0.591 BF
48 route 42 -> 76 x 76 x 128
49 max 2x 2/ 1x 2 76 x 76 x 128 -> 76 x 38 x 128 0.001 BF
50 convS 64 5x 5/ 2x 1 76 x 38 x 128 -> 38 x 38 x 64 0.591 BF
51 route 42 -> 76 x 76 x 128
52 convS 64 5 x 5/ 1 76 x 76 x 128 -> 76 x 76 x 64 2.366 BF
53 reorg / 2 76 x 76 x 64 -> 38 x 38 x 256
54 route 53 50 47 44 -> 38 x 38 x 448
55 conv 128 1 x 1/ 1 38 x 38 x 448 -> 38 x 38 x 128 0.166 BF
56 conv 256 3 x 3/ 1 38 x 38 x 128 -> 38 x 38 x 256 0.852 BF
57 max 2x 2/ 2 38 x 38 x 256 -> 19 x 19 x 256 0.000 BF
58 conv 128 5 x 5/ 1 19 x 19 x 256 -> 19 x 19 x 128 0.591 BF
59 route 56 -> 38 x 38 x 256
60 max 2x 2/ 2x 1 38 x 38 x 256 -> 19 x 38 x 256 0.001 BF
61 convS 128 5x 5/ 1x 2 19 x 38 x 256 -> 19 x 19 x 128 0.591 BF
62 route 56 -> 38 x 38 x 256
63 max 2x 2/ 1x 2 38 x 38 x 256 -> 38 x 19 x 256 0.001 BF
64 convS 128 5x 5/ 2x 1 38 x 19 x 256 -> 19 x 19 x 128 0.591 BF
65 route 56 -> 38 x 38 x 256
66 convS 128 5 x 5/ 1 38 x 38 x 256 -> 38 x 38 x 128 2.366 BF
67 reorg / 2 38 x 38 x 128 -> 19 x 19 x 512
68 route 67 64 61 58 -> 19 x 19 x 896
69 conv 256 1 x 1/ 1 19 x 19 x 896 -> 19 x 19 x 256 0.166 BF
70 conv 512 3 x 3/ 1 19 x 19 x 256 -> 19 x 19 x 512 0.852 BF
71 conv 256 1 x 1/ 1 19 x 19 x 512 -> 19 x 19 x 256 0.095 BF
72 max 5x 5/ 1 19 x 19 x 256 -> 19 x 19 x 256 0.002 BF
73 route 71 -> 19 x 19 x 256
74 max 9x 9/ 1 19 x 19 x 256 -> 19 x 19 x 256 0.007 BF
75 route 71 -> 19 x 19 x 256
76 max 13x13/ 1 19 x 19 x 256 -> 19 x 19 x 256 0.016 BF
77 route 76 74 72 71 -> 19 x 19 x1024
78 conv 256 1 x 1/ 1 19 x 19 x1024 -> 19 x 19 x 256 0.189 BF
79 Shortcut Layer: 69
80 conv 1024 3 x 3/ 1 19 x 19 x 256 -> 19 x 19 x1024 1.703 BF
81 conv 256 1 x 1/ 1 19 x 19 x1024 -> 19 x 19 x 256 0.189 BF
82 conv 512 3 x 3/ 1 19 x 19 x 256 -> 19 x 19 x 512 0.852 BF
83 max 2x 2/ 2 19 x 19 x 512 -> 10 x 10 x 512 0.000 BF
84 conv 128 5 x 5/ 1 10 x 10 x 512 -> 10 x 10 x 128 0.328 BF
85 route 82 -> 19 x 19 x 512
86 max 2x 2/ 2x 1 19 x 19 x 512 -> 10 x 19 x 512 0.000 BF
87 convS 128 5x 5/ 1x 2 10 x 19 x 512 -> 10 x 10 x 128 0.328 BF
88 route 82 -> 19 x 19 x 512
89 max 2x 2/ 1x 2 19 x 19 x 512 -> 19 x 10 x 512 0.000 BF
90 convS 128 5x 5/ 2x 1 19 x 10 x 512 -> 10 x 10 x 128 0.328 BF
91 route 82 -> 19 x 19 x 512
92 convS 128 5 x 5/ 1 19 x 19 x 512 -> 19 x 19 x 128 1.183 BF
93 reorg / 2 19 x 19 x 128 -> 9 x 9 x 512
94 route 93 90 87 84 -> 0 x 0 x 0
95 Layer before convolutional layer must output image.: File exists
darknet: ./src/utils.c:295: error: Assertion `0' failed.
Aborted (core dumped)
Any ideas?
@lbrito55 Oh yeah, set 640x640 or 576x576 or 512x512 in cfg-file.
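The three suggested sizes are all multiples of 64, while 608 is not. The following Python sketch is one reading of the log above, not something confirmed in the thread: it assumes six stride-2 stages, that the `max 2x 2/ 2` layers round an odd size up (19 -> 10), and that `reorg / 2` rounds it down (19 -> 9), so the route layer ends up with branches of different sizes. The helper name `first_mismatch` is hypothetical.

```python
import math


def first_mismatch(size: int, stages: int = 6):
    """Return (stage, pooled, reorged) at the first stage where the two
    downsampling paths disagree, or None if they agree at every stage."""
    n = size
    for stage in range(1, stages + 1):
        pooled = math.ceil(n / 2)  # assumed: max 2x2/2 rounds up, as in 19 -> 10
        reorged = n // 2           # assumed: reorg /2 rounds down, as in 19 -> 9
        if pooled != reorged:
            return stage, pooled, reorged
        n = pooled
    return None


for size in (608, 640, 576, 512):
    print(size, first_mismatch(size))
```

Under these assumptions 608 fails at the sixth stage (10 vs 9), which matches the `0 x 0 x 0` route at layer 94, while 640, 576 and 512 pass every stage.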
@AlexeyAB Using this configuration I'm at around 2k epochs and the loss value is fluctuating between 70 and 150... is this normal?
How is this new .cfg file better? I'm just curious.
Thanks
What mAP do you get with this cfg-file?
It spent the whole night training and tried to sort 30k boxes in this section of the code [non-max suppression]: if (nms) do_nms...
That's for a dataset with 2k images, so I couldn't get the mAP calculated. Am I doing something wrong?
Do you mean that the command ./darknet detector map ..
took the whole night for 2000 images in your validation dataset?
Yes!
Here's the cfg-file I'm using: mycfg.cfg.txt The model doesn't detect anything meaningful so far. I keep training, and it seems the loss is increasing
You should be able to detect something after 1000 iterations and get normal accuracy after 4000 iterations. Loss doesn't matter.
So you can try to train this model: yolo_v3_tiny_pan5_matrix_gaussian_GIoU_aa_ae_mixup.cfg.txt
or this one: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt
I was using weights from the ~500 iteration somehow. I tried again with weights from +2000 iterations and everything seems to work just fine at the moment. I'll leave it training more iterations to compare results. Sorry for that inconvenience, I'll let you know if anything. Thanks.
I was using weights from the ~500 iteration somehow.
https://github.com/AlexeyAB/darknet/issues/4325#issuecomment-556017979
And train at least 7000 iterations.
https://github.com/AlexeyAB/darknet#when-should-i-stop-training
Usually 2000 iterations per class (object) are sufficient, but not fewer than 4000 iterations in total.
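The rule of thumb above can be written as a one-line calculation. This is only a sketch of that rule as stated in this thread; the function name is made up for illustration, and the result is a lower bound, not a guarantee that training has converged.

```python
def min_iterations(classes: int) -> int:
    """2000 iterations per class, but never fewer than 4000 in total."""
    return max(2000 * classes, 4000)


print(min_iterations(6))  # 12000
print(min_iterations(1))  # 4000
```

For the 6-class dataset discussed here, that gives at least 12000 iterations.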
@lbrito55 Can I ask how you use TensorBoard with the darknet repo? I trained on AWS and the web loss chart doesn't give much information. Thanks in advance.
Hey @AlexeyAB
I've been training a model to run detection on images from CCTV cameras, using yolov3-spp.cfg.
At the moment I have around 10k images and 6 classes, and I'm running a k-fold-like training setup rather than the original 80/20 split, because of my dataset size.
I've achieved ~70% mAP with this current setup.
Any thoughts on how I can improve?
Thanks in advance
LB
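For readers wondering what a k-fold-like setup could look like in practice, here is a minimal Python sketch. It is not the poster's actual code: the function name, k=5, the fixed seed and the file names are all assumptions. Each fold's train/valid lists could then be written to the text files that the train= and valid= entries in obj.data point to.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, valid) lists; each item lands in valid exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so splits are reproducible
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, valid


# Hypothetical image paths, for illustration only.
images = [f"data/obj/img_{i}.jpg" for i in range(10)]
for fold, (train, valid) in enumerate(k_fold_splits(images, k=5)):
    print(fold, len(train), len(valid))
```

Each fold is trained and evaluated separately, so the reported mAP would be an average over the folds rather than a single 80/20 result.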