Closed sallaben closed 3 years ago
* show chart.png with Loss and mAP
* check your dataset - run training with flag `-show_imgs` i.e. `./darknet detector train ... -show_imgs` and look at the `aug_...jpg` images, do you see correct ground-truth bounding boxes?
* rename your cfg-file to txt-file and drag-n-drop (attach) to your message here
* show content of generated files `bad.list` and `bad_label.list` if they exist
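Alongside the visual check with `-show_imgs`, the label files themselves can be sanity-checked with a small script. This is only a sketch: it assumes the usual Darknet label layout (one `class x_center y_center width height` row per object, with coordinates normalised to the range 0 to 1), and `check_label_line` is a hypothetical helper name, not part of Darknet.

```python
def check_label_line(line, num_classes):
    """Return a list of problems found in one YOLO-format label line.

    Assumed format: "<class_id> <x_center> <y_center> <width> <height>",
    with the four box values normalised to [0, 1].
    """
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        x, y, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class id {class_id} out of range")
    for name, value in (("x", x), ("y", y), ("w", w), ("h", h)):
        # This comparison also rejects NaN, since NaN fails every comparison.
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} outside [0, 1]")
    if w <= 0 or h <= 0:
        problems.append("box has zero or negative size")
    return problems


if __name__ == "__main__":
    for sample in ("0 0.5 0.5 0.2 0.2", "1 0.5 0.5 0.2 0.2", "0 0.5 1.4 0.2 0.2"):
        print(repr(sample), "->", check_label_line(sample, num_classes=1))
```

Running it over every `.txt` label file in the training list would flag rows that could plausibly contribute to a NaN loss, though a clean result does not rule out other causes.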
@AlexeyAB
Thank you so much for the quick response! I love this model and what I have seen it accomplish so far. I just hope to get it working myself.
1: I can't run it long enough to get a good-looking chart.png with avg loss. It eventually gives nan:
Total BFLOPS 8.394
avg_outputs = 370590
Loading weights from /Users/sallaben/Downloads/yolov4-tiny.conv.29...
seen 64, trained: 0 K-images (0 Kilo-batches_64)
Done! Loaded 29 layers from weights-file
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
If error occurs - run training with flag: -dont_show
Create 6 permanent cpu-threads
Loaded: 0.761473 seconds
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.509568, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 179.029510, iou_loss = 0.000000, total_loss = 179.029510
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.491192, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 649.666992, iou_loss = 0.000000, total_loss = 649.666992
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.510905, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 178.849594, iou_loss = 0.000000, total_loss = 178.849594
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.266024, GIOU: 0.131675), Class: 0.449302, Obj: 0.449228, No Obj: 0.490926, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 650.961304, iou_loss = 1.972961, total_loss = 652.934265
total_bbox = 2, rewritten_bbox = 0.000000 %
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.512936, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 179.455246, iou_loss = 0.000000, total_loss = 179.455246
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.199529, GIOU: -0.183834), Class: 0.436063, Obj: 0.208874, No Obj: 0.490420, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 647.242554, iou_loss = 0.068542, total_loss = 647.311096
total_bbox = 3, rewritten_bbox = 0.000000 %
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.312126, GIOU: 0.311749), Class: 0.391792, Obj: 0.614028, No Obj: 0.510468, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 178.836639, iou_loss = 0.074936, total_loss = 178.911575
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.405204, GIOU: 0.305772), Class: 0.231199, Obj: 0.441184, No Obj: 0.491386, .5R: 0.500000, .75R: 0.000000, count: 2, class_loss = 648.380981, iou_loss = 0.245911, total_loss = 648.626892
total_bbox = 6, rewritten_bbox = 0.000000 %
and eventually:
Loaded: 0.000104 seconds
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.426217, GIOU: 0.341448), Class: 0.405919, Obj: 0.000000, No Obj: 0.000000, .5R: 0.500000, .75R: 0.000000, count: 2, class_loss = 1.386838, iou_loss = 0.781312, total_loss = 2.168149
total_bbox = 71, rewritten_bbox = 0.000000 %
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.525310, GIOU: 0.475973), Class: 0.365586, Obj: 0.000000, No Obj: 0.000000, .5R: 0.500000, .75R: 0.000000, count: 2, class_loss = 1.402588, iou_loss = 1.404916, total_loss = 2.807504
total_bbox = 73, rewritten_bbox = 0.000000 %
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.300243, GIOU: 0.208052), Class: 0.540854, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.605407, iou_loss = 0.279533, total_loss = 0.884941
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.167717, GIOU: 0.167717), Class: 0.378715, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.692998, iou_loss = 0.020400, total_loss = 0.713398
total_bbox = 75, rewritten_bbox = 0.000000 %
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 30 Avg (IOU: 0.794809, GIOU: 0.794716), Class: 0.497840, Obj: 0.000000, No Obj: 0.000000, .5R: 1.000000, .75R: 1.000000, count: 1, class_loss = 0.626082, iou_loss = 0.540426, total_loss = 1.166508
v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 37 Avg (IOU: 0.429721, GIOU: 0.421818), Class: 0.525618, Obj: 0.000000, No Obj: 0.000000, .5R: 0.500000, .75R: 0.000000, count: 2, class_loss = 1.231544, iou_loss = 1.166338, total_loss = 2.397882
total_bbox = 78, rewritten_bbox = 0.000000 %
Eventually it ends up with "nan" average but I am able to get a saved weights file before then.
2: Yes the boxes appear correct in the images that should have them.
3: Attached! yolov4-tiny-map.txt
4: Those files do not exist.
show chart.png with Loss and mAP
learning_rate=0.001 burn_in=15 max_batches = 50 policy=steps steps=25 scales=.1,.1
https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
change line max_batches to (classes*2000, but not less than the number of training images and not less than 6000), e.g. max_batches=6000 if you train for 3 classes
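The max_batches rule quoted above can be written out as a short calculation. This is a sketch only: `suggest_max_batches` and `suggest_steps` are hypothetical helper names, and the 80% / 90% choice for `steps` is an assumption taken from the linked README rather than from this thread.

```python
def suggest_max_batches(num_classes, num_train_images):
    """classes*2000, but not less than the number of training images
    and not less than 6000."""
    return max(num_classes * 2000, num_train_images, 6000)


def suggest_steps(max_batches):
    """Assumed convention: step the learning rate at 80% and 90% of max_batches."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (max_batches * 8 // 10, max_batches * 9 // 10)


if __name__ == "__main__":
    mb = suggest_max_batches(num_classes=1, num_train_images=100)
    print("max_batches =", mb)
    print("steps =", ",".join(str(s) for s in suggest_steps(mb)))
```

For a single class and a small dataset the 6000 floor dominates, so `max_batches = 50` is far below what the rule suggests.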
@AlexeyAB
I tried this - and it gives me NaN by the 15th iteration. (And I cannot generate a chart.png for that reason). The average loss by that iteration is quite low so maybe it just doesn't need that many iterations?
because you must use burn_in=1000
learning_rate=0.00261 burn_in=1000 max_batches = 6000 policy=steps steps=4000,5000 scales=.1,.1
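To see what these settings do to the learning rate over time, here is a rough sketch of the schedule. It assumes Darknet's warm-up scales the rate by `(iteration / burn_in) ** power` with a default `power` of 4, and that each entry in `scales` is applied once its step is reached; `learning_rate_at` is a hypothetical name, and the formula should be checked against the Darknet source before relying on it.

```python
def learning_rate_at(iteration, base_lr=0.00261, burn_in=1000,
                     steps=(4000, 5000), scales=(0.1, 0.1), power=4.0):
    """Approximate learning rate at a given iteration (assumed formula)."""
    if iteration < burn_in:
        # Warm-up: the rate ramps up from ~0 to base_lr over burn_in iterations.
        return base_lr * (iteration / burn_in) ** power
    rate = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            rate *= scale
    return rate


if __name__ == "__main__":
    for it in (13, 500, 1000, 4000, 5000):
        print(it, learning_rate_at(it))
```

Under these assumptions the rate during the first few dozen iterations is tiny with `burn_in=1000`, whereas with `burn_in=15` the full rate is reached by iteration 15.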
@AlexeyAB
Thank you for the example config setting. In my case, I ran train with those settings, and it still gave me avg loss=NaN by iteration 13. Are there other reasons that might cause this issue?
@AlexeyAB
I've uploaded my code here if this helps understand my problem: https://github.com/sallaben/yolors
Compile Darknet with GPU
@AlexeyAB I compiled with GPU and I've gotten further now. I have a question: what is the "C:0.0%" label on chart.png? Mine never increases from 0.0%.
Also, any prediction I make results in a small black image.
@AlexeyAB v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000 total_bbox = 16407, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 3, class_loss = 3.000000, iou_loss = 0.000000, total_loss = 3.000000 total_bbox = 16410, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 
0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000 total_bbox = 16412, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000 total_bbox = 16414, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000 total_bbox 
= 16418, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1.000000, iou_loss = 0.000000, total_loss = 1.000000 total_bbox = 16419, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = -nan, iou_loss = -nan, total_loss = -nan v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = -nan, iou_loss = -nan, total_loss = -nan v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = -nan, iou_loss = -nan, total_loss = -nan total_bbox = 16419, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, 
.5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1.000000, iou_loss = 0.000000, total_loss = 1.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 5, class_loss = 5.000000, iou_loss = 0.000000, total_loss = 5.000000 total_bbox = 16425, rewritten_bbox = 0.000000 % v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 139 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 150 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 v3 (iou loss, Normalizer: (iou: 0.07, cls: 1.00) Region 161 Avg (IOU: 0.000000, GIOU: 0.000000), Class: nan, Obj: 0.000000, No Obj: 0.000000, .5R: 0.000000, .75R: 0.000000, count: 3, class_loss = 3.000000, iou_loss = 0.000000, total_loss = 3.000000 total_bbox = 16428, rewritten_bbox = 0.000000 %
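When the console output is this long, it can help to locate the first point at which a loss value turns into `nan`. The following is a minimal sketch that assumes the training output has been captured with one log entry per line and that loss fields look like `total_loss = 2.000000`; `first_nan_loss` is a hypothetical helper, not a Darknet tool.

```python
import re

# Matches fields such as "class_loss = 0.000000" or "total_loss = -nan".
LOSS_FIELD = re.compile(r"(\w+_loss) = ([^,\s]+)")


def first_nan_loss(lines):
    """Return (line_index, [field names]) for the first line whose loss
    fields contain nan, or None if no such line exists."""
    for index, line in enumerate(lines):
        bad = [name for name, value in LOSS_FIELD.findall(line)
               if "nan" in value.lower()]
        if bad:
            return index, bad
    return None


if __name__ == "__main__":
    sample = [
        "Region 161 Avg (...), count: 2, class_loss = 2.000000, iou_loss = 0.000000, total_loss = 2.000000",
        "total_bbox = 16419, rewritten_bbox = 0.000000 %",
        "Region 139 Avg (...), count: 1, class_loss = -nan, iou_loss = -nan, total_loss = -nan",
    ]
    print(first_nan_loss(sample))
```

Note that this only inspects the `*_loss` fields; entries such as `Class: nan` appear earlier in the log and would need a separate pattern.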
I have checked my dataset by running darknet with -show_imgs, and the results are here
and chart.png is here
So, I have just one class of object that I'm trying to recognize.
I know this is a low number of images, but I thought it would be OK since the examples are all very similar images, and the object is almost always in the exact same location (top right corner). Let me know if this is an incorrect assumption.
Anyway, I am able to train on these images until average loss is in the vicinity of 0.25-0.75.
Command:
./darknet detector train ./X.data ./X.cfg ./yolov4-tiny.conv.29
I take my generated .weights and run this command to test against a variety of test images:
./darknet detector test ./X.data ./X.cfg ./X.weights ./test.jpg
But I don't see any objects identified on the UI that pops up. What gives? Do I need to tweak my config settings?
Here are my most relevant .cfg sections: