Open alexis-gruet-deel opened 4 years ago
datasets train/val are small or wrong
Hi, and thanks for answering. The dataset is ~15K images, with 3K for validation; is that not enough? Does "wrong" mean high bias, or something else?
Train with the -show_imgs flag,
i.e. ./darknet detector train ... -show_imgs
and look at the aug_...jpg images. Do you see correct ground-truth bounding boxes? Also check bad.list and bad_label.list, if they exist.
What are bad.list and bad_label.list used for? I didn't have these files when I was training.
Hi, over remote X there is no way to make -show_imgs work, because of "[xcb] Unknown request in queue while dequeuing"; see below:
[...]
158 conv 512 1 x 1/ 1 10 x 10 x1024 -> 10 x 10 x 512 0.105 BF
159 conv 1024 3 x 3/ 1 10 x 10 x 512 -> 10 x 10 x1024 0.944 BF
160 conv 24 1 x 1/ 1 10 x 10 x1024 -> 10 x 10 x 24 0.005 BF
161 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, cls_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 35.253
avg_outputs = 289965
Allocate additional workspace_size = 131.24 MB
Loading weights from backup/yolov4-hails_last.weights...
seen 64, trained: 742 K-images (11 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
Learning Rate: 0.001, Momentum: 0.949, Decay: 0.0005
If error occurs - run training with flag: -dont_show
Resizing, random_coef = 1.40
480 x 480
Create 6 permanent cpu-threads
[xcb] Unknown request in queue while dequeuing
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
darknet: ../../src/xcb_io.c:165: dequeue_pending_request: Assertion `!xcb_xlib_unknown_req_in_deq' failed.
Aborted (core dumped)
There are no files called bad.list or bad_label.list.
I can confirm that the aug_* images have the correct bounding boxes, as follows:
cfg file attached: yolov4-hailnet.txt
What are bad.list and bad_label.list used for? I didn't have these files when I was training.
I guess those files are created if a file in the train/val set is missing or has corrupted label(s); better to ask @AlexeyAB.
Yes. If you don't have these files - then all is ok.
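For anyone who wants to check their labels independently of darknet, here is a minimal sketch in Python. It is not darknet's own check and does not reproduce how bad.list or bad_label.list are written. It assumes the usual label format (one "class x_center y_center width height" line per box, coordinates normalized to the image size) and assumes each label file sits next to its image with a .txt extension. The names check_label_line and check_dataset are made up for this example.

```python
from pathlib import Path


def check_label_line(line, num_classes):
    """Return a short problem description for one label line, or None if it looks fine."""
    parts = line.split()
    if len(parts) != 5:
        return f"expected 5 fields, got {len(parts)}"
    try:
        class_id = int(parts[0])
        x, y, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return "malformed numeric field"
    if not 0 <= class_id < num_classes:
        return f"class id {class_id} outside [0, {num_classes})"
    if not (0.0 < x < 1.0 and 0.0 < y < 1.0):
        return "box centre outside the image"
    if not (0.0 < w <= 1.0 and 0.0 < h <= 1.0):
        return "box width/height outside (0, 1]"
    return None


def check_dataset(list_file, num_classes):
    """Scan every image path listed in list_file.

    Returns (missing, bad): images without a label file, and
    (label_path, line_number, problem) tuples for suspicious label lines.
    Assumption: the label for image.jpg is image.txt in the same directory.
    """
    missing, bad = [], []
    for image in Path(list_file).read_text().splitlines():
        image = image.strip()
        if not image:
            continue
        label = Path(image).with_suffix(".txt")
        if not label.is_file():
            missing.append(image)
            continue
        for number, line in enumerate(label.read_text().splitlines(), start=1):
            if not line.strip():
                continue
            problem = check_label_line(line, num_classes)
            if problem:
                bad.append((str(label), number, problem))
    return missing, bad
```

Calling check_dataset on the train and validation list files with the number of classes from the cfg should return two empty lists if nothing is missing or out of range.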
Your anchors/masks are wrong. Train with default anchors. https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
Only if you are an expert in neural detection networks - recalculate anchors for your dataset for width and height from cfg-file: darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416 then set the same 9 anchors in each of 3 [yolo]-layers in your cfg-file. But you should change indexes of anchors masks= for each [yolo]-layer, so for YOLOv4 the 1st-[yolo]-layer has anchors smaller than 30x30, 2nd smaller than 60x60, 3rd remaining, and vice versa for YOLOv3. Also you should change the filters=(classes + 5)* before each [yolo]-layer. If many of the calculated anchors do not fit under the appropriate layers - then just try using all the default anchors.
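To make the mask rule above concrete, here is a rough Python sketch that sorts recalculated anchors into the three [yolo] layers using the 30x30 and 60x60 thresholds. Several things here are assumptions and not taken from darknet itself: "smaller than 30x30" is read as both width and height being below 30, the multiplier in the filters formula is taken to be the number of anchors in that layer's mask, and the function names and example anchors are hypothetical.

```python
def assign_masks(anchors, small=30, medium=60, yolov3=False):
    """Group anchor indexes into three masks by size.

    anchors: list of (width, height) pairs in the order they appear in the cfg.
    Returns three index lists, ordered 1st, 2nd, 3rd [yolo] layer.
    Assumption: an anchor is "smaller than NxN" when both sides are below N.
    """
    masks = ([], [], [])
    for index, (w, h) in enumerate(anchors):
        if w < small and h < small:
            masks[0].append(index)
        elif w < medium and h < medium:
            masks[1].append(index)
        else:
            masks[2].append(index)
    # The layer order is reversed for YOLOv3 ("vice versa" in the advice above).
    return masks[::-1] if yolov3 else masks


def filters_for(mask, num_classes):
    """filters= for the conv layer before a [yolo] layer.

    Assumption: (classes + 5) multiplied by the number of anchors in the mask.
    """
    return (num_classes + 5) * len(mask)


# Hypothetical anchors, only to illustrate the grouping.
anchors = [(10, 12), (20, 25), (28, 22), (35, 50), (50, 40),
           (58, 59), (80, 70), (120, 150), (200, 180)]
for layer, mask in enumerate(assign_masks(anchors), start=1):
    print(f"[yolo] #{layer}: mask = {','.join(map(str, mask))}, "
          f"filters = {filters_for(mask, num_classes=3)}")
```

If the grouping comes out very uneven (for example one layer gets most of the anchors), that is the situation where the advice above says to fall back to the default anchors.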
I generated those anchors with the darknet command, so they were calculated from my dataset. Did you see something wrong with them?
you should change indexes of anchors masks= for each [yolo]-layer
Hello, I don't really know if this is expected.
Any advice is welcome.