AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Why does inference with the model trained on my dataset return only the top-left corner of each rectangle, with width and height both 0, coordinates close to the rectangle center, and the correct classification? #8799

Closed YFforever2022 closed 3 months ago

YFforever2022 commented 1 year ago

Why does inference with the model trained on my dataset return only the top-left corner coordinates of each rectangle? The width and height are both 0, the returned coordinates are close to the center of the true rectangle, and the classification is correct.

[net]
# Testing
batch=1
subdivisions=1
# Training
batch=64
subdivisions=4
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation=1.5
exposure=1.5
hue=.1

learning_rate=0.00261
burn_in=1000

max_batches=20000
policy=steps
steps=1600000,1800000
scales=.1,.1

weights_reject_freq=1001

ema_alpha=0.9998

equidistant_point=1000

num_sigmas_reject_badlabels=3

badlabels_rejection_percentage=0.2
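
One detail worth flagging in this [net] block: with policy=steps, the .1 scale factors are applied to the learning rate only when the iteration count reaches the values listed in steps, and steps=1600000,1800000 is far beyond max_batches=20000, so these scheduled drops can never happen in this run (the darknet docs suggest steps at roughly 80% and 90% of max_batches, i.e. 16000,18000 here). A minimal sketch of that sanity check, in plain Python and not part of darknet; the cfg file name is the one used in this issue:

# Warn when the learning-rate schedule steps of a darknet cfg can never
# be reached before training stops at max_batches.
def read_net_options(path):
    opts = {}
    in_net = False
    for raw in open(path):
        line = raw.split("#", 1)[0].strip()      # drop cfg comments
        if line.startswith("["):
            in_net = (line == "[net]")
            continue
        if in_net and "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()    # duplicate keys: last one wins
    return opts

opts = read_net_options("yolov4-tiny.cfg")
max_batches = int(opts.get("max_batches", "0"))
steps = [int(s) for s in opts.get("steps", "").split(",") if s]
for s in steps:
    if s >= max_batches:
        print(f"warning: step {s} >= max_batches {max_batches}; "
              "this learning-rate drop will never be applied")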

[convolutional] batch_normalize=1 filters=32 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[route] layers=-1 groups=2 group_id=1

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[route] layers = -1,-2

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[route] layers = -6,-1

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[route] layers=-1 groups=2 group_id=1

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[route] layers = -1,-2

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[route] layers = -6,-1

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[route] layers=-1 groups=2 group_id=1

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[route] layers = -1,-2

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[route] layers = -6,-1

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

##################################

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=663 activation=linear

[yolo] mask=3,4,5 anchors=42, 33, 42, 33, 42, 33, 42, 33, 42, 33, 42, 33 classes=216 num=6 jitter=.3 scale_x_y=1.05 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou ignore_thresh=.7 truth_thresh=1 random=0 resize=1.5 nms_kind=greedynms beta_nms=0.6

new_coords=1

scale_x_y=2.0

[route] layers = -4

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[upsample] stride=2

[route] layers = -1, 23

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=663 activation=linear

[yolo] mask=0,1,2 anchors=42, 33, 42, 33, 42, 33, 42, 33, 42, 33, 42, 33 classes=216 num=6 jitter=.3 scale_x_y=1.05 cls_normalizer=1.0 iou_normalizer=0.07 iou_loss=ciou ignore_thresh=.7 truth_thresh=1 random=0 resize=1.5 nms_kind=greedynms beta_nms=0.6

new_coords=1

scale_x_y=2.0
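
The two [yolo] heads above mix the stock yolov4-tiny settings (scale_x_y=1.05, activation=linear on the preceding 1x1 convolution) with the Scaled-YOLOv4-style options new_coords=1 and scale_x_y=2.0. For orientation only, here is a simplified sketch of how the width/height decoding differs between the classic and the new_coords formulations, following the YOLOv3 and Scaled-YOLOv4 papers; it is not a line-for-line copy of darknet's yolo_layer code, and the function names are illustrative:

import math

def decode_wh_classic(t_w, t_h, anchor_w, anchor_h):
    # YOLOv3-style decoding: raw outputs pass through exp(), so the
    # predicted width/height can never collapse to exactly 0.
    return anchor_w * math.exp(t_w), anchor_h * math.exp(t_h)

def decode_wh_new_coords(s_w, s_h, anchor_w, anchor_h):
    # Scaled-YOLOv4-style decoding (new_coords=1): the head is expected to
    # emit values already squashed to [0, 1] (typically via a logistic
    # activation), and width/height become (2*s)^2 * anchor. Outputs near 0
    # therefore give boxes with width and height near 0.
    return anchor_w * (2 * s_w) ** 2, anchor_h * (2 * s_h) ** 2

# Illustrative values only, using the 42x33 anchors from the cfg above.
print(decode_wh_classic(0.0, 0.0, 42, 33))     # (42.0, 33.0)
print(decode_wh_new_coords(0.0, 0.0, 42, 33))  # (0.0, 0.0)

Under the squared decoding, width/height outputs stuck at 0 yield zero-sized boxes whose (x, y) is the predicted center, which is at least consistent with the symptom reported in this issue (correct class, coordinates near the box center, width and height of 0).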

YFforever2022 commented 1 year ago

darknet.exe detector train data/voc.data yolov4-tiny.cfg yolov4-tiny.conv.29 -gpus 0 -map

YFforever2022 commented 1 year ago

darknet.exe detector calc_anchors data/voc.data -num_of_clusters 6 -width 416 -height 416

YFforever2022 commented 1 year ago

CUDA-version: 11030 (12000), cuDNN: 8.4.1, GPU count: 1
OpenCV version: 4.6.0
0
Prepare additional network for mAP calculation...
0 : compute_capability = 860, cudnn_half = 0, GPU: NVIDIA GeForce RTX 3060
net.optimized_memory = 0
mini_batch = 1, batch = 4, time_steps = 1, train = 0

YFforever2022 commented 1 year ago

There are 216 categories. The early part of training looks normal, but then the avg loss suddenly rises to 9 and never comes back down. I have tried this many times with the same result, and the calculated mAP is 0% or -nan%.

YFforever2022 commented 1 year ago

darknet detector test data/voc.data yolov4-tiny.cfg backup/yolov4-tiny_best.weights -i 0 -thresh 0.2 -ext_output 1.bmp

0 : compute_capability = 860, cudnn_half = 0, GPU: NVIDIA GeForce RTX 3060
net.optimized_memory = 0
mini_batch = 1, batch = 1, time_steps = 1, train = 0
   layer   filters  size/strd(dil)      input                output
   0 Create CUDA-stream - 0
     Create cudnn-handle 0
     conv     32       3 x 3/ 2    416 x 416 x   3 ->  208 x 208 x  32 0.075 BF
   1 conv     64       3 x 3/ 2    208 x 208 x  32 ->  104 x 104 x  64 0.399 BF
   2 conv     64       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x  64 0.797 BF
   3 route  2                               1/2 ->  104 x 104 x  32
   4 conv     32       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  32 0.199 BF
   5 conv     32       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  32 0.199 BF
   6 route  5 4                                 ->  104 x 104 x  64
   7 conv     64       1 x 1/ 1    104 x 104 x  64 ->  104 x 104 x  64 0.089 BF
   8 route  2 7                                 ->  104 x 104 x 128
   9 max              2x 2/ 2      104 x 104 x 128 ->   52 x  52 x 128 0.001 BF
  10 conv    128       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 128 0.797 BF
  11 route  10                              1/2 ->   52 x  52 x  64
  12 conv     64       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x  64 0.199 BF
  13 conv     64       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x  64 0.199 BF
  14 route  13 12                               ->   52 x  52 x 128
  15 conv    128       1 x 1/ 1     52 x  52 x 128 ->   52 x  52 x 128 0.089 BF
  16 route  10 15                               ->   52 x  52 x 256
  17 max              2x 2/ 2       52 x  52 x 256 ->   26 x  26 x 256 0.001 BF
  18 conv    256       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 256 0.797 BF
  19 route  18                              1/2 ->   26 x  26 x 128
  20 conv    128       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 128 0.199 BF
  21 conv    128       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 128 0.199 BF
  22 route  21 20                               ->   26 x  26 x 256
  23 conv    256       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 256 0.089 BF
  24 route  18 23                               ->   26 x  26 x 512
  25 max              2x 2/ 2       26 x  26 x 512 ->   13 x  13 x 512 0.000 BF
  26 conv    512       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.797 BF
  27 conv    256       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 256 0.044 BF
  28 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  29 conv    663       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 663 0.115 BF
  30 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
  31 route  27                                  ->   13 x  13 x 256
  32 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  33 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  34 route  33 23                               ->   26 x  26 x 384
  35 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  36 conv    663       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 663 0.229 BF
  37 yolo
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 7.122
avg_outputs = 328349
Allocate additional workspace_size = 13.80 MB
Loading weights from D:/darknetv4/backup/yolov4-tiny_best.weights...
seen 64, trained: 96 K-images (1 Kilo-batches_64)
Done! Loaded 38 layers from weights-file
Detection layer: 30 - type = 28
Detection layer: 37 - type = 28
1.BMP: Predicted in 2806.450000 milli-seconds.
a: 50% (left_x: 84 top_y: 287 width: 0 height: 0)
b: 93% (left_x: 85 top_y: 408 width: 0 height: 0)
c: 81% (left_x: 87 top_y: 225 width: 0 height: 0)
a: 90% (left_x: 87 top_y: 289 width: 0 height: 0)
d: 91% (left_x: 88 top_y: 348 width: 0 height: 0)
e: 59% (left_x: 146 top_y: 408 width: 0 height: 0)
f: 88% (left_x: 146 top_y: 226 width: 0 height: 0)
g: 93% (left_x: 147 top_y: 289 width: 0 height: 0)
h: 91% (left_x: 148 top_y: 350 width: 0 height: 0)
i: 35% (left_x: 149 top_y: 167 width: 0 height: 0)
g: 44% (left_x: 149 top_y: 288 width: 0 height: 0)
e: 91% (left_x: 205 top_y: 408 width: 0 height: 0)
j: 49% (left_x: 207 top_y: 288 width: 0 height: 1)
j: 63% (left_x: 207 top_y: 289 width: 0 height: 0)
k: 90% (left_x: 208 top_y: 348 width: 0 height: 0)
l: 69% (left_x: 208 top_y: 167 width: 0 height: 0)
c: 24% (left_x: 208 top_y: 167 width: 0 height: 0)
m: 72% (left_x: 208 top_y: 227 width: 0 height: 0)
e: 85% (left_x: 265 top_y: 407 width: 0 height: 0)
n: 73% (left_x: 266 top_y: 165 width: 0 height: 0)
o: 81% (left_x: 266 top_y: 227 width: 0 height: 0)
p: 49% (left_x: 267 top_y: 288 width: 0 height: 0)
p: 55% (left_x: 267 top_y: 288 width: 0 height: 0)
Failed to calloc 17179869184.0 GiB
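
A note on the final line of that output: 17179869184 GiB is exactly 2^34 GiB, which is 2^64 bytes, the wrap-around point of an unsigned 64-bit size. An allocation request of that size points to an integer overflow or underflow in a size computation (plausibly related to the zero-sized boxes) rather than a genuine memory request. The arithmetic, in plain Python:

# "Failed to calloc 17179869184.0 GiB" expressed in bytes is exactly 2**64,
# the wrap-around point of an unsigned 64-bit size_t.
gib = 17179869184
print(gib == 2**34)            # True
print(gib * 2**30 == 2**64)    # True: 2**34 GiB * 2**30 bytes/GiB = 2**64 bytes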

YFforever2022 commented 1 year ago

Win10

Sat May 20 14:16:12 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.02    Driver Version: 528.02    CUDA Version: 12.0           |
|-------------------------------+----------------------+----------------------+
| GPU  Name         TCC/WDDM    | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  WDDM | 00000000:03:00.0  On |                  N/A |
| 82%   83C    P2   149W / 170W |  10881MiB / 12288MiB |     95%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|

YFforever2022 commented 1 year ago

I have checked my dataset labels and everything looks normal; using the -show_imgs flag I can see correctly labeled images.

YFforever2022 commented 1 year ago

Whether I use the default anchors or anchors calculated from my dataset, the detection model still cannot be trained successfully in the end.

CUDA-version: 11030 (12000), cuDNN: 8.4.1, GPU count: 1 OpenCV version: 4.6.0

num_of_clusters = 6, width = 416, height = 416
read labels from 1160 images
loaded image: 1160 box: 23991
all loaded.

calculating k-means++ ...

iterations = 0

counters_per_class = 212, 219, 182, 263, 135, 113, 122, 124, 104, 161, 165, 160, 142, 163, 164, 140, 128, 133, 277, 228, 130, 169, 135, 131, 134, 129, 120, 153, 131, 134, 329, 201, 176, 165, 169, 156, 156, 141, 134, 153, 133, 124, 158, 139, 170, 140, 134, 132, 141, 122, 112, 141, 144, 140, 376, 178, 168, 170, 179, 171, 161, 162, 173, 147, 128, 131, 67, 84, 69, 150, 143, 146, 70, 68, 78, 158, 136, 142, 125, 207, 155, 141, 192, 147, 26, 24, 6, 25, 33, 7, 39, 1, 13, 18, 3, 4, 15, 17, 9, 25, 11, 17, 6, 16, 4, 45, 21, 5, 8, 3, 6, 4, 14, 3, 28, 29, 23, 31, 13, 39, 50, 12, 12, 10, 19, 9, 12, 16, 22, 4, 119, 112, 115, 143, 166, 222, 135, 120, 149, 129, 136, 135, 284, 211, 272, 139, 136, 135, 161, 124, 177, 135, 128, 135, 56, 29, 15, 53, 12, 134, 150, 85, 22, 93, 153, 172, 8, 94, 4, 5, 420, 9, 14, 92, 9, 237, 6, 194, 18, 96, 3, 58, 5, 11, 19, 135, 16, 74, 174, 38, 9, 35, 360, 193, 212, 173, 162, 170, 152, 137, 160, 153, 127, 134, 149, 138, 201, 161, 132, 130, 146, 119, 138, 150, 138, 191

avg IoU = 100.00 %

Saving anchors to the file: anchors.txt
anchors = 42, 33, 42, 33, 42, 33, 42, 33, 42, 33, 42, 33
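
These anchors, and the 100.00 % avg IoU, follow directly from the labels described in a later comment below: every box is 40x40 px in a 400x500 px image, so every normalized box is 0.1 x 0.08, k-means has only one box size to cluster, and calc_anchors reports that single size at the 416x416 network resolution given on the command line. A quick check of the numbers in plain Python:

# All boxes are 40x40 px in 400x500 px images, i.e. 0.1 x 0.08 normalized;
# calc_anchors scales that to the requested network resolution.
net_w, net_h = 416, 416
w_rel, h_rel = 40 / 400, 40 / 500         # 0.1, 0.08
print(w_rel * net_w, h_rel * net_h)       # 41.6 33.28 -> rounded to "42, 33"

With only one box size in the dataset, all six clusters land on the same anchor and the best-anchor IoU is trivially 100%, so this output is expected.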

YFforever2022 commented 1 year ago

I have tried many times. One of the models, after 8000 iterations, detects very well, but it only returns left_x and top_y; width and height are both 0. I don't know where the problem is.

YFforever2022 commented 1 year ago

Here is the content of one of the annotation files in the dataset. Each object box is 40x40 pixels, and each image is 400x500 pixels:

133 0.211667 0.349333 0.1 0.08
133 0.361667 0.349333 0.1 0.08
151 0.511667 0.349333 0.1 0.08
151 0.661667 0.349333 0.1 0.08
151 0.211667 0.469333 0.1 0.08
151 0.361667 0.469333 0.1 0.08
151 0.511667 0.469333 0.1 0.08
136 0.661667 0.469333 0.1 0.08
136 0.211667 0.589333 0.1 0.08
136 0.361667 0.589333 0.1 0.08
136 0.511667 0.589333 0.1 0.08
136 0.661667 0.589333 0.1 0.08
145 0.211667 0.712 0.1 0.08
145 0.361667 0.712 0.1 0.08
145 0.511667 0.712 0.1 0.08
145 0.661667 0.712 0.1 0.08
145 0.211667 0.832 0.1 0.08
139 0.361667 0.832 0.1 0.08
139 0.511667 0.832 0.1 0.08
139 0.661667 0.832 0.1 0.08
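
The format above is the standard darknet/YOLO label format: class x_center y_center width height, all normalized to [0, 1] relative to the image. A small, generic sanity check over such files can rule out degenerate labels (zero width/height or out-of-range values); this is not a darknet tool, and the glob path is illustrative:

import glob

# Flag label lines with the wrong field count, class ids or centers out of
# range, or non-positive / out-of-range width and height.
for path in glob.glob("data/obj/*.txt"):          # illustrative path
    for lineno, line in enumerate(open(path), start=1):
        parts = line.split()
        if len(parts) != 5:
            print(f"{path}:{lineno}: expected 5 fields, got {len(parts)}")
            continue
        cls = int(parts[0])
        if not (0 <= cls < 216):                  # classes=216 in the cfg above
            print(f"{path}:{lineno}: class id out of range: {cls}")
        x, y, w, h = map(float, parts[1:])
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            print(f"{path}:{lineno}: center out of range: {x}, {y}")
        if w <= 0.0 or w > 1.0 or h <= 0.0 or h > 1.0:
            print(f"{path}:{lineno}: suspicious width/height: {w}, {h}")

For the annotation shown above, every line passes this kind of check, which matches the -show_imgs observation that the labels themselves look fine.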

YFforever2022 commented 1 year ago

I still haven't figured out why the avg loss sometimes jumps from a low value to a much higher one and then never decreases (sometimes it does not jump at all), but the mAP is completely stable: it is always 0.

YFforever2022 commented 1 year ago

for conf_thresh = 0.25, precision = -nan(ind), recall = 0.00, F1-score = -nan(ind)
for conf_thresh = 0.25, TP = 0, FP = 0, FN = 23991, average IoU = 0.00 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = -nan, or -nan %
Total Detection Time: 21 Seconds

Set -points flag:
-points 101 for MS COCO
-points 11 for PascalVOC 2007 (uncomment difficult in voc.data)
-points 0 (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset

mean_average_precision (mAP@0.50) = -nan
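
The -nan(ind) values follow directly from TP = FP = 0: precision is TP / (TP + FP) = 0 / 0, which is an IEEE NaN that MSVC's printf renders as -nan(ind), while recall is 0 / (0 + 23991) = 0. A short illustration in plain Python (which, unlike C, raises on 0.0 / 0.0, hence the guard):

tp, fp, fn = 0, 0, 23991
precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")  # 0/0 -> NaN (-nan(ind) in the log)
recall = tp / (tp + fn)
print(precision, recall)   # nan 0.0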

YFforever2022 commented 1 year ago

In the attached picture, the right side shows a model that trains normally and predicts normally; its IoU during training is never 0. The left side shows the abnormal run, where the training IoU is 0.

YFforever2022 commented 1 year ago

It can be seen that the avg IoU is already 0 at the very beginning of training.

YFforever2022 commented 1 year ago

I found that Region 30 Avg (IOU: 0.000000) is often 0 in the training log (Region 30 corresponds to layer 30, the 13x13 [yolo] head in the network printout above).