I am trying to train Tiny YOLOv3 with an added GRU layer. However, the trained model produces no detections: the mAP is 0.00 %. Below are my modifications to the tiny-yolov3 config file:
`[net]
batch=64
subdivisions=64
width=416
height=416
channels=1
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 5000
policy=steps
steps=3000,4000
scales=.1,.1

# Layer 0
[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

# Layer 1
[maxpool]
size=2
stride=2

# Layer 2
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Layer 3
[maxpool]
size=2
stride=2

# Layer 4
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

# Layer 5
[maxpool]
size=2
stride=2

# Layer 6
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

# Layer 7
[maxpool]
size=2
stride=2

# Layer 8
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

# Layer 9
[maxpool]
size=2
stride=2

# Layer 10
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

# Layer 11
[maxpool]
size=2
stride=1

# Layer 12
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

# Layer 13 (1x1 CONVOLUTION)
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

# Layer 14 (GRU)
[gru]
batch_normalize=1
output = 256

[connected]
output=256
activation=linear

# Layer 15
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

# Layer 16
[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

# Layer 17 (YOLO)
[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

# Layer 18
[route]
layers = -4

# Layer 19
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

# Layer 20 (UPSAMPLE)
[upsample]
stride=2

# Layer 21 (ROUTE)
[route]
layers = -4

# Layer 22
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

# Layer 23
[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

# Layer 24
[yolo]
mask = 0,1,2
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0`
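As a cross-check on the relative `[route]` indices in this config, here is a minimal sketch of how they resolve to absolute layer numbers. It assumes that every bracketed section after `[net]` counts as one layer, numbered from 0, which is consistent with the 26 layers Darknet reports loading. The `sections` list and the `resolve_route` helper are illustrative names of my own, not part of Darknet.

```python
# Section types in the order they appear in the config, excluding [net].
# Assumption: each bracketed section is one layer, indexed from 0.
sections = (
    ["convolutional", "maxpool"] * 6          # layers 0-11
    + [
        "convolutional",  # 12
        "convolutional",  # 13 (1x1)
        "gru",            # 14
        "connected",      # 15
        "convolutional",  # 16
        "convolutional",  # 17
        "yolo",           # 18
        "route",          # 19
        "convolutional",  # 20
        "upsample",       # 21
        "route",          # 22
        "convolutional",  # 23
        "convolutional",  # 24
        "yolo",           # 25
    ]
)


def resolve_route(route_index: int, offset: int) -> int:
    """Turn a [route] `layers` value into an absolute layer index.

    Negative values are treated as relative to the route layer itself;
    non-negative values are taken as absolute indices.
    """
    return route_index + offset if offset < 0 else offset


# Both [route] sections in the config use `layers = -4`.
route_offsets = [-4, -4]
route_positions = [i for i, s in enumerate(sections) if s == "route"]
resolved = {
    pos: resolve_route(pos, off)
    for pos, off in zip(route_positions, route_offsets)
}

for pos, target in resolved.items():
    print(f"route at layer {pos} -> layer {target} ({sections[target]})")
```

Under these assumptions the two routes resolve to layers 15 and 18, the same targets shown in the `route` rows of the Darknet layer listing. Note that `[connected]` occupies its own index (15), so the hand-written `Layer N` labels after the GRU are one lower than Darknet's numbering.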
This is the output I get when I compute the mAP:
`CUDA-version: 11080 (12000), cuDNN: 8.9.6, CUDNN_HALF=1, GPU count: 1
CUDNN_HALF=1
OpenCV version: 4.5.4
0 : compute_capability = 700, cudnn_half = 1, GPU: Tesla V100-SXM2-16GB
net.optimized_memory = 0
mini_batch = 1, batch = 64, time_steps = 1, train = 0
layer   filters  size/strd(dil)      input                output
 0 Create CUDA-stream - 0
Create cudnn-handle 0
conv     16   3 x 3/ 1   416 x 416 x   1 -> 416 x 416 x  16  0.050 BF
 1 max             2x 2/ 2   416 x 416 x  16 -> 208 x 208 x  16  0.003 BF
 2 conv     32   3 x 3/ 1   208 x 208 x  16 -> 208 x 208 x  32  0.399 BF
 3 max             2x 2/ 2   208 x 208 x  32 -> 104 x 104 x  32  0.001 BF
 4 conv     64   3 x 3/ 1   104 x 104 x  32 -> 104 x 104 x  64  0.399 BF
 5 max             2x 2/ 2   104 x 104 x  64 ->  52 x  52 x  64  0.001 BF
 6 conv    128   3 x 3/ 1    52 x  52 x  64 ->  52 x  52 x 128  0.399 BF
 7 max             2x 2/ 2    52 x  52 x 128 ->  26 x  26 x 128  0.000 BF
 8 conv    256   3 x 3/ 1    26 x  26 x 128 ->  26 x  26 x 256  0.399 BF
 9 max             2x 2/ 2    26 x  26 x 256 ->  13 x  13 x 256  0.000 BF
10 conv    512   3 x 3/ 1    13 x  13 x 256 ->  13 x  13 x 512  0.399 BF
11 max             2x 2/ 1    13 x  13 x 512 ->  13 x  13 x 512  0.000 BF
12 conv   1024   3 x 3/ 1    13 x  13 x 512 ->  13 x  13 x1024  1.595 BF
13 conv    256   1 x 1/ 1    13 x  13 x1024 ->  13 x  13 x 256  0.089 BF
14 GRU Layer: 43264 inputs, 256 outputs
   connected 43264 -> 256
   connected 256 -> 256
   connected 43264 -> 256
   connected 256 -> 256
   connected 43264 -> 256
   connected 256 -> 256
15 connected 256 -> 256
16 conv    512   3 x 3/ 1     1 x   1 x 256 ->   1 x   1 x 512  0.002 BF
17 conv     18   1 x 1/ 1     1 x   1 x 512 ->   1 x   1 x  18  0.000 BF
18 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
19 route 15 -> 1 x 1 x 256
20 conv    128   1 x 1/ 1     1 x   1 x 256 ->   1 x   1 x 128  0.000 BF
21 upsample 2x 1 x 1 x 128 -> 2 x 2 x 128
22 route 18 -> 1 x 1 x 18
23 conv    256   3 x 3/ 1     1 x   1 x  18 ->   1 x   1 x 256  0.000 BF
24 conv     18   1 x 1/ 1     1 x   1 x 256 ->   1 x   1 x  18  0.000 BF
25 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 3.735
avg_outputs = 506806
Allocate additional workspace_size = 52.44 MB
Loading weights from /content/drive/MyDrive/Customv3/backup/GR-YoloV3_final.weights...
seen 64, trained: 320 K-images (5 Kilo-batches_64)
Done! Loaded 26 layers from weights-file

calculation mAP (mean average precision)...
Detection layer: 18 - type = 28
Detection layer: 25 - type = 28
392
detections_count = 0, unique_truth_count = 464
class_id = 0, name = Face, ap = 0.00% (TP = 0, FP = 0)

for conf_thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan
for conf_thresh = 0.25, TP = 0, FP = 0, FN = 464, average IoU = 0.00 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.000000, or 0.00 %
Total Detection Time: 100 Seconds

Set -points flag:
-points 101 for MS COCO
-points 11 for PascalVOC 2007 (uncomment difficult in voc.data)
-points 0 (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset`
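To make the `-nan` values in this output easier to read, here is a minimal sketch of how precision, recall and F1 follow from the TP/FP/FN counts, assuming the standard definitions. The `detection_metrics` function is a hypothetical helper and not Darknet's own code; it returns NaN wherever a ratio would be 0/0.

```python
import math


def detection_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, f1); a 0/0 ratio is reported as NaN."""

    def ratio(num: float, den: float) -> float:
        # Avoid ZeroDivisionError and mirror the NaN that 0.0/0.0 gives in C.
        return num / den if den else math.nan

    precision = ratio(tp, tp + fp)   # share of predicted boxes that are correct
    recall = ratio(tp, tp + fn)      # share of ground-truth boxes that were found
    f1 = ratio(2 * precision * recall, precision + recall)
    return precision, recall, f1


# Counts taken from the log: TP = 0, FP = 0, FN = 464.
precision, recall, f1 = detection_metrics(tp=0, fp=0, fn=464)
print(precision, recall, f1)
```

With TP = 0 and FP = 0 the precision is undefined (0/0), recall is zero, and F1 is undefined as well. This is consistent with `detections_count = 0` in the log: the network predicted no boxes at all above the threshold, rather than predicting wrong boxes.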