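I tried running 300 epochs four or five times, but every attempt died midway for one reason or another: the browser window sat inactive for too long (this needs a JS snippet that clicks automatically; see "How to prevent Google Colab from disconnecting?"), the computer went to sleep on its own (an annoying one: auto-sleep has to be turned off, and it gets worse when the room loses power while I am out, so for the final 100-epoch run I ended up leaving the laptop in my roommate's room to finish), or the GPU usage limit was reached (the last time this happened I had to switch accounts to keep training).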
Results
Time: 100 epochs completed in 3.416 hours.
Training results:
Evaluation results:
Namespace(augment=False, batch_size=16, cfg='cfg/yolov3-spp-P30.cfg', conf_thres=0.001, data='../P30/P30.data', device='', img_size=416, iou_thres=0.6, save_json=False, single_cls=False, task='test', weights='weights/last.pt')
Using CUDA device0 _CudaDeviceProperties(name='Tesla P4', total_memory=7611MB)
Model Summary: 225 layers, 6.25787e+07 parameters, 6.25787e+07 gradients
Fusing layers...
Model Summary: 152 layers, 6.25519e+07 parameters, 6.25519e+07 gradients
Caching labels (75 found, 0 missing, 0 empty, 0 duplicate, for 75 images): 100% 75/75 [00:00<00:00, 930.21it/s]
Class Images Targets P R mAP@0.5 F1: 100% 5/5 [00:05<00:00, 1.03s/it]
all 75 109 0.691 0.858 0.807 0.765
right_hand 75 50 0.703 0.851 0.799 0.77
left_hand 75 59 0.679 0.864 0.815 0.761
Speed: 12.0/1.7/13.7 ms inference/NMS/total per 416x416 image at batch-size 16
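The numbers in the log above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming the F1 column is the usual harmonic mean 2PR/(P+R); the helper name `f1_score` is mine and is not taken from the training code. The values are copied from the log, which rounds to three decimals, so agreement is only expected to within rounding.

```python
# (precision, recall, reported F1) copied from the evaluation log above.
rows = {
    "all": (0.691, 0.858, 0.765),
    "right_hand": (0.703, 0.851, 0.770),
    "left_hand": (0.679, 0.864, 0.761),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of the F1 column)."""
    return 2 * precision * recall / (precision + recall)


for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, log reports {reported}")

# Speed line: inference + NMS should add up to the reported total per image.
inference_ms, nms_ms, reported_total_ms = 12.0, 1.7, 13.7
total_ms = inference_ms + nms_ms
fps = 1000 / reported_total_ms  # images per second at batch size 16
print(f"total = {total_ms:.1f} ms per image, about {fps:.0f} images/s")
```

The "all" row also matches the plain average of the two class rows for P, R, and mAP@0.5, and the speed line works out to roughly 73 images per second on the Tesla P4 at batch size 16.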
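Besides preparing the .data, train.txt, validation.txt, and .names files, training YOLOv3 also requires editing the .cfg configuration file: set classes in every YOLO layer, and filters in the convolutional layer immediately before each YOLO layer, to match our hand dataset, i.e. 2 classes and 21 filters. 21 = (5 + 2) * 3: each of the 3 anchor boxes per YOLO layer predicts 5 fixed bbox attributes (objectness score, bx, by, bw, bh) plus 2 class scores.
Experiment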
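Before launching a long run, it is worth double-checking the filters arithmetic against the edited .cfg. The sketch below is not part of the training repo: `yolo_filters`, `parse_cfg`, and `check_yolo_heads` are hypothetical helpers, and the cfg fragment is a made-up example in darknet style. It assumes that the convolutional section directly before each [yolo] section must have filters = (5 + classes) * number of entries in mask.

```python
def yolo_filters(num_classes, anchors_per_scale=3, box_attrs=5):
    """Output channels of the convolution feeding a YOLO layer."""
    return (box_attrs + num_classes) * anchors_per_scale


def parse_cfg(text):
    """Split darknet-style cfg text into a list of (section_name, options) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_yolo_heads(sections):
    """Return indices of [yolo] sections whose preceding conv has the wrong filters."""
    bad = []
    for i, (name, opts) in enumerate(sections):
        if name != "yolo":
            continue
        expected = yolo_filters(int(opts["classes"]), len(opts["mask"].split(",")))
        prev_name, prev_opts = sections[i - 1] if i > 0 else ("", {})
        if prev_name != "convolutional" or int(prev_opts.get("filters", -1)) != expected:
            bad.append(i)
    return bad


# Made-up fragment for illustration only, not copied from yolov3-spp-P30.cfg.
SAMPLE_CFG = """
[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 6,7,8
classes=2
num=9
"""

print(yolo_filters(2))   # 21 for the two hand classes
print(yolo_filters(80))  # 255 for the 80-class COCO default
print(check_yolo_heads(parse_cfg(SAMPLE_CFG)))  # [] means every head is consistent
```

To check a real file, read it with `open(path).read()` and pass the text to `parse_cfg`; a non-empty result lists the [yolo] sections that still need fixing.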
That is: train for 100 epochs with a training batch size of 4.
That is, evaluate with the weights that were just trained; at test time the images are resized to 416*416. Why?
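One likely part of the answer, offered as my own reading rather than something the run itself confirms: YOLOv3 predicts at three scales with strides 32, 16, and 8, so the input side has to be a multiple of 32, and 416 = 13 * 32 is the common default (it is also the img_size shown in the Namespace line of the log). The sketch below only does the grid arithmetic; the function names are hypothetical.

```python
STRIDES = (32, 16, 8)      # downsampling factor at each of the three YOLO layers
ANCHORS_PER_SCALE = 3      # anchor boxes predicted per grid cell at each scale


def grid_sizes(img_size, strides=STRIDES):
    """Side length of the prediction grid at each scale for a square input."""
    if img_size % max(strides) != 0:
        raise ValueError(f"img_size must be a multiple of {max(strides)}")
    return [img_size // s for s in strides]


def num_predictions(img_size):
    """Total candidate boxes the network outputs before NMS."""
    return sum(g * g for g in grid_sizes(img_size)) * ANCHORS_PER_SCALE


print(grid_sizes(416))       # [13, 26, 52]
print(num_predictions(416))  # 10647
```

A size such as 400 would raise here because it is not divisible by 32, which is why inputs are resized to a fixed multiple of 32 before being fed to the network.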